Linux HPC/NVIDIA Admin from Boston
Ashburn, VA 20147
About the Job
Linux Administrator
Ashburn, VA, On-site
Contract 12+ months
About Our Client
Our client is a leading technology company specializing in high-performance computing solutions.
With a focus on cutting-edge GPU-based architectures and containerized applications, they are at the forefront of innovation in the HPC industry. The company values technical excellence, continuous learning, and collaborative problem-solving. Their mission is to provide state-of-the-art computing solutions that empower businesses and researchers to tackle the most complex computational challenges.
Job Description
We are seeking a highly skilled and experienced Linux Administrator to join our client's IT team.
In this role, you will be responsible for managing and maintaining Linux server infrastructure, with a focus on GPU-based HPC systems and NVIDIA architectures.
You will ensure optimal performance, security, and reliability of these advanced computing environments. This position offers an exciting opportunity to work with cutting-edge technologies and contribute to groundbreaking projects in the field of high-performance computing.
The ideal candidate is passionate about Linux systems administration, has hands-on experience with GPU-based HPC environments, and is comfortable working in a dynamic, on-site environment.
You will play a crucial role in supporting and optimizing the client's advanced computing infrastructure.
Duties and Responsibilities
- Manage and maintain Linux-based servers and HPC systems, particularly those based on NVIDIA DGX architectures
- Install, configure, and optimize GPU-based HPC environments
- Troubleshoot complex issues related to Linux systems, including hardware failures, network problems, and software conflicts
- Implement and manage security policies for HPC environments
- Develop and maintain scripts and automation tools for system monitoring, backups, and deployments
- Monitor system performance and conduct performance tuning for optimal efficiency
- Design and implement disaster recovery plans and backup strategies
- Create and maintain comprehensive documentation for system configurations and procedures
- Collaborate with other IT team members, developers, and stakeholders on projects
- Manage system upgrades and patches, ensuring minimal disruption to operations
- Work with containerized applications and optimize their performance in HPC environments
Required Experience/Skills
- Minimum of 5 years of experience as a Linux Administrator or in a similar role
- Extensive hands-on experience managing large-scale Linux-based environments
- Strong knowledge of GPU-based HPC architectures, particularly NVIDIA technologies
- Experience working with containerized applications in HPC environments
- Recent hands-on experience as a system administrator
- Deep understanding of Linux distributions (e.g., Red Hat Enterprise Linux, CentOS, Ubuntu)
- Proficiency in scripting and automation tools (e.g., Bash, Python, Ansible)
- Knowledge of networking concepts and protocols relevant to HPC environments
- Familiarity with configuration management tools (e.g., Puppet, Chef)
- Excellent problem-solving and communication skills
- Ability to work effectively in a team environment and handle multiple priorities
Nice-to-Haves
- NVIDIA training or certifications
- Experience with cloud platforms (e.g., AWS, Azure) for HPC workloads
- Knowledge of InfiniBand networking
- Familiarity with job scheduling systems for HPC (e.g., Slurm, PBS)
- Experience with parallel file systems (e.g., Lustre, BeeGFS)
Education
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent work experience
- Relevant certifications such as RHCE (Red Hat Certified Engineer), CompTIA Linux+, or LPIC (Linux Professional Institute Certification) are highly desirable
Additional Requirements
- Must be comfortable working on-site at our client's location in Ashburn, VA
- May be required to participate in on-call rotations
- Willingness to continuously learn and adapt to new technologies in the HPC space
Ready to push the boundaries of high-performance computing? Join our team of skilled professionals and help shape the future of GPU-based HPC solutions! Apply now to become part of a company that's revolutionizing the world of advanced computing.