Senior Distributed Acceleration Engineer, RAPIDS at NVIDIA
Los Angeles, CA 90079
About the Job
NVIDIA is looking to hire a Senior Distributed Acceleration Engineer to work on RAPIDS, a suite of open-source software libraries that accelerates end-to-end data science and analytics pipelines on GPUs. RAPIDS relies on NVIDIA CUDA for low-level compute optimization and exposes high-performance GPU computing through user-friendly Python interfaces. We’re rapidly growing a team that is passionate about building and optimizing how RAPIDS uses multiple GPUs for distributed execution. The team’s charter is to explore, develop, and architect multi-GPU engines for RAPIDS-based workflows (GPU ETL, ML), with an emphasis on single-node multi-GPU configurations.
In this role, you will develop, benchmark, and explore both novel, custom-tuned solutions and existing open-source engines, such as Dask, Ray, and Spark, among others, that can meet high-performance goals for multi-GPU workloads. This is a great chance to apply your distributed systems knowledge and your CUDA C++ and Python programming skills. You’ll work closely with the RAPIDS group of stellar engineers building highly optimized multi-GPU CUDA libraries.
What you'll be doing:
Analyze, design, and implement optimized GPU algorithms for large-scale data analytics and machine learning
Architect and implement distributed GPU algorithms for dense multi-GPU single-node machines and, more generally, for multi-GPU multi-node environments
Expand and improve integration of RAPIDS into relevant high-level frameworks
Drive performance analysis, benchmarking, and troubleshooting of associated libraries
Collaborate with a multi-functional team to understand requirements and implement or improve solutions
What we need to see:
MS or PhD in Computer Science, Computer Engineering, Electrical Engineering, or a related field such as Deep Learning, Machine Learning, or Computer Vision, or equivalent experience
5+ years of proven experience in Computer Science, Artificial Intelligence, Applied Math, or a related field
Strong analytical and problem-solving skills, with solid fundamentals in algorithms and mathematics
Experience developing and working with distributed systems
Excellent software development skills: programming, debugging, performance analysis, and test design
Good communication and documentation habits
Ability to work independently and manage your own development efforts
Ways to stand out from the crowd:
Experience developing distributed algorithms and running them on distributed systems: HPC, Cloud, etc.
Experience with debugging multi-language and multi-hardware systems
Experience with the PyData Stack: NumPy, Pandas, Scikit-Learn, Dask
Prior work on open-source projects
GPU programming knowledge is a plus, but if you don’t have it, we’re happy to teach you
With a competitive salary package and benefits, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us. Are you a creative and autonomous Distributed Acceleration Engineer who loves challenges? Do you have a genuine passion for advancing the state of GPU and CPU technology across a variety of industries? If so, we want to hear from you.
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars. NVIDIA continuously looks for great people like you to help us drive the next wave of accelerated computing.
The base salary range is 148,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.