Sr SDM, Machine Learning Acceleration, AWS Neuron - Amazon Web Services
Cupertino, CA
About the Job
Description
AWS Neuron is the complete software stack for the AWS Inferentia and Trainium cloud-scale machine learning accelerators and the Trn1 and Inf1 servers that use them. As the Senior Manager of Software Development for the ML Applications team, you will lead a strong team of engineers and managers to help design and deploy these new products. A successful candidate will have an established background in developing Machine Learning products with direct customer-facing experience, strong technical ability and a motivation to achieve results. This role includes managing other managers, so prior experience doing so is required. Experience in Machine Learning and software development is also a must.
About AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud
platform. We pioneered cloud computing and never stopped innovating — that’s why customers
from the most successful startups to Global 500 companies trust our robust suite of products and
services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster
a culture of inclusion that empowers us to celebrate our differences. Ongoing events and learning
experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender
diversity) conferences, inspire us to never stop embracing our uniqueness.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of
sacrifices at home, which is why flexible work hours and arrangements are part of our culture.
When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the
cloud.
Mentorship & Career Growth
We have a career path for you no matter what stage you’re in when you start here. We’re continuously raising our performance bar as we strive to become Earth’s Best Employer.
That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing
resources here to help you develop into a better-rounded professional.
Key job responsibilities
- Own the full development life cycle of our integrations and extensions for training support in PyTorch, XLA and TensorFlow, as well as distributed training libraries such as FSDP, DDP and others
- Characterize, enable and develop existing and future massive-scale ML models such as GPT-3, BERT, ViT, Stable Diffusion and more
- Lead the way to ensure support for key ML functionality in a combined chip / software platform
- Ensure the right thing is being built and delivered to customers
A day in the life
You will work with executive leadership and other senior management and technical leaders to define product directions and deliver them to customers. We build massive-scale distributed training and inference solutions. This organization builds the full stack of software, servers and chips to accelerate machine learning at the highest scale.
We are open to hiring candidates to work out of one of the following locations:
Cupertino, CA, USA
BASIC QUALIFICATIONS
- 10+ years of engineering experience
- 5+ years of engineering team management experience
- 10+ years of experience planning, designing, developing and delivering consumer software
- Experience partnering with product or program management teams
- Experience managing multiple concurrent programs, projects and development teams in an Agile environment
PREFERRED QUALIFICATIONS
- Experience partnering with product and program management teams
- Experience designing and developing large scale, high-traffic applications
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $176,100/year in our lowest geographic market up to $342,300/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. Applicants should apply via our internal or external career site.
Source : Amazon Web Services