Senior Machine Learning Engineer - Data Curation - AIGC, TikTok Monetization GenAI at TikTok
San Jose, CA 95101
About the Job
Description
TikTok is the leading destination for short-form mobile video.
Our mission is to inspire creativity and bring joy.
TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul and Tokyo.
Why Join Us
Creation is the core of TikTok's purpose.
Our platform is built to help imaginations thrive.
This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team.
Status quo? Never.
Courage? Always.
At TikTok, we create together and grow together.
That's how we drive impact - for ourselves, our company, and the communities we serve.
Join us.
We are the Generative AI team under Monetization Technology.
Our team focuses on developing cutting-edge Generative AI technologies across all modalities, including text, image, video, and landing pages, and creates industry-leading technical solutions to improve creative efficiency for advertisers, agencies and creators.
We are committed to automating creative workflows with Generative AI technologies to increase overall revenue for advertisers, agencies and creators.
We aim to drive and lead generative AI in the ads tech and creative industry, powering products and driving value for our clients, creators, and the whole ecosystem.
We are looking for infrastructure engineers who are excited to grow their business understanding, build highly scalable and reliable software and infrastructure, partner across functions with global teams, and make a big impact.
If you are someone who welcomes challenges, we are eager to have you on the team!
Responsibilities:
- Collaborate with foundational model researchers, including specialists in Ads LLM, Text-to-Image, and Text-to-Video, to develop and maintain efficient, low-latency data pipelines.
- Design and implement robust, scalable systems for data curation and management, supporting the foundational training of models across various formats in distributed environments.
- Implement data insights and model evaluation pipelines to enhance user engagement and drive revenue growth.
- Develop caching mechanisms to improve data retrieval speeds and enhance model responsiveness.
- Stay abreast of the latest academic research and open-source advancements, integrating cutting-edge technologies to continuously improve our data operations and machine learning model performance.
Qualifications
Minimum Qualifications:
1. B.S./M.S./Ph.D. in Computer Science, Computer Engineering, or a related field.
2. Programming and Technical Proficiency: Expertise in Python and a strong foundation in deep learning frameworks such as PyTorch, large model training libraries such as FSDP/DeepSpeed, and asyncio. A minimum of 3 years' experience with Linux, Docker, and Kubernetes is required.
3. Data Engineering and AI/ML Knowledge: Demonstrated capability in data curation, management, and optimization within Generative AI ecosystems, encompassing both streaming and batch data processing. This includes a thorough understanding of machine learning frameworks, parallel data processing techniques, and proficiency with large language models (e.g., Llama series), text-to-image models (e.g., Diffusion-Based Models, Diffusion Transformers), and text-to-video technologies (e.g., EMU series, MagViT).
Preferred Qualifications:
1. Advanced Technical Expertise: Experience in CUDA optimization and a deep understanding of the application of Generative AI models across multiple domains.
2. Cloud Computing and Distributed Systems: Significant experience in managing large-scale data systems, with a strong preference for those who have worked with vector database solutions. Proficiency in cloud services (AWS/GCP) and familiarity with machine learning training, deployment, and distributed computing frameworks like Spark.
3. Interpersonal and Problem-Solving Skills: A demonstrated passion for technology, coupled with outstanding problem-solving capabilities. Exceptional communication, teamwork, and project management skills are essential, along with a resilient character.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives.
Our platform connects people from across the globe, and so does our workplace.
At TikTok, our mission is to inspire creativity and bring joy.
To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach.
We are passionate about this and hope you are too.
TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws.
If you need assistance or a reasonable accommodation, please reach out to us at
https://shorturl.at/cdpT2
Regular
Experienced