Data Scientist (AWS & SageMaker Experience) - Hexaware USA
Reston, VA
About the Job
• Design, develop, and deploy machine learning models using Amazon SageMaker and other AWS services.
• Analyze large datasets to extract actionable insights and build predictive models.
• Perform end-to-end data science tasks, including data collection, preprocessing, model training, validation, and optimization.
• Collaborate with engineering and DevOps teams to implement scalable and reliable cloud-based machine learning pipelines.
• Conduct A/B testing and deploy machine learning models to production environments.
• Develop and maintain data pipelines using AWS Glue, Lambda, S3, RDS, and other related services.
• Monitor and evaluate model performance and continuously optimize for accuracy and efficiency.
• Prepare and present detailed reports and visualizations for stakeholders to drive business decisions.
• Stay up to date with the latest advancements in data science, cloud computing, and machine learning technologies.
Required Qualifications:
• Bachelor’s/Master’s degree in Data Science, Computer Science, Statistics, or a related field.
• 8 years of experience working as a Data Scientist, preferably in a cloud environment.
• Strong expertise in AWS services, including SageMaker, S3, EC2, Lambda, Glue, and Redshift.
• Hands-on experience with Amazon SageMaker for building, training, and deploying machine learning models.
• Proficiency in Python and relevant data and machine learning libraries such as TensorFlow, PyTorch, scikit-learn, and pandas.
• Experience in building, deploying, and monitoring machine learning models in production environments.
• Strong knowledge of SQL and experience working with large relational databases.
• Familiarity with DevOps practices and model deployment strategies in cloud environments.
• Experience with model interpretability and with tuning machine learning models.
• Strong problem-solving skills and ability to work with complex datasets.
Preferred Qualifications:
• Experience with AWS CloudFormation, AWS Step Functions, or Terraform for infrastructure automation.
• Familiarity with big data tools like Apache Spark, Hadoop, or Amazon EMR.
• Knowledge of advanced ML techniques such as deep learning, reinforcement learning, or natural language processing (NLP).
• Understanding of MLOps and CI/CD pipelines for machine learning models.
• Experience with data visualization tools such as Tableau, QuickSight, or Power BI.