Senior Data Systems Engineer - Shtudy
Dallas, TX
About the Job
We are seeking a highly skilled Senior Data Systems Engineer with strong Quality Assurance (QA) expertise to join a dynamic team on behalf of a confidential client. This hybrid role requires two days of onsite work in Dallas, TX, with the remainder remote. The ideal candidate will have extensive experience in Python development, cloud technologies (preferably AWS), and data engineering practices, along with a proven background in data testing and ETL process automation.
This is a highly competitive opportunity, offering the chance to work with cutting-edge cloud technologies and play a pivotal role in a growing team.
Key Responsibilities:
- Automate ETL pipelines using Python to streamline data integration, transformation, and migration processes.
- Utilize AWS cloud technologies, including S3, Athena, EMR, Glue, Redshift, Kinesis, and SageMaker, to build robust data infrastructure.
- Collaborate with cross-functional teams to ensure data architecture and pipelines are scalable, efficient, and robust.
- Develop, test, and maintain ETL processes in both cloud and on-prem environments using tools such as AWS Glue, Informatica, Ab Initio, or Alteryx.
- Lead data migration efforts from on-premises to cloud environments, ensuring data integrity and consistency.
- Perform data analytics, integration testing, and data quality validation within project timelines and budgets.
- Implement DevOps/DataOps practices, including CI/CD pipelines for data integration and testing workflows.
- Write complex SQL queries and utilize Unix/Linux scripting to manipulate and analyze data effectively.
- Support machine learning teams with data preparation and model deployment on platforms such as SageMaker or H2O.
- Apply ETL testing best practices, including creating and executing test plans and building automated testing frameworks.
Mandatory Skills & Qualifications:
- 11-12 years of experience in data engineering, with substantial exposure to cloud technologies (preferably AWS) and data quality assurance.
- Proficiency in Python for automating ETL processes, data integration, and scripting.
- Extensive hands-on experience with AWS services such as S3, Athena, EMR, Glue, Redshift, Kinesis, and SageMaker.
- Expertise in SQL and Unix/Linux for data querying, scripting, and troubleshooting.
- Experience in ETL automation across cloud and on-prem environments using tools like AWS Glue, Informatica, Ab Initio, and Alteryx.
- Proven expertise in data migration from on-premises environments to cloud platforms.
- Strong background in DevOps/DataOps environments, particularly in developing CI/CD pipelines for data engineering.
- Skilled in data modeling, data warehousing, and data integration strategies.
- Experience in data analytics and interpreting data from multiple sources for integration.
- Knowledge of ETL testing strategies and best practices to ensure data integrity and quality.
Preferred Additional Skills:
- Familiarity with machine learning platforms such as SageMaker, Machine Learning Studio, or H2O for building predictive models and generating insights.
- Knowledge of Agile methodologies and experience working in fast-paced, collaborative environments.
- Experience in creating and managing automated test scripts and frameworks for ETL processes.
- Expertise in managing and troubleshooting data pipelines in large-scale distributed environments.
- Strong communication skills and ability to work collaboratively with cross-functional teams.
Education:
- Bachelor’s degree or higher in Computer Science, Data Engineering, Information Technology, or a related field.