Cloud Data Engineer - TalentRemedy
Washington, DC 20001
About the Job
Our client is looking for a talented Cloud Data Engineer to join their amazing team!
If you're looking to join a company that truly appreciates you and your talents, look no further! Our client is committed to serving and caring for their colleagues, clients, and community. Their team is made up of talented individuals who appreciate having the opportunity to contribute their knowledge and experience to further the growth and development of their industry. Our client's ideal candidates embrace diverse thinking, enjoy partnering with others, and are seeking to make a difference!
Key Responsibilities:
- Collaborate and contribute to the architecture, design, development, and maintenance of large-scale data and analytics platforms, system integrations, data pipelines, data models, and API integrations.
- Prototype emerging business use cases to validate technology approaches and propose potential solutions.
- Data Pipeline Development: Design, develop, and maintain data pipelines using Databricks, Apache Spark, and other cloud-based technologies to ingest, transform, and load data from various financial institutions and sources (a minimal sketch follows this list).
- Data Transformation: Implement data transformation processes to ensure data quality, integrity, and consistency, meeting regulatory standards. Create a transformation path for migrating data from on-premises pipelines and sources to AWS.
- Data Integration: Integrate data from diverse sources, including financial databases, APIs, regulatory reporting systems, and internal data stores, into the CFPB's data ecosystem.
- Data Modeling: Develop and optimize data models for regulatory analysis, reporting, and compliance, following data warehousing and data lake principles.
- Performance Optimization: Monitor and optimize data pipelines for efficiency, scalability, and cost-effectiveness while ensuring data privacy and security.
- Data Governance: Ensure data governance and regulatory compliance, maintaining data lineage and documentation for audits and reporting purposes.
- Collaboration: Collaborate with cross-functional teams, including data analysts, legal experts, and regulatory specialists, to understand data requirements and provide data support for regulatory investigations.
- Documentation: Maintain comprehensive documentation for data pipelines, code, and infrastructure configurations, adhering to regulatory compliance standards.
- Troubleshooting: Identify and resolve data-related issues, errors, and anomalies to ensure data reliability and compliance with regulatory requirements.
- Continuous Learning: Stay updated with regulatory changes, industry trends, cloud technologies, and Databricks advancements to implement best practices and improvements in data engineering.
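To make the pipeline work above concrete, the following is a minimal PySpark sketch of the ingest-transform-load pattern described in the Data Pipeline Development bullet. It is illustrative only: the bucket path, column names, and target table are hypothetical, and the Delta write assumes a Databricks-style environment where Delta Lake is available.

```python
# Minimal sketch of an ingest -> transform -> load pipeline of the kind
# described above. Bucket, column, and table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("complaints-ingest").getOrCreate()

# Ingest: read raw CSV files landed in S3 (path is a placeholder).
raw = (
    spark.read
    .option("header", "true")
    .csv("s3://example-raw-bucket/complaints/2024/")
)

# Transform: enforce types, drop malformed rows, and normalize a field
# so downstream consumers see consistent, validated data.
clean = (
    raw
    .withColumn("received_date", F.to_date("received_date", "yyyy-MM-dd"))
    .filter(F.col("received_date").isNotNull())
    .withColumn("product", F.lower(F.trim(F.col("product"))))
    .dropDuplicates(["complaint_id"])
)

# Load: write to a Delta table (assumes Delta Lake is available, as on
# Databricks); partitioning by date keeps incremental reads cheap.
(
    clean.write
    .format("delta")
    .mode("append")
    .partitionBy("received_date")
    .saveAsTable("curated.complaints")
)
```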
Disclaimer "The responsibilities and duties outlined in this job description are intended to describe the general nature and level of work performed by employees within this role. However, they are not exhaustive and may be subject to change or modification at any time to meet the evolving needs of the organization.
Requirements
Clearance requirements:
- Must be able to obtain and maintain a Public Trust clearance.
- Must be a verifiable U.S. Citizen for this Federal support position.
Minimum Qualifications:
- U.S. Citizenship is required; no exceptions, per Federal requirements.
- Bachelor's or higher degree in computer science, data engineering, or a related field
- The Databricks Certified Data Engineer Professional certification is required; no exceptions.
- Minimum of 3 years of experience in the following:
- Strong understanding of data lake, lakehouse, and data warehousing architectures in a cloud-based environment.
- Hands-on experience with Databricks, including data ingestion, transformation, and analysis.
- Proficiency in Python for data manipulation, scripting, and automation.
- In-depth knowledge of AWS services relevant to data engineering, such as Amazon S3, EC2, Database Migration Service (DMS), DataSync, EKS, RDS, Lambda, and the AWS CLI.
- Understanding of data integration patterns and technologies.
- Proficiency in designing and building flexible, scalable ETL processes and data pipelines using Python and/or PySpark and SQL.
- Proficiency in data pipeline automation and workflow management tools such as Apache Airflow or AWS Step Functions (see the Airflow sketch after this list).
- Knowledge of data quality management and data governance principles.
- Strong problem-solving and troubleshooting skills related to data management challenges.
- Experience managing code in GitHub or other similar tools.
- Experience leveraging Postgres in a parallel processing environment (see the Postgres read sketch after this list).
- Hands-on experience migrating from on-premises data platforms to a modern cloud environment (e.g., AWS, Azure, GCP).
- Excellent problem-solving and communication skills.
- Strong attention to detail and the ability to work independently and collaboratively.
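As a concrete illustration of the workflow-orchestration skill above, here is a minimal Apache Airflow sketch of a daily ETL DAG. It is a hypothetical example, not the client's actual stack: the DAG id, task names, and run_job.sh script are placeholders, and it assumes Airflow 2.4+ for the schedule argument.

```python
# Hypothetical Airflow DAG sketching the kind of pipeline orchestration the
# qualifications call for; task names and the script path are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_complaints_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Each step shells out to a job submission script; in practice these
    # could be Databricks or Spark submit operators instead.
    ingest = BashOperator(task_id="ingest", bash_command="run_job.sh ingest")
    transform = BashOperator(task_id="transform", bash_command="run_job.sh transform")
    publish = BashOperator(task_id="publish", bash_command="run_job.sh publish")

    # Linear dependency chain: ingest -> transform -> publish.
    ingest >> transform >> publish
```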
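For the Postgres-in-parallel item, one common pattern is a partitioned JDBC read from Spark, which splits a table scan into concurrent range queries across executors. A minimal sketch, assuming a hypothetical connection, a numeric id key to partition on, and the Postgres JDBC driver on the Spark classpath:

```python
# Sketch of a parallel (partitioned) JDBC read from Postgres with Spark.
# Host, database, table, credential, and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pg-parallel-read").getOrCreate()

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://example-host:5432/exampledb")
    .option("dbtable", "public.transactions")
    .option("user", "example_user")
    .option("password", "example_password")
    # Split the scan into 8 concurrent range queries over a numeric key,
    # so executors read from Postgres in parallel instead of serially.
    .option("partitionColumn", "id")
    .option("lowerBound", "1")
    .option("upperBound", "1000000")
    .option("numPartitions", "8")
    .load()
)

print(df.count())
```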
Preferred Skills:
- Experience with financial data or regulatory data management.
- Experience working in Agile or DevSecOps environments and using related tools for collaboration and version control.
- Knowledge of regulatory frameworks in the financial industry.
- Familiarity with DevOps and CI/CD practices.
- Experience with machine learning and AI technologies.
A Cloud Data Engineer with Databricks expertise plays a vital role in ensuring data accuracy, integrity, and compliance with regulatory standards, supporting the agency's mission to protect consumers in the financial sector. The role demands expertise in Databricks and cloud technologies, along with a deep understanding of data engineering principles in a regulatory context.
Additional Information
Applicants selected will be subject to a government security investigation and must meet eligibility requirements for access to classified information. Our client is an Equal Opportunity Employer, Minorities/Females/Veterans/Disabled. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, or national origin.
Candidates must be willing to consent to a background check, including a criminal record check and employment and education verification. No exceptions.