Sr. Data Engineer - Veracity Software Inc
Charlotte, NC
About the Job
Job Title: Sr. Data Engineer
Duration: 12+ Months
Location: Charlotte, NC – Hybrid Role (3 days onsite per week)
Manager Notes:
· Team duties and business impact: Risk Data Services is a horizontal function within the Risk Technology organization and is responsible for delivering data consistently across Risk. The team is seeking a Lead Software Engineer to drive platform strategy for the entire team and to help with the migration to the cloud.
· Big Data Hub: the tools are used purely for ETL functions (moving and conforming data), not for building web applications. Python and Java are used here for data engineering; we are not seeking web developers (back end, front end, etc.).
Requirements:
· This role requires someone hands-on who can build and write these capabilities.
· Experience building and enhancing Scala frameworks, along with Spark experience
· Java
· Python
Job Expectations:
· Design and implement an automated Spark-based framework to facilitate data ingestion, transformation, and consumption (a minimal ingestion sketch follows this list).
· Implement security protocols such as Kerberos authentication and encryption of data at rest, and data authorization mechanisms such as role-based access control using Apache Ranger.
· Design and develop an automated testing framework to perform data validation (see the validation sketch after this list).
· Enhance existing Spark-based frameworks to overcome tool limitations and to add features based on consumer expectations.
· Design and build a high-performing, scalable data pipeline platform using Hadoop, Apache Spark, MongoDB, Kafka, and object storage (see the streaming sketch after this list).
· Work with Infrastructure Engineers and System Administrators as appropriate in designing the big-data infrastructure.
· Collaborate with application partners, Architects, Data Analysts and Modelers to build scalable and performant data solutions.
· Effectively work in a hybrid environment where legacy ETL and data warehouse applications and new big-data applications co-exist.
· Support ongoing data management efforts for Development, QA, and Production environments.
· Provide tool support and help consumers troubleshoot pipeline issues.
· Utilize a thorough understanding of available technology, tools, and existing designs.
· Leverage knowledge of industry trends to build best-in-class technology that provides a competitive advantage.
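
A minimal, illustrative sketch of the kind of config-driven Spark ingestion step described above, written in Scala. This is not the team's actual framework; IngestionConfig, runIngestion, and the paths are hypothetical names used only to show the shape of the work:

    import org.apache.spark.sql.{DataFrame, SparkSession}

    // Hypothetical config describing one source-to-target data movement.
    case class IngestionConfig(sourcePath: String, targetPath: String, format: String)

    object IngestionJob {
      // Read the source, apply a caller-supplied transform, write the conformed output.
      def runIngestion(spark: SparkSession, cfg: IngestionConfig)(transform: DataFrame => DataFrame): Unit = {
        val raw = spark.read.format(cfg.format).load(cfg.sourcePath)
        val conformed = transform(raw)
        conformed.write.mode("overwrite").parquet(cfg.targetPath)
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("risk-data-ingestion").getOrCreate()
        val cfg = IngestionConfig("/data/raw/trades", "/data/conformed/trades", "parquet")
        // Example transform: drop exact duplicates before landing the conformed set.
        runIngestion(spark, cfg)(_.dropDuplicates())
        spark.stop()
      }
    }

In a real framework the config would more likely come from a metadata store than be hard-coded; the point is the separation of generic plumbing from per-dataset transforms.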
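
Similarly, a minimal sketch of what an automated data-validation check might look like in Scala and Spark. ValidationResult, checkNotNull, and checkRowCountMatch are illustrative names, not a real framework API:

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    // Outcome of a single validation check.
    case class ValidationResult(check: String, passed: Boolean, detail: String)

    object DataValidator {
      // Fail if any row has a null in the given column.
      def checkNotNull(df: DataFrame, column: String): ValidationResult = {
        val nullCount = df.filter(col(column).isNull).count()
        ValidationResult(s"not-null($column)", nullCount == 0L, s"$nullCount null rows")
      }

      // Fail if source and target row counts diverge after a load.
      def checkRowCountMatch(source: DataFrame, target: DataFrame): ValidationResult = {
        val (s, t) = (source.count(), target.count())
        ValidationResult("row-count-match", s == t, s"source=$s target=$t")
      }
    }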
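
Finally, a minimal sketch of the Kafka side of such a pipeline, assuming Spark Structured Streaming with the spark-sql-kafka connector on the classpath; the broker, topic, and storage paths are placeholders:

    import org.apache.spark.sql.SparkSession

    object KafkaIngestStream {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("risk-event-stream").getOrCreate()

        // Read a Kafka topic as a streaming DataFrame.
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "risk-events")
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

        // Land raw events on object storage as Parquet; the checkpoint makes file output fault-tolerant.
        val query = events.writeStream
          .format("parquet")
          .option("path", "/data/raw/risk-events")
          .option("checkpointLocation", "/checkpoints/risk-events")
          .start()

        query.awaitTermination()
      }
    }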
Required Qualifications:
· 5+ years of software engineering experience
· 5+ years of experience delivering complex enterprise-wide information technology solutions
· 5+ years of experience delivering ETL, data warehouse and data analytics capabilities on big-data architecture such as Hadoop
· 5+ years of Apache Spark design and development experience using Scala, Java, or Python, with DataFrames, Resilient Distributed Datasets (RDDs), and Parquet or ORC file formats
· 6+ years of ETL (Extract, Transform, Load) programming experience
· 2+ years of Kafka or equivalent experience
· 2+ years of experience with a NoSQL database such as Couchbase or MongoDB
· 5+ years of experience working with complex SQL and performance tuning
Desired Qualifications:
· 3+ years of Agile experience
· 2+ years of reporting experience, analytics experience or a combination of both
· 2+ years of operational risk or credit risk or compliance domain experience
· 2+ years of experience integrating with RESTful APIs
· 2+ years of experience with CI/CD tools