Data Engineer - Artech LLC
Morrisville, NC 27560
About the Job
Job Title: Data Engineer
Duration: 12+ Months
Location: Morrisville, NC
In addition to English, the candidate must be proficient in one or more of the following languages: German, Italian, French, and Portuguese.
Summary:
- We are excited to offer several data engineer positions in the Large Language Model (LLM) and multimodal LLM field, with a focus on European languages.
- The goal is to work with the team on the data side to help build strong multilingual AI models.
- In addition to English, the candidate must be proficient in one or more of the following languages: German, Italian, French, and Portuguese.
Responsibilities:
- Develop and maintain web scraping and data extraction processes to gather large-scale text and image data from diverse sources.
- Clean, preprocess, and tag text and image data to ensure data quality and usability.
- Work with different data formats such as Parquet, JSONL, and CSV, ensuring efficient data storage and retrieval.
- Collaborate with data scientists and machine learning engineers to support the evaluation and improvement of large language models.
- Stay up to date with the latest research and advancements in data engineering, web scraping, and machine learning, and actively participate in academic research and reading groups.
- Implement and optimize data pipelines for high-volume data processing.
Qualifications:
- Strong proficiency in Python and a solid understanding of HTML, JSON, and web technologies.
- Master's degree required, with 2-4 years of experience.
Source: Artech LLC