Senior Data Knowledge Engineer - REMOTE - Rancho BioSciences LLC
Boston, MA
About the Job
Rancho BioSciences is an international company offering data curation, data governance and models, bioinformatics analysis, workflows and pipelines, knowledge mining, target profiles, and business analyst services to clients including pharmaceutical and biotech companies, foundations, government, and hospitals.
Rancho BioSciences is committed to its employees, offering an outstanding, fast-paced work environment that affords them every opportunity to thrive and grow both professionally and personally. We are hiring for the following:
Biomedical Knowledge Graph Data Scientist - Remote
Rancho BioSciences, a data science services company, is seeking a Biomedical Knowledge Graph Data Scientist to join our team. The successful candidate will be responsible for developing and optimizing pipelines that use knowledge graph embeddings (KGE) in drug discovery. You will implement KGE models, assess their predictive performance, and optimize them.
Responsibilities:
- Design and build scalable data pipelines to implement and execute KGE models on drug discovery knowledge graphs.
- Perform thorough assessments of KGE models to gauge their efficacy in biomedical contexts.
- Create and run tests to explore how various training methodologies and settings affect model performance.
- Apply advanced techniques for optimizing hyperparameters to enhance model precision and adaptability.
- Work in tandem with interdisciplinary groups to ensure KGEs are relevant to practical drug discovery and meet standards for equitable evaluation and replicability.
- Compile and present insights and suggestions for enhancing KGE model assessment practices, contributing to the advancement of biomedical AI knowledge.
Qualifications:
- Master's or Ph.D. in Data Science, Bioinformatics, Computer Science, or a related field.
- Demonstrated success in developing and overseeing data pipelines and managing extensive datasets.
- Advanced Python programming abilities, particularly with PyTorch and associated libraries like PyG and PyKEEN.
- Practical knowledge of knowledge graphs, machine learning techniques, and graph embedding models, including their real-world applications.
- Familiarity with biomedical knowledge graph platforms and resources such as Disqover, PrimeKG, Hetionet, or BioKG.
- Proven proficiency in fine-tuning parameters and optimizing models, with specific experience in Bayesian optimization methods.
- Superior analytical capabilities and the skill to effectively convey complex concepts to diverse audiences, including both technical experts and non-specialists.
Preferred Qualifications:
- Experience with biomedical datasets or in the drug discovery field.
- Familiarity with computational biology and systems pharmacology principles.
- Strong understanding of evaluation metrics and best practices for ensuring model reproducibility in scientific research.
Soft Skills:
- Excellence in managing scientific stakeholder relationships
- Experience translating project requirements to both business process solutions and technical solutions
- Independently driven, hardworking, and committed
- Able to communicate effectively at all levels
- Detail-oriented and well organized, with an ability to work collaboratively and remotely
Rancho BioSciences offers a competitive salary and benefits package.