Principal Data Scientist

Wellcome Sanger Institute
Saffron Walden
1 day ago
Applications closed

Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life‑changing science to solve some of humanity’s greatest challenges.


Principal Research Data Scientist

We seek a Principal Machine Learning Research Data Scientist to join a collaborative project between the Wellcome Sanger Institute and Open Targets. The project aims to leverage datasets internally generated at the Sanger Institute and publicly available data from human cells to create foundational models for biology, enhancing our understanding of life’s rules and improving health for all. You will work within an interdisciplinary team of life scientists and computer / ML scientists, with a shared objective of advancing biological research through these foundational models. This role will sit within the AI / ML Faculty group led by Dr. Mohammad Lotfollahi, and the successful candidates, across different seniority levels, will be responsible for delivering their portfolio of scientific research projects as part of the broader team strategy.


About the Role

Your role will involve designing foundational models leveraging multi‑modal readouts. This includes integrating and processing data from various sources to develop robust and versatile AI models. To achieve this, you will work with open‑source software, proposing, developing, and maintaining new solutions to analyze and interpret large‑scale single‑cell datasets. We have access to unique data and are also in a position to generate new data for training unique models. Additionally, we have substantial computational power and GPU resources to train large models efficiently.


Our teams are well‑positioned to tackle this problem, with experience in both generating and analyzing datasets that include millions of cells across multiple tissues and conditions (e.g., disease, healthy). The role calls for a detailed understanding of how large‑scale ML models are trained and a track record of undertaking large data‑science projects.


Responsibilities

  • Independently manage and lead machine learning research projects, and write up the outcomes as scientific publications for submission to journals or machine learning conferences (e.g., ICLR, ICML, CVPR).
  • Collaborate with team members to propose, develop, and evaluate new machine learning models that enable the understanding of single‑cell data and its application in drug discovery.
  • Work with Ph.D. students and postdocs in collaborating teams on developing solutions for interdisciplinary scientific problems in biology, providing supervision and training to junior members of the team.
  • Contribute to writing scientific papers on biotechnology and biology.
  • Distill your developed solutions into open‑source and easy‑to‑install packages with documentation that facilitates the usage of your solution for downstream users, including biologists and bioinformaticians.
  • Present your research and analysis pipelines to internal and external audiences.

About You

You will be supported in your personal and professional development and have the opportunity to lead peer‑reviewed publications around using genetics and genomics approaches to guide drug discovery and present them at national and international conferences.


Essential Skills

  • Ph.D. or M.Sc. with equivalent research experience in a relevant quantitative discipline (e.g., Computer Science, Computational Biology, Genetics, Bioinformatics, Physics, Engineering, or Applied Statistics / Mathematics).
  • Previous ML work experience in a scientific or academic environment (research assistantships and internships are considered work experience).
  • Strong knowledge of Python, including core data science libraries such as Scikit‑Learn, SciPy, TensorFlow, and PyTorch.
  • Expertise in machine learning algorithms and frameworks, with experience in designing, training, and deploying ML models.
  • Proficiency in handling and processing large datasets, including techniques for data cleaning, feature engineering, and data augmentation.
  • Experience with high‑performance computing environments, including the use of GPUs for training large‑scale machine learning models.
  • Experience in natural language processing (NLP) and training models based on transformer architectures, such as BERT and GPT.
  • Familiarity with generative models such as diffusion models and flow matching.
  • Knowledge of good software development practices and collaboration tools, including git‑based version control, Python package management, and code reviews.
  • Strong problem‑solving skills with the ability to analyze complex data and derive actionable insights.
  • Excellent communication skills, with the ability to explain complex machine learning algorithms and statistical methods to non‑technical stakeholders.
  • Evidence of related work experience as a researcher in the area of machine learning.
  • Strong publication record, ideally including first‑author papers.

Additional Skills

  • Ability to quickly understand scientific, technical, and process challenges and break down complex problems into actionable steps.
  • Ability to work in a frequently changing environment with the capability to interpret management information to amend plans.
  • Ability to prioritize, manage workload, and deliver agreed activities consistently on time.
  • Good networking, influencing, and relationship‑building skills.
  • Strategic thinking: the ability to see the “bigger picture.”
  • Ability to build collaborative working relationships with internal and external stakeholders at all levels.
  • Demonstrates inclusivity and respect for all.

Relevant Publications of the Group

  • Lotfollahi, M., Naghipourfar, M., Luecken, M. D., Khajavi, M., Büttner, M., Wagenstetter, M., Avsec, Ž., Gayoso, A., Yosef, N., Interlandi, M. et al. Mapping single‑cell data to reference atlases by transfer learning. Nature Biotechnology 1–10.
  • Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single‑cell perturbation responses. Nature Methods 16, 715–721.
  • Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh‑Zadeh, S., Talavera‑López, C., Misharin, A. V. & Theis, F. J. Biologically informed deep learning to query gene programs in single cell atlases. Nature Cell Biology.

