Data Engineer

Wellcome Sanger Institute
Saffron Walden
2 months ago
Applications closed

Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenges.


About the Role

We are seeking a passionate Data Engineer to help develop solutions for the world’s key cancer genomics resource, COSMIC (Catalogue of Somatic Mutations in Cancer). We invite individuals with backgrounds in computer science or bioinformatics to join us in this opportunity to contribute to cancer genomics. You will be directly involved in exciting projects, implementing tools and processes in the cloud environment to support COSMIC.


This role is needed to enable the rapid growth and evolution of COSMIC’s large-scale biological datasets and products, helping to advance cancer research globally and ultimately improve outcomes for patients.


Data is currently gathered from a variety of sources, from manual curation to structured repositories, and standardised into the COSMIC database before release via our analytic website. This genetic and genomic data is increasing in quantity and scope, and its value is enhanced by analytic highlights. This position will improve the handling, annotation, meaningful presentation and user access of this information. More specifically, the role provides excellent opportunities to apply best practices in software development: writing clean code and taking a test-driven development approach.


About You

You will support the development of new automated pipelines and workflows for data curation and release of COSMIC products. You will also participate in developing solutions for the storage and production pipeline of vast genomic data in the cloud environment.


While a background in biology or genomics is beneficial, it is not essential. We value candidates who are eager to learn and engage with a wide range of cancer genomics datasets. The role involves maintaining and supporting automated pipelines and workflows on cloud platforms.


You must be a good communicator, able to understand the goals and aims of the collaborative COSMIC team and its global customer base of researchers, scientists and clinicians. You will have experience in developing software and analytical tools across large datasets, ensuring throughput and quality.


You will contribute new perspectives on options for storage, pipelining and visualisation. You will possess a degree in Computer Science or Bioinformatics, and demonstrate expertise in a modern programming language such as Python and in SQL (Postgres/Oracle), within a Unix/Linux environment.


This position would be well-suited for someone with experience in software development. You will join a multidisciplinary, committed and supportive team with the opportunity to work in conjunction with software developers, bioinformaticians, curators and scientists as well as colleagues and collaborators both in Sanger and externally.


Essential Technical Skills

  • Degree or equivalent experience in software development, Computer Science or Bioinformatics
  • Proven experience in at least one modern programming language, preferably Python or Perl
  • Proven experience in one or more scripting languages, preferably Bash or another shell
  • Expertise across large and complex relational databases, proficient in SQL
  • An understanding of FAIR (findability, accessibility, interoperability, and reusability) data
  • Experience in data handling
  • Experience or willingness to learn how to create/maintain data processing pipelines
  • Experience with modern coding practices: Git, Agile
  • Knowledge or experience in data warehousing architecture and concepts (ETL and ELT) and data modelling
  • Experience or willingness to learn development best practices in DevOps, Containers, CI/CD, and TDD

Competencies And Behaviours

  • A demonstrably enthusiastic, can-do, proactive attitude and an eagerness to learn
  • Enthusiasm, commitment and attention to detail
  • Ability to prioritise activities and manage your own workload independently
  • Ability to explain technical issues effectively and understandably to non-technical users
  • Ability to work collaboratively with a range of stakeholders at all levels
  • Ability to understand scientific and technical challenges
  • Shows curiosity and a willingness to learn new technologies, tools, and ways of working
  • Adjusts to changing requirements, priorities, or technical approaches
  • Welcomes feedback from peers and seniors
  • Collaborates effectively with team members and contributes to shared goals

About Us

COSMIC is the key information source in global human cancer research and is growing rapidly in content, scope, and value. We exist to speed up the field of precision oncology so that every cancer journey can be tailored, accurate, and precise. We do this by streamlining and simplifying genomics data analysis for better prevention, diagnosis, and treatment research.


More about COSMIC: "COSMIC: a curated database of somatic variants and clinical data for cancer" (PMC)


Application Process

Please apply by submitting your CV and a supporting cover letter. Your cover letter should clearly cover the following three areas:



  • Your background and previous related experience
  • Your relevant skills and how they align with the role
  • Your motivation for applying for this position

Applications without a cover letter addressing all three points will not be considered.


Salary per annum: £37,470 - £43,500


Role Profile


Informatician/ Data Scientist job family: Grade 3


Contract Type: Permanent


Recruitment Process: Shortlisting w/c 9th Feb, Interviews w/c 16th February or w/c 2nd March


Closing date: 8th February 2026


Hybrid Working

Hybrid Working at Wellcome Sanger: We recognise that there are many benefits to hybrid working, including an improved work-life balance with more focused time, as well as the ability to organise working time so that collaborative opportunities and team discussions are facilitated on campus. The hybrid working arrangement will vary for different roles and teams.


Equality, Diversity And Inclusion

We aim to attract, recruit, retain and develop talent from the widest possible talent pool, thereby gaining insight and access to different markets to generate a greater impact on the world. We have a supportive culture with the following staff networks: LGBTQ+, Parents and Carers, Disability, and Race Equity. These bring people together to share experiences, offer specific support and development opportunities, and raise awareness. We want our people to be whoever they want to be because we believe people who bring their best selves to work do their best work. That’s why we’re committed to creating a truly inclusive culture at Sanger Institute. We will consider all individuals without discrimination and are committed to creating an inclusive environment for all employees, where everyone can thrive.


Our Benefits

We are proud to deliver an award-winning campus-wide employee wellbeing strategy and programme. The importance of good health and adopting a healthier lifestyle, and the commitment to reduce work-related stress, are strongly acknowledged and recognised at Sanger Institute.


Sanger Institute became a signatory of the International Technician Commitment initiative in March 2018. The Technician Commitment aims to empower and ensure visibility, recognition, career development and sustainability for technicians working in higher education and research, across all disciplines.

