Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity’s greatest challenges.
About the Role
We are seeking a Senior Bioinformatician to support single-cell and spatial transcriptomic studies needed to train large-scale AI-based foundation models as part of a collaborative project with Open Targets. This role will aid multiple computational projects across the lab, focusing on curating large-scale cohorts of training data for foundation models. The position involves working within an interdisciplinary team of life scientists, computer scientists, and ML scientists.
The post holder will maintain a key understanding of the research portfolio within the team and contribute to the development of innovative and creative data analysis strategies. They will also contribute to the development of the training pipeline and machine learning and artificial intelligence models. Additionally, they will have the opportunity to develop their skills in machine learning and AI models. Finally, they will be responsible for exploring opportunities to transition project-based analysis work into established scalable pipelines and advocate for open and reproducible science.
You will join an interdisciplinary team of life scientists, computer scientists, and mathematicians. We all learn from each other and work together in supporting Wellcome's mission to address worldwide health challenges. This is an exciting opportunity to work in single-cell and spatial genomics to understand how the immune system develops and maintains health.
Key responsibilities of the role:
- Help generate, lead the analysis of, and interpret large single-cell genomics datasets relating to Human Cell Atlas research, whether generated by the group, shared by collaborators or publicly available, including multi-omic and spatial datasets (e.g. scRNAseq, scATAC, 10x Genomics Visium and Xenium).
- Implement innovative and creative data analysis strategies to aid in the pursuit of biological insights and clinical translation using single-cell research.
- Work in a team to answer research questions formulated together with members of the Haniffa Lab, and with team members across the wider Institute as appropriate. Contribute to discussions on complicated or multifaceted problems and generate novel ideas and new approaches.
- Organise and keep track of code developed and data generated within the group.
- Feed back results to project leaders as required to meet deadlines.
- Prepare high-quality data reports and manuscripts to communicate results, including through publication and presentations at scientific meetings.
- Take a full part in the general duties of the team, and pass on skills and knowledge to other team members and visitors. Take part in wider Sanger Institute activities as appropriate.
About You
You are a bioinformatician with experience analysing genomic data and interpreting biological data. You are seeking a senior role that includes elements of statistical analysis, data analysis and pipeline engineering. You are a highly detail-oriented problem solver and are eager to learn and develop in the areas in which you have less experience. Teamwork and collaboration are essential aspects of this role, and you will be enthusiastic about creating effective and productive working relationships with collaborators both within and outside the organisation.
You will be supported in your personal and professional development and have the opportunity to lead peer-reviewed publications on genetics and genomics approaches.
Essential Skills
- MSc or PhD in a relevant quantitative discipline (e.g. Computational Biology, Genetics, Bioinformatics, Physics, Engineering or Applied Statistics/Mathematics), or equivalent working experience in Bioinformatics or a related discipline
- Experience programming in high-level scripting languages (preferably Python and/or R) and the Unix shell
- Understanding and some experience in the fields of genomics, data processing and high-throughput data analysis
- Experience in single-cell genomics data analysis and integration, e.g. scRNAseq, scATAC (desirable)
- Experience developing statistical/machine-learning methods for the analysis of large-scale biological datasets, particularly Gaussian processes, neural networks and deep generative models (desirable)
- Extensive experience in a research environment with an established track record of productivity through publications, or through writing reports in industry
- Enthusiastic, proactive attitude and desire to learn
- Strong communication skills to allow efficient interactions with team members
- Commitment, problem-solving skills and attention to detail
- Ability to organise own workload and manage competing priorities
- Ability to summarise complex information and present to a wide variety of audiences in a concise and logical manner
- Demonstrates inclusivity and respect for all
Relevant publications from the group:
- Lotfollahi, M., Naghipourfar, M., Luecken, M. D., Khajavi, M., Büttner, M., Wagenstetter, M., Avsec, Ž., Gayoso, A., Yosef, N., Interlandi, M. & others. Mapping single-cell data to reference atlases by transfer learning. Nature Biotechnology, 1–10.
- Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single-cell perturbation responses. Nature Methods 16, 715–721.
- Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A. V. & Theis, F. J. Biologically informed deep learning to query gene programs in single cell atlases. Nature Cell Biology.