Senior AI/HPC Storage Engineer


Job details
  • Recursion
  • London
  • 1 week ago

Your work will change lives. Including your own.

Because of high levels of interest, please apply promptly if you are a good match for this role.

The Impact You'll Make

Recursion is a pioneering TechBio company that leverages AI and machine learning to decode biology and accelerate drug discovery, with data as a key differentiator and value driver. We are seeking a Senior AI/HPC Storage Engineer to join our innovative team. In this role, you will be instrumental in designing, implementing, and managing advanced AI/HPC data systems that propel our groundbreaking drug discovery research.

You will leverage your expertise in infrastructure solutions for Science to ensure the performance, scalability, and reliability of our storage systems. Your work will involve creating and maintaining robust infrastructure, automating processes, and optimizing storage systems to handle massive amounts of data and complex computational workloads, while ensuring high data integrity.

In this role:
  • You will be responsible for designing, implementing, testing, maintaining, and optimizing our data storage infrastructure and services, utilizing an Infrastructure as Code approach across both on-premises and public cloud environments.
  • Your leadership and technical expertise will be key in driving innovation across all storage tiers within our AI/HPC infrastructure, ensuring we deliver a scalable and effective data platform to support our mission.
  • By developing scripts and workflows, you will automate and verify storage infrastructure provisioning and dynamic reconfiguration, enhancing support for our AI/HPC storage environments.
  • Your meticulous attention to detail will be crucial for performance analysis, benchmarking, troubleshooting, and fine-tuning of our data storage systems and services, while efficiently managing user tickets.
  • Your role also includes researching, deploying, and optimizing accessibility, performance, security, and data lifecycle management policies.
  • Regular assessments of our storage platforms' health and operational performance against established metrics will be part of your responsibilities, with a focus on meeting and exceeding operational service level objectives.
  • Finally, as a lead in technical communication and customer collaboration, your efforts will ensure high levels of customer satisfaction.

Location:

This position is based at our headquarters in Salt Lake City, Utah, or in our offices in Toronto, Canada, or London, United Kingdom. We may also consider a hybrid working arrangement. We ask that hybrid employees commit to regular on-site visits for routine work and departmental events.

The Team You'll Join

As a Senior AI/HPC Storage Engineer, you will be a part of our dedicated HPC Engineering and Operations team, reporting directly to the Director. This dynamic team includes 3 experienced Engineers, and with the addition of this role, you'll be part of an empowered, cross-functional unit.

Our HPC team works in a fast-paced, collaborative environment, handling a broad spectrum of Scientific Infrastructure projects. These range from developing advanced, scalable infrastructure to deploying and managing AI/HPC resources and automating operational processes. The team also plays a crucial role in the curation of our vast data platform, which caters to a diverse set of professionals, including biologists, data scientists, and automation engineers.

We're home to BioHive, the industry's most powerful supercomputer, and our HPC team is constantly pushing the boundaries in the field of supercomputing in the TechBio industry. As part of this team, you will collaborate on projects that streamline and optimize our machine learning workflows and scientific computing tasks, driving efficient and transformative solutions.

This is a unique opportunity to join a team that thrives on innovation, collaboration, and inclusivity in a role that is pivotal to our mission.

The Experience You'll Need
  • A minimum of 7 years of experience in managing data storage infrastructure, preferably within global BioPharma organizations.
  • In-depth knowledge of distributed/parallel file systems (IBM Storage Scale GPFS), multi-tier file (NAS), hybrid object storage (MinIO), and storage access and data transfer networking protocols.
  • Experience with RDMA-capable high-speed networking.
  • Extensive experience designing, deploying, testing, supporting, and troubleshooting complex Linux-based computing and data storage environments.
  • Python programming and Bash scripting experience.
  • In-depth hands-on experience in provisioning, configuring, and managing infrastructure through modern CI/CD techniques, GitOps, Infrastructure as Code (IaC), and cloud automation principles.
  • Solid experience with software-defined infrastructure and cloud computing platforms, including Kubernetes, GCP, AWS, and others.
  • Practical knowledge of resource management and job scheduling using Slurm and Kubernetes.
  • Knowledge of container technologies like Apptainer and Docker.
  • Strong verbal and written communication skills for effective documentation and collaboration.
  • Prior experience mentoring, guiding, and cross-training team members.

How You'll be Supported
  • The onboarding process will include peer knowledge transfer sessions, introductions to key stakeholders, and comprehensive exposure to our company culture and processes.
  • You'll have the chance to learn from your colleagues during our regular lunch & learn and tech talk sessions.
  • We offer the opportunity to attend courses for certification in new skills or technologies relevant to your role.
  • If you're keen to hone your leadership skills, you'll have the option to participate in our coaching sessions like BetterUp.
  • To ensure you're always at the forefront of your field, we offer the opportunity to attend conferences.

The Values That We Hope You Share:

We Care:

We care about our drug candidates, our Recursionauts, their families, each other, our communities, the patients we aim to serve and their loved ones. We also care about our work.

We Learn:

Learning from the diverse perspectives of our fellow Recursionauts, and from failure, is an essential part of how we make progress.

We Deliver:

We are unapologetic that our expectations for delivery are extraordinarily high. There is urgency to our existence: we sprint at maximum engagement, making time and space to recover.

Act Boldly with Integrity:

No company changes the world or reinvents an industry without being bold. Boldness must be balanced, not by timidity, but by doing the right thing even when no one is looking.

We are One Recursion:

We operate with a 'company first, team second' mentality. Our success comes from working as one interdisciplinary team.

Recursion is an Equal Opportunity Employer that values diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other characteristic protected under applicable federal, state, local, or provincial human rights legislation.
