DevOps Engineer


Job details
  • microTECH Global LTD
  • London
  • 1 week ago

Job Title: DevOps Engineer

Job Type: Fixed Term Contract

Location: Kings Cross, London, United Kingdom

Duration: 12 Months (Extension Possible)

Budget: £70,000 - £90,000/year


100% On-Site Required // No Sponsorship Available


Our client is a global telecommunications company, and this role sits within their AI Infrastructure Team.


Brief:

We are looking for a highly skilled Senior DevOps Engineer to manage a large-scale AI development and training infrastructure.

The role involves overseeing GPU servers, Kubernetes clusters (Rancher), and storage systems to ensure seamless operations and optimized performance. You will collaborate with development teams, ensuring they have the resources and support needed to run their projects efficiently.

This is a critical technical position requiring expertise in Kubernetes, hardware management, and automation.


Responsibilities:

Kubernetes and Rancher Management: Configure, scale, and maintain Kubernetes clusters and Rancher for multi-cluster management, ensuring optimal performance and resource allocation.

GPU Resource Management: Manage GPU resources and servers, ensuring efficient resource scheduling, load balancing, and performance optimization for AI workloads.

Storage Management: Maintain and optimize large storage systems, ensuring high availability, performance, and data persistence.

DevOps and Automation: Implement CI/CD pipelines and automate infrastructure management using tools such as Terraform, Ansible, Jenkins, and GitLab CI.

Monitoring and Troubleshooting: Set up and manage monitoring and logging systems (e.g., Prometheus, Grafana, ELK) to ensure high availability and rapid issue resolution.

AI Framework Optimization: Collaborate with data scientists and AI developers to optimize AI frameworks (e.g., TensorFlow, PyTorch) for GPU and cluster environments.

Security and Access Management: Implement and manage role-based access control (RBAC) and ensure data security, encryption, and backup procedures are in place.


Key Requirements:

Proven experience in managing large-scale Kubernetes clusters and containerisation technologies (e.g., Docker).

Strong understanding of GPU resource management and optimization for AI workloads.

Expertise in managing large storage systems and implementing data persistence strategies.

Proficiency in scripting and automation (Python, Bash, Go), with experience in infrastructure as code (IaC) using Terraform, Ansible, or similar tools.

Familiarity with deep learning frameworks (e.g., TensorFlow, PyTorch) and experience optimizing them for large-scale environments.

Experience with monitoring and logging tools such as Prometheus, Grafana, and ELK.


Desirables:

Experience with Rancher or another Kubernetes management platform.

Experience managing hybrid cloud environments.

Red Hat Certified System Administrator (RHCSA) certification preferred.

Certified Kubernetes Administrator (CKA) certification preferred.

Mandarin speaker preferred.


Please get in touch to hear more about this incredible position.
