ML Compiler Engineer and ML Runtime Engineer Jobs UK: Salaries, Skills and How to Break In (2026 Guide)
Published by machinelearningjobs.co.uk — the UK's specialist job board for Machine Learning Engineers, ML Scientists and AI Researchers. This guide was written by the machinelearningjobs.co.uk editorial team and is updated for 2026.
The Short Answer
ML Compiler Engineer and ML Runtime Engineer are two of the most in-demand and least understood roles in the UK machine learning job market. ML Compiler Engineers build the software toolchains that translate high-level ML models into optimised instructions for specific hardware — working with frameworks like MLIR, TVM and XLA. ML Runtime Engineers build and maintain the execution environments that run those models efficiently at scale, working across CUDA, ROCm, Metal and custom silicon. Both roles sit at the intersection of systems programming and machine learning and command salaries of £80,000–£140,000 in the UK, and companies including Fractile, Arm, Graphcore and Qualcomm are actively hiring for them. Neither role typically requires a machine learning research background — strong systems programming and compiler engineering skills are the primary entry point.
If you have been searching for machine learning jobs in the UK and noticed roles with titles like ML Compiler Engineer, ML Runtime Engineer or Machine Learning Systems Engineer, you are looking at one of the most specialised — and most undersupplied — corners of the UK AI jobs market.
These are not research roles, and they are not standard ML engineering positions. They exist at the boundary of hardware and software, where the performance of AI systems is won or lost. Understanding what they involve, what skills they require, and which companies are hiring is the first step to positioning yourself for one of the most financially rewarding technical career paths in UK machine learning.
This guide covers both roles in full — what they are, how they differ, what tooling and skills employers actually want, what they pay, and how to break in from adjacent engineering disciplines.
What Is an ML Compiler Engineer?
An ML Compiler Engineer builds the software infrastructure that takes a machine learning model — written in PyTorch, JAX, TensorFlow or a similar framework — and translates it into optimised low-level code that runs efficiently on a specific hardware target. That target might be a GPU, a CPU, a custom accelerator chip, or a neuromorphic processor.
The job is, at its core, compiler engineering applied to the specific constraints and characteristics of neural network computation. Where a traditional compiler takes general-purpose code and produces efficient machine instructions, an ML compiler takes computation graphs — the mathematical operations that constitute a neural network — and applies a sequence of transformations and optimisations to produce fast, hardware-aware execution code.
In practice, ML Compiler Engineers work on problems like:
Graph optimisation: Fusing operations, eliminating redundant computation, and restructuring the execution graph to minimise memory movement and maximise hardware utilisation
Code generation: Producing efficient low-level code (PTX for NVIDIA GPUs, SPIR-V for Vulkan, LLVM IR for CPUs) from optimised computation graphs
Quantisation and precision management: Implementing mixed-precision inference pipelines that trade numerical precision for speed and memory efficiency
Hardware back-end development: Building and maintaining compiler back-ends that target specific accelerator architectures — including proprietary silicon where public documentation is limited
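To make the first of those concrete, here is a deliberately toy sketch of operation fusion: runs of adjacent elementwise operations are collapsed into a single fused node so intermediate results never round-trip through memory. Real compilers do this as a pass over an intermediate representation rather than a flat list, and every name below is invented for illustration.

```python
# Toy operation-fusion pass over a flat "graph" of op names.
# Illustrative only: real ML compilers fuse at the IR level,
# with legality checks on shapes, layouts and side effects.

ELEMENTWISE = {"mul", "add", "relu"}  # ops assumed legal to fuse


def fuse_elementwise(graph):
    """Collapse runs of adjacent elementwise ops into single fused nodes."""
    fused, run = [], []
    for op in graph:
        if op in ELEMENTWISE:
            run.append(op)  # extend the current fusable run
        else:
            if run:  # flush the pending run before a non-fusable op
                fused.append("fused(" + "+".join(run) + ")")
                run = []
            fused.append(op)
    if run:  # flush any trailing run
        fused.append("fused(" + "+".join(run) + ")")
    return fused


graph = ["matmul", "mul", "add", "relu", "matmul", "add"]
print(fuse_elementwise(graph))
# -> ['matmul', 'fused(mul+add+relu)', 'matmul', 'fused(add)']
```

The payoff in a real compiler is that the fused region becomes one kernel launch reading and writing memory once, instead of three kernels each materialising an intermediate tensor.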
The dominant infrastructure in this space is MLIR (Multi-Level Intermediate Representation), the compiler framework developed by Google and now maintained as an LLVM sub-project. Familiarity with MLIR is the most consistently requested technical requirement in UK ML compiler job adverts. Apache TVM and OpenXLA/XLA are the other major frameworks — XLA being the compiler underlying JAX and TensorFlow, TVM being the open-source cross-platform ML compiler most widely used in academic and edge deployment research.
What Is an ML Runtime Engineer?
An ML Runtime Engineer builds and maintains the execution environment — the runtime — that sits between the ML model and the hardware it runs on. Where the compiler produces optimised code, the runtime is responsible for executing it correctly and efficiently: managing memory allocation, scheduling operations across compute units, handling asynchronous execution, and ensuring that models run at the performance the compiler promised.
In production ML systems, runtime engineering is where theoretical performance becomes real-world throughput. A model might be well-compiled, but if the runtime is inefficient at managing GPU memory, handling batching, or orchestrating multi-device execution, the system will underperform regardless of how good the compiler is.
ML Runtime Engineers work on:
Memory management: Designing allocation strategies that minimise fragmentation and data movement across the memory hierarchy (HBM, SRAM, host DRAM)
Kernel scheduling: Orchestrating the execution of low-level compute kernels across multiple cores, streams or devices to maximise utilisation
Multi-device and distributed execution: Enabling models to run across multiple GPUs, TPUs or custom accelerators, including handling communication and synchronisation between devices
Inference serving infrastructure: Building the low-level components of inference serving systems — batching logic, request queuing, latency-aware scheduling — that sit beneath higher-level serving frameworks like NVIDIA's Triton Inference Server or TorchServe
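The batching logic in that last bullet can be sketched in a few lines. This is a minimal, single-threaded illustration of the core policy only, assuming a hypothetical rule of "dispatch when the batch is full or when a new request arrives too long after the batch's oldest request"; production batchers run concurrently and interact with the scheduler and memory allocator.

```python
# Minimal sketch of dynamic batching policy for an inference server.
# Illustrative and hypothetical: real batchers are concurrent and
# latency budgets, sizes and names vary by system.
from collections import deque


def drain_batches(requests, max_batch_size, max_wait):
    """Group (arrival_time, request_id) pairs into batches.

    A batch is dispatched when it is full, or when the next request
    arrived more than max_wait after the batch's oldest request.
    """
    queue = deque(requests)
    batches, current = [], []
    while queue:
        t, rid = queue.popleft()
        if current and (len(current) == max_batch_size
                        or t - current[0][0] > max_wait):
            batches.append([r for _, r in current])  # dispatch
            current = []
        current.append((t, rid))
    if current:
        batches.append([r for _, r in current])
    return batches


reqs = [(0.0, "a"), (0.01, "b"), (0.02, "c"), (0.5, "d"), (0.51, "e")]
print(drain_batches(reqs, max_batch_size=4, max_wait=0.1))
# -> [['a', 'b', 'c'], ['d', 'e']]
```

The design tension this captures is the one runtime engineers live with daily: larger batches improve hardware utilisation, but waiting to fill them adds latency to every request already queued.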
The dominant execution environment for ML workloads remains NVIDIA CUDA, and deep CUDA programming experience is the most common hard requirement in UK ML Runtime Engineer job adverts. AMD ROCm is increasingly relevant as organisations diversify away from NVIDIA hardware. Apple Metal matters for on-device inference on Apple silicon. For companies building custom silicon — an increasingly active area in the UK — runtime engineers work directly against proprietary APIs and hardware specifications.
How Do the Two Roles Differ?
The clearest way to think about the distinction is this: the ML compiler produces the code; the ML runtime executes it. In practice, the roles overlap significantly — particularly at smaller companies and deep tech startups where one engineer may own the full stack from graph optimisation through to kernel execution. At larger organisations, the roles are more separated, with dedicated compiler and runtime teams that collaborate closely.
The skill profile differs at the margins. ML Compiler Engineers typically have stronger backgrounds in formal compiler theory, LLVM infrastructure, and intermediate representation design. ML Runtime Engineers tend to have deeper GPU programming experience, stronger backgrounds in systems programming for latency-sensitive workloads, and more experience with profiling and performance debugging at the hardware level.
Both roles require strong C++ and Python, a solid understanding of computer architecture, and the ability to reason about performance at multiple levels of the stack simultaneously.
Skills UK Employers Are Actually Hiring For
Based on live job adverts across the UK machine learning jobs market in 2026, the skills most consistently requested for ML Compiler and ML Runtime roles are:
Core programming: C++ (essential for both roles), Python (essential), CUDA (essential for runtime, strongly preferred for compiler)
Compiler frameworks: MLIR (most in-demand), LLVM, Apache TVM, OpenXLA/XLA, Halide
Runtime and execution: CUDA, ROCm, Metal, Triton (OpenAI's GPU programming language), custom accelerator SDKs
ML framework internals: PyTorch internals / TorchDynamo / torch.compile, JAX, TensorFlow graph execution
Hardware and architecture: GPU architecture (NVIDIA Hopper, Ampere; AMD RDNA/CDNA), CPU vectorisation (AVX-512, ARM SVE), custom accelerator architectures, memory hierarchy design
Desirable additional skills: Formal verification, hardware description languages (for roles at silicon companies), distributed systems, profiling tools (NVIDIA Nsight, AMD ROCm Profiler, VTune)
A machine learning research background is explicitly not required for either role in the majority of UK job adverts — employers consistently prioritise systems programming depth over ML theory knowledge.
ML Compiler and ML Runtime Engineer Salaries in the UK
Both roles command a significant premium over standard ML engineering salaries, reflecting the scarcity of qualified candidates and the critical nature of the work.
ML Compiler Engineer:
Mid-level (3–6 years): £80,000–£110,000 base
Senior (6+ years): £110,000–£140,000 base
Principal / Staff: £140,000–£170,000+ base, often with equity
ML Runtime Engineer:
Mid-level (3–6 years): £75,000–£105,000 base
Senior (6+ years): £105,000–£135,000 base
Principal / Staff: £135,000–£160,000+ base, often with equity
Contract and day-rate work in both specialisms is increasingly available as organisations run time-limited inference optimisation programmes. Senior contractor day rates for ML Compiler and Runtime engineers with MLIR or CUDA expertise typically range from £700 to £1,100 per day depending on specialisation and location.
London commands a premium of approximately 10–15% over equivalent roles in Oxford, Cambridge, Bristol and Edinburgh — though the latter three cities have growing concentrations of ML systems hiring driven by spinout activity from their university research ecosystems.
Companies Hiring ML Compiler and ML Runtime Engineers in the UK
Fractile is one of the most exciting employers in this space in the UK right now. Fractile is building hardware and software for extremely fast, efficient LLM inference — working on the full stack from custom silicon through to the runtime and compiler layers that make LLMs run at unprecedented speed and efficiency. For ML Compiler and ML Runtime Engineers who want to work on genuinely novel systems at the frontier of what is possible with large model inference, Fractile represents one of the most technically challenging and consequential opportunities in the UK market. They are actively hiring for both Senior ML Compiler Engineers and Senior ML Runtime Engineers.
Arm is the bedrock of the UK semiconductor and ML systems ecosystem. Arm's ML group works on compiler and runtime infrastructure for neural network execution across Arm's vast installed base of processors — from mobile and edge devices through to server-class silicon. Arm's compiler teams work extensively with MLIR and TVM, and their runtime teams own the Arm Compute Library and related inference execution infrastructure.
Graphcore — now part of SoftBank — pioneered the Intelligence Processing Unit (IPU) architecture specifically for ML workloads. Their compiler and runtime teams have built an entirely custom software stack (the Poplar SDK and PopART framework) for IPU execution, making Graphcore one of the most technically demanding and rewarding environments for ML systems engineers in the UK.
Qualcomm has a significant UK engineering presence, particularly in Cambridge, working on ML compiler and runtime infrastructure for the Snapdragon and AI 100 accelerator product lines. Their compiler team works on MLIR-based toolchains for on-device inference.
Imagination Technologies, based in Kings Langley, develops GPU and neural network accelerator IP and hires ML compiler engineers for their PowerVR and IMG Series neural network SDK work.
Amazon Web Services (AWS) UK hires ML systems engineers for Trainium and Inferentia compiler and runtime work as part of the Annapurna Labs / Neuron SDK team — a growing programme with UK-based engineering headcount.
Beyond these named employers, a growing number of UK deep tech startups in the LLM inference, edge AI and custom silicon spaces are building ML compiler and runtime teams — many spun out of university research groups at Oxford, Cambridge, Edinburgh and Imperial.
How to Break Into ML Compiler or ML Runtime Engineering
Neither role has a single defined entry path, but the most common backgrounds are:
From compiler engineering: Software engineers with experience in LLVM, GCC or proprietary compiler stacks who develop ML framework knowledge. This is the most direct path to ML Compiler roles.
From HPC or GPU programming: Engineers with high-performance computing, CUDA or parallel systems backgrounds who develop ML execution knowledge. This is the most direct path to ML Runtime roles.
From ML framework engineering: Engineers who have worked on PyTorch, JAX or TensorFlow internals and want to go deeper into the systems layer.
From PhD research: Candidates with PhDs in compilers, computer architecture, programming languages or ML systems are highly sought after by companies like Fractile, Arm and Graphcore for senior and research-engineering roles.
The most effective way to demonstrate readiness for either role is through hands-on systems work: contributing to MLIR, TVM or PyTorch compiler/runtime components on GitHub; building and benchmarking custom CUDA kernels; or completing projects that involve optimising model execution on specific hardware. These are the signals UK employers in this space consistently prioritise over academic credentials alone.
Why These Roles Are So Hard to Fill
The UK pipeline of ML Compiler and ML Runtime engineers is genuinely constrained. University ML curricula overwhelmingly focus on model development, training techniques and applied ML — not on the systems layer. The engineers who end up in compiler and runtime roles typically arrive from adjacent fields rather than through direct training.
This supply constraint is structural, not cyclical. As UK companies scale LLM inference, deploy AI to edge devices, and develop custom silicon, the demand for ML systems engineers is growing faster than the pipeline can supply them. For engineers with the right background, that imbalance represents a sustained career and compensation advantage.
Finding ML Compiler and ML Runtime Jobs in the UK
The most effective way to find live ML Compiler Engineer and ML Runtime Engineer roles in the UK is through a specialist machine learning job board. machinelearningjobs.co.uk aggregates roles from across the UK ML market — including systems-layer roles at companies like Fractile, Arm, Graphcore and Qualcomm — and allows job alerts to be set for specific role types so you hear about new openings immediately.
Direct employer career pages for Fractile, Arm, Graphcore and Qualcomm are also worth monitoring, as ML systems roles at these companies sometimes go live without broad distribution.
Frequently Asked Questions: ML Compiler and ML Runtime Engineer Jobs UK
Do I need a machine learning background to become an ML Compiler Engineer? No. The majority of UK employers hiring for ML Compiler roles prioritise compiler engineering depth — particularly LLVM and MLIR experience — over ML research knowledge. A strong background in compiler construction, programming language implementation or systems programming is typically more valued than ML model training experience.
What is MLIR and why does it matter for ML Compiler jobs? MLIR (Multi-Level Intermediate Representation) is the dominant infrastructure framework for building ML compilers. Developed originally at Google and now an LLVM sub-project, MLIR provides a flexible, extensible IR framework that allows compiler authors to define custom abstractions for different hardware targets. It underpins compiler stacks at Google (XLA), Modular (Mojo/MAX), and a growing number of custom silicon companies. Proficiency with MLIR is the single most consistently requested technical skill in UK ML Compiler Engineer job adverts in 2026.
How much do ML Compiler Engineers earn in the UK? Mid-level ML Compiler Engineers in the UK typically earn between £80,000 and £110,000 base salary. Senior engineers earn £110,000–£140,000, with principal and staff-level roles at well-funded deep tech companies often exceeding £140,000 plus equity. Contract day rates for experienced ML Compiler engineers typically range from £700 to £1,000 per day.
How much do ML Runtime Engineers earn in the UK? Mid-level ML Runtime Engineers typically earn £75,000–£105,000 base. Senior engineers earn £105,000–£135,000, with staff-level roles at leading companies reaching £160,000 or above with equity. CUDA expertise commands a consistent premium across the market.
What is the difference between an ML Runtime Engineer and an MLOps Engineer? These are distinct roles. MLOps Engineers focus on the infrastructure and tooling for training pipelines, model deployment, monitoring and CI/CD for ML systems — typically working at a higher level of abstraction using platforms like Kubernetes, Kubeflow or MLflow. ML Runtime Engineers work at the low-level systems layer: writing CUDA kernels, managing GPU memory allocation, and building the execution infrastructure that runs model inference at peak efficiency. The two roles rarely overlap.
Can I become an ML Runtime Engineer from a CUDA programming background? Yes — a strong CUDA programming background is one of the most valued entry points into ML Runtime Engineering. Employers look for candidates who can demonstrate deep understanding of GPU memory hierarchies, kernel optimisation, warp-level programming, and experience profiling and tuning GPU workloads. Supplementing CUDA experience with knowledge of ML model execution patterns (batching, KV cache management for transformers, quantisation) makes for a very competitive profile.
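Of the execution patterns mentioned above, quantisation is the easiest to sketch. Below is a back-of-the-envelope illustration of symmetric int8 quantisation: floats are mapped to 8-bit integers via a single scale factor. Real pipelines calibrate scales per channel and fuse the rescale into adjacent kernels; everything here, including the function names, is illustrative only.

```python
# Symmetric int8 quantisation round-trip, as a toy illustration.
# Real inference pipelines use per-channel scales, calibration data
# and hardware-specific rounding modes.

def quantise_int8(values):
    """Map floats to int8 codes using one symmetric scale factor."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantise(q, scale):
    """Recover approximate floats from int8 codes."""
    return [x * scale for x in q]


weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)

# Rounding error is bounded by half a quantisation step.
assert all(abs(a - b) <= scale / 2 + 1e-12
           for a, b in zip(weights, restored))
```

Being able to explain the trade-off this encodes — 4x smaller weights and faster integer arithmetic in exchange for bounded precision loss — is exactly the kind of ML execution knowledge that complements a CUDA background.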
Which UK cities have the most ML Compiler and Runtime jobs? London has the highest concentration of live roles, followed by Cambridge (strong due to Arm, Qualcomm and deep tech spinouts), Oxford (Fractile and university spinout activity), Edinburgh (strong ML systems research community) and Bristol. Remote-friendly roles exist but are less common than in standard ML engineering, as the work often involves close collaboration with hardware teams and access to specialist compute infrastructure.
Where is the best place to search for ML Compiler and ML Runtime jobs in the UK? machinelearningjobs.co.uk is the UK's specialist job board for Machine Learning Engineers, ML Scientists and AI Researchers, and aggregates live roles including ML Compiler and ML Runtime positions from companies across the UK market. Setting up a job alert for these specific role types ensures you are notified as soon as relevant roles go live.
Summary
ML Compiler Engineer and ML Runtime Engineer are two of the most technically demanding, financially rewarding and undersupplied roles in the UK machine learning jobs market. They require deep systems programming expertise — particularly in MLIR, LLVM, CUDA and GPU architecture — rather than ML research backgrounds, and they sit at the critical layer of the stack where AI performance is determined. Companies including Fractile, Arm, Graphcore and Qualcomm are actively hiring in the UK, and the structural supply constraint in both roles means the career and compensation outlook for qualified engineers is exceptionally strong.
If you are ready to explore live roles, search ML Compiler Engineer and ML Runtime Engineer jobs at machinelearningjobs.co.uk — the UK's specialist job board for Machine Learning Engineers, ML Scientists and AI Researchers.