Principal Engineer, Reliability Engineering

London
3 weeks ago
Applications closed

Ready for a challenge? 

Then Just Eat Takeaway.com might be the place for you. We’re a leading global online food delivery platform, and our vision is to empower everyday convenience. 

Whether it’s a Friday-night feast, a post-gym poke bowl, or grabbing some groceries, our tech platform connects tens of millions of customers with hundreds of thousands of restaurant, grocery and convenience partners across the globe.

About this role

We are seeking a seasoned Principal Engineer to lead the design, development, and evolution of our Observability Platform, ensuring it meets the needs of our rapidly scaling systems and engineering teams. This role will also focus on leveraging Machine Learning (ML) and Artificial Intelligence (AI) to deliver advanced insights that proactively improve system health and drive down Mean Time to Detection (MTTD) and Mean Time to Resolution (MTTR). The ideal candidate will be a visionary technologist with deep expertise in observability, monitoring, and distributed systems, capable of driving strategy, architecture, and execution for a world-class platform.

These are some of the key ingredients to the role: 

Platform Leadership

Architect, design, and implement a cutting-edge Observability Platform to support metrics, logs, traces, and events at scale.

Integrate ML/AI-driven solutions to enhance anomaly detection, root cause analysis, and predictive insights.

Lead the development and adoption of platform capabilities to ensure system health, reliability, and performance.

Establish and evolve platform standards and best practices to align with the company’s overall engineering goals.

Strategic Initiatives

Collaborate with engineering teams to define the observability strategy, ensuring alignment with business and operational objectives.

Identify and integrate the latest observability technologies, including AI-based analytics, to improve system insights and developer productivity.

Drive a platform-first mindset, ensuring observability is treated as a foundational capability across all services.

Implement real-time insights and proactive monitoring powered by AI/ML to reduce detection and resolution times.

Operational Excellence

Ensure the Observability Platform is highly available, performant, and secure across all environments.

Optimize data collection, processing, and storage to balance performance with cost efficiency.

Define SLAs, SLOs, and SLIs for observability services to support reliability engineering practices.

Continuously improve MTTD and MTTR by leveraging advanced AI/ML models for predictive analysis and automated responses.

Mentorship and Collaboration

Act as a mentor and technical leader for engineers, fostering a culture of learning, innovation, and excellence.

Collaborate with stakeholders, including Site Reliability Engineering (SRE), infrastructure, and application teams, to gather requirements and deliver impactful solutions.

Advocate for observability as a critical enabler of operational success across the organization.

What will you bring to the table? 

Extensive Engineering Experience: Proven experience in building and scaling observability platforms in a cloud-native environment.

Observability Expertise: Deep understanding of observability pillars (metrics, logs, traces) and related tools (e.g., Prometheus, Grafana, OpenTelemetry, Jaeger, Kibana Elastic Stack).

AI/ML Proficiency: Hands-on experience integrating ML/AI models into observability systems to drive advanced insights, anomaly detection, and predictive analysis.

Distributed Systems Knowledge: Strong expertise in designing scalable and reliable systems for high-throughput data collection and processing.

Programming Skills: Proficiency in one or more languages (e.g., Go, Python, Java) and in Infrastructure-as-Code tooling (e.g., Terraform, Pulumi), with a focus on building robust platforms.

Cloud Proficiency: Hands-on experience with cloud platforms (e.g., AWS, GCP, Azure) and Infrastructure-as-Code tools (e.g., Terraform, Pulumi).

Leadership and Mentorship: Experience leading and mentoring multicultural, cross-functional engineering teams, driving technical decisions, and delivering large-scale initiatives.

Cost Optimization: Familiarity with strategies for managing the costs associated with observability data storage, processing, and analysis.

Desirable Qualifications:

Expertise in applying AI/ML for proactive alerting, root cause analysis, and predictive scaling.

Experience with service mesh technologies (e.g., Istio, Linkerd) and their observability implications.

Contributions to open-source observability or ML/AI projects.

Proficiency with container technologies (Docker, Kubernetes), best-practice configurations, and their implications for observability and monitoring.

Understanding of statistical analysis, data mining, and feature engineering techniques to extract meaningful insights from observability data.  

At JET, this is on the menu: 

Our teams forge connections internally and work with some of the best-known brands on the planet, giving us truly international impact in a dynamic environment. 

Fun, fast-paced and supportive, the JET culture is about movement, growth and about celebrating every aspect of our JETers. Thanks to them we stay one step ahead of the competition.

Inclusion, Diversity & Belonging 

No matter who you are, what you look like, who you love, or where you are from, you can find your place at Just Eat Takeaway.com. We’re committed to creating an inclusive culture, encouraging diversity of people and thinking, in which all employees feel they truly belong and can bring their most colourful selves to work every day. 

What else is cooking? 

Want to know more about our JETers, culture or company? Have a look at our where you can find people's stories, blogs, podcasts and more JET morsels
