MLOps Data Engineer
Hybrid, 2-3 days a week in London
Outside IR35
We're looking for a Data Engineer with strong MLOps ownership: someone who builds reliable data pipelines and designs, runs, and improves ML pipelines in production. You won't be training models day-to-day like a Data Scientist; instead, you'll enable Data Science by delivering high-quality datasets, reproducible training pipelines, robust deployments, and monitoring that keeps ML systems healthy and trustworthy.
What you'll do
Design, build, and operate scalable data pipelines for ingestion, transformation, and distribution
Develop and maintain ML pipelines end-to-end: data preparation, feature generation, training orchestration, packaging, deployment, and retraining
Partner closely with Data Scientists to productionise models: standardise workflows, ensure reproducibility, and reduce time-to-production
Build and maintain MLOps automation: CI/CD for ML, environment management, artefact handling, versioning of data/models/code
Implement observability for ML systems: monitoring, alerting, logging, dashboards, and incident response for data and model health
Establish best practices for data quality and ML quality: validation checks, pipeline tests, lineage, documentation, and SLAs/SLOs
Optimise cost and performance across data processing and training workflows (e.g., Spark tuning, BigQuery optimisation, compute autoscaling)
Ensure secure, compliant handling of data and models, including access controls, auditability, and governance practices
What makes you a great fit
4+ years of experience as a Data Engineer (or ML Platform / MLOps Engineer with strong DE foundations) shipping production pipelines
Strong Python and SQL skills; ability to write maintainable, testable, production-grade code
Solid understanding of MLOps fundamentals: model lifecycle, reproducibility, deployment patterns, and monitoring needs
Hands-on experience with orchestration and distributed processing in a cloud environment
Experience with data modelling and ETL/ELT patterns; ability to deliver analysis-ready datasets
Familiarity with containerisation and deployment workflows (Docker, CI/CD, basic Kubernetes/serverless concepts)
Strong GCP experience with services such as Vertex AI, BigQuery, Composer, Dataproc, Cloud Run, Dataplex, and Cloud Storage, or strong experience with at least one major cloud provider (GCP, AWS, or Azure)
Strong troubleshooting mindset: ability to debug issues across data, infra, pipelines, and deployments
Nice to have / big advantage
Experience with ML tooling such as MLflow (tracking/registry), Vertex AI / SageMaker / Azure ML, or similar platforms
Experience building and maintaining feature stores (e.g., Feast, Vertex Feature Store)
Experience with data/model validation tools (e.g., Great Expectations, TensorFlow Data Validation, Evidently)
Knowledge of model monitoring concepts: drift, data quality issues, performance degradation, bias checks, and alerting strategies
Infrastructure-as-Code (Terraform) and secrets management / IAM best practices
Familiarity with governance/compliance standards and audit requirements