Machine Learning Evaluation Engineer

writewithmarker
London
1 month ago
Applications closed

AI Evaluation, Research Methods, Python, LLM Observability

Salary range

£60,000-£80,000 p.a. + equity, depending on experience (up to £100,000 for candidates with exceptional relevant experience)

Apply

Email us at and tell us a little bit about yourself and your interest in the future of writing, along with your CV or a link to your CV site.

What is Marker?

Marker is an AI-native Word Processor – a reimagining of Google Docs and Microsoft Word.

Join us in building the next generation of agentic AI assistants supporting serious writers in their work.

We are a small, ambitious company using cutting-edge technology to give everybody writing superpowers.

What you'll do at Marker

We are looking for someone with a couple of years' experience in academia or industry who can help us bring rigour and insight to our AI systems through evaluation, research, and observability. You'll work directly with Ryan Bowman (CPO) to help us understand and improve how our AI assists writers. Here are some examples of areas you will be working in:

  • Design and implement evaluation frameworks for complex, subjective AI outputs (like writing feedback that's meant to inspire rather than just correct)
  • Build flexible evaluation pipelines that can assess quality across multiple dimensions - from human preference to actual writing improvement
  • Research and prototype new evaluation methodologies for creative and subjective AI tasks
  • Collaborate with our engineering team to integrate evaluation insights into our development process
  • Help define what "quality" means for different AI outputs and create metrics that actually matter for our users
  • Work on challenging problems like: "How do we automatically evaluate whether an AI comment successfully encourages thoughtful revision?"

What we can offer

  • A calm, human-friendly work environment among kind and experienced professionals
  • Fun, creative, novel, and interesting technical work at the intersection of AI research and product development
  • An opportunity to work with and learn about the latest advancements in AI evaluation and language models
  • Direct collaboration with leadership to shape how we understand and improve our AI systems
  • As much responsibility and as many growth opportunities as you want to take on

Are you a good fit for this role?

In order to be successful in this role, you will recognise yourself in the following:

  • You have experience with AI/ML evaluation methodologies and can speak the language of AI research
  • You've worked hands-on with language models and understand the challenges of evaluating subjective, creative outputs
  • You are a self-starter willing to work independently and at speed - we imagine a 2-week experiment cadence at most.
  • You are familiar with and have worked on related technical systems (evaluation pipelines, data collection tools) but don't need to be a full-stack engineer. You won't be expected to build these alone!
  • You think critically about what metrics actually matter and aren't satisfied with vanity metrics
  • You're comfortable working with ambiguous problems where the "right answer" isn't obvious
  • You have some programming experience (Python preferred) and can work independently on technical projects
  • You're interested in the intersection of AI capabilities and human creativity

An exceptional candidate for this role would be able to demonstrate some of the following:

  • Experience building evaluation systems for generative AI in production environments
  • Knowledge of TypeScript and ability to integrate with our existing systems
  • Background in human-computer interaction, computational creativity, or writing research
  • Experience with A/B testing, statistical analysis, and experimental design
  • Familiarity with modern AI observability and monitoring tools
  • Published research or deep interest in AI evaluation methodologies
  • Interest in writing (fiction, non-fiction, essays)

However, you are NOT expected to:

  • Be a senior software engineer - we're looking for someone who can build evaluation systems, not architect our entire backend
  • Have solved every evaluation problem before - this is cutting-edge work and we're figuring it out together
  • Be experienced with every library in our stack from day one - you'll work closely with Ryan and our engineering team
  • Have a specific degree - we value practical experience and research ability over credentials

Our stack

You'll be working with the following technologies:

  • Our AI engine uses a range of models, including self-hosted and fine-tuned open source models, as well as the latest reasoning models from Anthropic and OpenAI
  • Evaluation and research tools built primarily in Python, with integration into our TypeScript infrastructure
  • Our agentic AI execution platform is written in TypeScript, hosted on Cloudflare Workers
  • Standard ML tooling: various evaluation frameworks, data analysis tools, and monitoring systems
  • Our text editor frontend is a web application built with React, TypeScript and ProseMirror

Apply now!

Interested? Email us at with your CV (or a link to your CV site). Tell us a little bit about yourself and why you'd like to work at Marker!

Please note that this role is currently available only at our London hub, and at this time we are not able to sponsor work visas in the UK.

