Data Engineer

Ripjar
Cheltenham
1 week ago

About Ripjar

Ripjar is a UK-based software company that uses data and machine learning technologies to help companies and governments prevent financial crimes and terrorism. For example, our software has been helping many financial institutions and corporations comply with the recent sanctions on Russian entities.

Ripjar originally spun out from GCHQ and now has 130 staff, based in Cheltenham and remotely, and is beginning to expand globally. We have two successful, inter-related products: Labyrinth Screening and Labyrinth Intelligence. Labyrinth Screening allows companies to monitor their customers or suppliers for entities that they are not allowed to, or do not want to, do business with (for ethical or environmental reasons). Labyrinth Intelligence empowers organisations to perform deep investigations into varied datasets to find interesting patterns and relationships.

Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an always-growing archive of 8 billion news articles in (nearly!) every language in the world going back over 30 years, sanctions and watchlist data provided by governments, and records on 250 million organisations, with ownership data, from global corporate registries.

About the Role

Ripjar has several engineering teams that are responsible for the processing infrastructure and many of the analytics that collect, organise, enrich and distribute this data. Central to almost all of Ripjar’s systems is the Data Collection Hub, which captures data from various sources, processes and analyses it, and then forwards it on to multiple end-user applications. The system is developed and maintained by 3 teams of software engineers, data engineers, and data scientists.

We are looking for an individual with at least 2 years' industrial or commercial experience in data processing systems to come in and add to this team. Ripjar values engineers who are thoughtful, thorough problem solvers and who can learn new technologies, ideas and paradigms quickly.

Technology Stack

The specific technical skills you possess aren’t as important to us as the ability to understand complex systems and get to the heart of problems. We do, however, expect you to be fluent in at least one programming language, to have at least two years' experience working with moderately complex software systems in production, and to have a curiosity and interest in learning more.

In this role, you will be using Python (specifically PySpark) and Node.js for processing data, backed by various Hadoop stack technologies such as HDFS and HBase. MongoDB and Elasticsearch are used for indexing smaller datasets. Airflow and NiFi are used to co-ordinate the processing of data, while Jenkins, Jira, Confluence and GitHub are used as support tools. We use Ansible to manage configuration and deployments. Most developers use MacBooks for development and our servers all run the CentOS flavour of Linux.

If you have any experience in this tech stack, that’s useful, as is a numerate degree, such as Computer Science, but neither is required for the role.

Responsibilities:

  • Contributing production-quality code and unit tests to our Data Collection Hub
  • Contributing improvements to the test and build pipelines
  • Considering the impact and implications of changes and communicating these clearly
  • Helping to support the data processing pipelines as needed
  • Modelling data in the best way for specific business needs
  • Staying abreast of the latest developments in Data Engineering to contribute to Ripjar’s best practices
  • Adding to Ripjar’s culture and making it a fun and rewarding place to work!

Requirements:

  • You will be using Python (specifically PySpark) and Node.js for processing data.
  • You will be using Hadoop stack technologies such as HDFS and HBase.
  • Experience using MongoDB and Elasticsearch for indexing smaller datasets would be beneficial.
  • Experience using Airflow and NiFi to co-ordinate the processing of data would be beneficial.
  • You will be using Ansible to manage configuration and deployments.

Salary and Benefits

  • Salary DOE
  • 25 days annual leave + your birthday off, in addition to bank holidays, rising to 30 days after 5 years of service
  • Remote working
  • Private family healthcare
  • Employee Assistance Programme
  • Company contributions to your pension
  • Pension salary sacrifice
  • Enhanced maternity/paternity pay
  • The latest tech, including a top-of-the-range MacBook Pro
