Principal Software Engineer - Data Platform

Rapid7
Belfast
7 months ago
Applications closed

As a Principal Engineer, you’ll get the opportunity to be a hands-on engineer, learning best-practice engineering processes and approaches whilst receiving ongoing development through coaching, mentoring and pairing with other engineers on your team. From problem-solving to challenging old ways of thinking, you will have the opportunity to unleash your full potential and creativity whilst working with cutting-edge technologies in a dynamic and collaborative team.

About the Team

The Data Platform team is responsible for building the ETL pipelines that fuel the Data Platform at Rapid7. We move product data into the Data Platform so that product teams can develop new features, enhance existing ones, and build shared experiences that create value for customers across the world.

We have a cutting-edge data stack including Kafka, K8s, Spark and Iceberg.

About the Role

The Principal Engineer role is part of our Data Platform Engineering team. In this role you will focus on helping our product teams move data into our Data Platform for in-product experiences and product analytics.

As a Principal Engineer on the Data Platform Engineering team, you will be responsible for architecting and scaling streaming and batch data pipelines, while also designing the CI/CD infrastructure that ensures efficient development and deployment of data services. You will play a key role in shaping the architecture of our data platform, collaborating with cross-functional teams to deliver highly available, performant, and scalable solutions for both real-time and large-scale data processing.

In this role, you will:

Architect and implement a highly scalable Data Platform that supports Change Data Capture (CDC) using Debezium and Kafka for data replication across different databases and services.

Design and maintain large-scale data lakes using Apache Iceberg, ensuring efficient data partitioning, versioning, and schema evolution to support real-time analytics and historical data access.

Build and optimize CI/CD pipelines for the deployment and automation of data platform services using tools like Jenkins.

Lead the integration of Apache Spark for large-scale data processing and ensure that both batch and streaming workloads are handled efficiently.

Collaborate with our Platform Delivery teams to ensure high availability and performance of the data platform, implementing monitoring, disaster recovery, and automated testing frameworks.

Provide technical leadership and mentoring to junior engineers, promoting best practices in CDC architecture, distributed systems, and CI/CD automation.

Ensure that the platform adheres to data governance principles, including data lineage tracking, auditing, and compliance with regulatory requirements.

Stay informed about the latest advancements in CDC, data engineering, and infrastructure automation to guide future platform improvements.

Work closely with product and data science teams to understand business requirements and translate them into scalable and efficient data platform solutions.

Stay current with the latest trends in data engineering and infrastructure, making recommendations for improvements and introducing new technologies as appropriate.

The skills you’ll bring include:

10+ years of experience in software engineering with a focus on data platform engineering, data infrastructure, or distributed systems.

Expertise in building data pipelines using Apache Kafka or similar for ingesting, processing, and distributing high-throughput data.

Strong experience designing and managing CI/CD pipelines for data platform services using tools such as Jenkins.

Experience with Apache Iceberg (or similar, such as Delta Lake or Apache Hudi) for managing versioned, partitioned datasets in data lakes, together with an understanding of Apache Spark for both batch and streaming data processing, including optimization strategies for distributed data workloads.

Expertise in designing distributed systems and managing high-throughput, fault-tolerant, and low-latency data architectures.

Strong programming skills in Java, Scala, or Python.

Experience with cloud-based environments (AWS, GCP, Azure) and containerized infrastructure using Kubernetes and Docker.

The attitude and ability to thrive in a high-growth, evolving environment.

Collaborative team player who can partner with others and drive toward solutions.

Strong creative problem-solving skills.

Solid communicator with excellent written and verbal communication skills, both within the team and cross-functionally.

Passionate about delighting customers, putting customer needs at the forefront of all decision making.

Excellent attention to detail.


We know that the best ideas and solutions come from multi-dimensional teams. That’s because these teams reflect a variety of backgrounds and professional experiences. If you are excited about this role and feel your experience can make an impact, please don’t be shy - apply today.
