Site Reliability Engineer (SRE)
Location: Remote – (Must be UK Based)
Salary: Upwards of £100,000 per annum + 15% bonus
A global leader in cybersecurity is looking for an exceptional Site Reliability Engineer to support their mission of safeguarding data and applications worldwide. In this critical role, you’ll contribute to advancing infrastructure with cutting-edge automation and reliability practices, ensuring world-class performance and security for a widely distributed network.
About the Role
Joining an elite Infrastructure and Cloud Operations team composed of top-tier experts from major tech companies, the Site Reliability Engineer will focus on architecting a next-generation Network Automation platform. This platform will be a cornerstone in scaling infrastructure, enhancing reliability, and ensuring optimal performance across a global network.
You will work closely with the Network Automation team, applying core Site Reliability Engineering principles to establish availability metrics (SLIs/SLOs) and drive infrastructure management through automation. This role provides the opportunity to influence major infrastructure decisions, contributing to reliability modelling, data-driven improvements, and operational excellence.
Responsibilities:
- Collaborate with the Network Automation team to design and deploy scalable, automated infrastructure solutions.
- Develop and maintain baselines for network/system/application performance to ensure seamless operations.
- Execute pre-launch planning, validation, and reviews for both existing and new services.
- Manage escalations, incident response, and root cause analyses, and participate in a 24x7 on-call rotation.
Experience:
- Expertise in Linux systems, network programming, and core protocols like TCP, UDP, DNS, and HTTP, and ideally BGP/Anycast routing.
- Experience working within Cloud, Web or CDN-scale infrastructure, ideally more than 3 years' experience.
- Proven experience with Infrastructure as Code (e.g., Ansible, SaltStack), CI/CD processes (e.g., GitLab, Jenkins) and monitoring tools (e.g., Prometheus, Grafana).
- Experience with big data technologies, containerization (Docker, Kubernetes), and data pipeline development.
- Proven expertise in data telemetry, modelling, and visualization for operational insights.
- Excellent communication and documentation skills with a collaborative, cross-functional approach.
Apply Today
If you are a Site Reliability Engineer looking to take on a leadership role and be part of a young, growing team, please get in touch.
Email:
LinkedIn: Jared White