Site Reliability Engineer
Department: Engineering
Employment Type: Full Time
Location: London
Reporting To: Head of Site Reliability Engineering
Compensation: £85,000 - £90,000 / year
Description
Reward Gateway, together with Edenred, is a global market leader in benefits and employee engagement. We help our clients and their leaders transform the employee experience to attract, engage and retain top talent through employee benefits, strategic reward and recognition, well-being, and much more.
With our shared missions of ‘Making the World a Better Place to Work’ and ‘Enriching connections, For good’, you’ll be contributing to improving employee engagement and building better, stronger and more resilient organisations to improve people’s daily lives. Our shared mission guides our every action and charts a sustainable path to a better future.
We have a highly talented team who live up to our shared values, bringing to life “Entrepreneurial Spirit”. We love to “Push the Boundaries”, but importantly show “Respect” in how we go about our work. Our “Speak Up” and “Be Human” culture is at the heart of what it is to work with us, and we encourage everyone to bring innovation and “Imagination” to the work they do.
Your Role in our Mission:
We are seeking a Site Reliability Engineer to help us transform our existing operational workloads to an SRE approach. You will embed with our Product Engineering teams to drive high availability, reliability, and uptime. In this role, you will use a code-first approach to reduce toil, advance our observability platforms using SLIs/SLOs, and ensure high compliance. You will actively manage operational success by serving as a key Incident Commander, participating in the on-call rotation, and balancing platform cost efficiency through agile collaboration.
What’s In It For Me?
A chance to be part of an extremely well-established, stable and high-growth ‘Unicorn’ SaaS company with over 50 benefits in our employee benefits package, including:
- A flexible holiday plan of up to 40 days per year
- £400 a year Wellbeing Allowance
- Private Medical Insurance
- Allowance for professional development books, E-books, and podcasts
- Contributory pension scheme
- Employee, friends and family discounts across 1200+ retail, hospitality and lifestyle brands
Click here to see our full suite of benefits and perks dedicated to supporting all aspects of employee wellbeing!
Flexible, Hybrid Working: Collaboration, connection as a team, and strong internal relationships are part of the “RG Magic” that makes our culture thrive. Our teams work from our Dean Street office two days per week.
What You’ll be Doing:
- Integrating tightly with our Product Engineering teams
- Following SRE practices and maintaining high standards of compliance
- Implementing a new standard of observability utilising SLI/SLO/Error Budgets
- Continually evolving our observability platforms for greater coverage
- Using a code-first approach to builds and changes to reduce toil
- Advocating a strong focus on availability, reliability and uptime
- Liaising and embedding with the Engineering teams for the constant evolution of metrics
- Working towards planned roadmap goals
- Actively taking part in the daily stand-ups and keeping sprints on track
- Keeping documentation up to date in Jira and Confluence
- Taking part in SRE Incident Management processes
- Acting as a key Incident Commander within the Incident Management process
- Taking part in the SRE on-call rotation
- Ensuring a focus on cost efficiency for the platforms & services
- Working with team members to foster collaboration and ongoing communication with stakeholders
Experience and Skills You Need in this Role:
- Proven experience in DevOps or SRE, with a keen interest in growing as a Site Reliability Engineer
- Experience with AWS or other cloud providers
- Enterprise experience in HA environments
- Automation skills through Terraform, Python, Bash or similar
- Wide-reaching SRE skills and a deep understanding of SRE practices
- A strong understanding of SQL, PHP, Kubernetes and CI/CD
- Observability product experience (e.g., Datadog)
- Managing services using SLI/SLO & Error Budgets
- Ability to work both independently and as part of a team
- Ability to work under pressure and be highly reliable
- Adaptability and flexibility to change in a fast-moving environment
- An ability to learn new tools and processes quickly and impart that knowledge
The Interview Process:
- Screening interview with the Talent Partner and Head of SRE
- Final interview with the Head of SRE and the Director of Infrastructure
At Reward Gateway | Edenred, we are committed to ensuring an inclusive and accessible recruitment process for all candidates. If you have any specific requirements or need reasonable adjustments at any stage of the recruitment journey, please let your Talent Acquisition Partner know. Your needs are important to us, and we want to ensure an equitable experience for every candidate.
Be comfortable. Be you. We want every employee to feel comfortable bringing their passion, creativity and individuality to work. We value all cultures, backgrounds and experiences, because we believe diversity drives innovation and makes us stronger. Our approach to hiring and building teams is about more than filling roles - it’s about creating an environment where everyone can thrive, feel supported, and contribute to our mission of making the world a better place to work!