Job summary

Senior Site Reliability Engineer (SRE)

Requirements and responsibilities

Security Clearance Requirements

Candidates must:

  • Have the right to work in the UK
  • Have lived in the UK continuously for the past 5 years
  • Not have spent more than 6 months outside the UK in total during that period
  • Be willing to undergo security vetting as part of the onboarding process

Key Responsibilities

  • Operate, harden and extend production OpenShift / OKD / Kubernetes clusters across on-premises and hybrid environments.
  • Support the migration from VMware to KVM, helping modernise the underlying compute and storage layer.
  • Own and improve CI/CD processes across the full lifecycle of platform and application components.
  • Work with platform and application engineers to support cloud-native delivery using tools such as Helm and Kustomize.
  • Develop and mature GitOps deployment practices using tools such as Argo CD or Flux.
  • Maintain and improve core platform services including identity, ingress, observability, certificate management, service mesh and container registry capabilities.
  • Build and operate observability across logs, metrics, traces, alerting, SLOs and error budgets.
  • Improve platform hardening in line with secure and regulated environment requirements, including network policy, SELinux, image provenance, secret management and audit.
  • Automate repeatable operational tasks using tools such as Ansible, Terraform, Helm, Kustomize, Go, Python or equivalent technologies.
  • Lead incident response activity, support blameless post-mortems and drive systemic fixes.
  • Partner with networking and security teams on platform integration, segmentation, load balancing and accreditation evidence.
  • Create and maintain clear technical documentation, runbooks, design notes and operational guidance.
  • Mentor other engineers and act as a senior technical authority across cloud and Kubernetes operations.
  • Participate in an on-call rota, with appropriate compensation.

Success in This Role Looks Like

  • A more reliable, secure and measurable production Kubernetes estate.
  • Improved platform observability, with meaningful alerting, SLOs and trend data that engineering teams actively use.
  • Progress against the VMware to KVM migration, with a clear and automated path for the underlying infrastructure layer.
  • A mature GitOps approach covering platform and application components, including rollback, drift detection and operational control.
  • Improved CI/CD practices that help teams move at pace while considering security, QA and compliance earlier in the lifecycle.
  • Well-documented, supportable and scalable platform services.
  • Stronger incident response, clearer runbooks and post-mortems that lead to real operational improvements.
  • Recognition as a technical authority for Kubernetes, cloud and platform operations across the organisation.

Essential Experience & Skills

  • Strong experience running production Kubernetes environments, not just consuming or deploying into them.
  • Strong Linux fundamentals, including systemd, networking, storage and performance troubleshooting.
  • Experience with at least one Kubernetes distribution such as OKD, OpenShift, vanilla Kubernetes, Rancher, EKS, AKS or GKE.
  • Solid infrastructure as code experience, including Ansible plus Terraform or equivalent, alongside tools such as Helm and Kustomize.
  • GitOps and CI/CD experience managing full application and component lifecycles, using tools such as Argo CD, Flux, GitHub Actions or similar.
  • Observability experience using Prometheus, Grafana, Elastic Stack / LGTM, OpenTelemetry or similar.
  • Experience working with identity and access technologies such as OIDC, SAML, SCIM or Keycloak.
  • Experience with virtualisation or infrastructure platforms such as KVM, libvirt or VMware.
  • Scripting or tooling experience using Go, Python, shell scripting or similar.
  • Strong troubleshooting, problem-solving and analytical skills.
  • Experience working in secure, regulated or enterprise-scale environments.
  • Strong communication skills, with the ability to produce clear documentation, runbooks, post-mortems and technical guidance.
  • Eligible to hold UK SC clearance.

Desirable (Not Essential)

  • Specific OpenShift or OKD experience, including operators, MachineConfig or SCCs.
  • Service mesh experience such as Istio or Linkerd.
  • Policy engine experience such as OPA, Gatekeeper or Kyverno.
  • Cloud-native application deployment experience using Helm, Terraform, Kustomize or similar.
  • Storage experience such as Ceph, Longhorn, OpenShift Data Foundation or equivalent.
  • Networking experience including BGP, VXLAN, Palo Alto or Juniper technologies.
  • Software supply chain security experience, including SBOMs, image signing, admission control or tools such as Sigstore.
  • Experience operating AI, ML or GPU-enabled platforms.
  • CKA, CKAD, CKS, Red Hat certifications or equivalent.
  • Active or recent UK SC clearance.
  • Recognised open-source contributions to the Kubernetes ecosystem.

Soft Skills & Behaviours

  • Calm, structured and methodical under pressure.
  • Strong written and verbal communication skills.
  • Collaborative working style across platform, development, QA, security, networking and architecture teams.
  • Strong sense of ownership and accountability.
  • Automation-first mindset, with a focus on removing repeatable manual work.
  • Able to influence technical practice through evidence, example and credibility.
  • Pragmatic and solutions-focused approach to problem solving.
  • Curious about why systems fail, not just how to bring them back online.
  • Comfortable mentoring others and raising the technical capability of those around them.
  • Able to balance reliability, delivery pace, security and compliance in a regulated environment.

Benefits

  • Private Medical
  • Health Cash Plan
  • 4x Life Assurance
  • Inclusive Culture: Enjoy an inclusive culture and environment.
  • Holiday: Generous holiday allowance.
  • Learning: Access to continuous learning and development opportunities.
  • Bonus Potential: Bonus potential based on performance and business-related factors.
  • Discounts: Discounts on a wide range of products and services.
  • Pension: Pension scheme contributions.
  • EV Car Scheme
  • Regular Pay Reviews
  • More Benefits: Explore additional benefits on our career site.

Focus: Site Reliability Engineering
Seniority: Senior
Stack: CI/CD, Kubernetes, Python
Location: 1 country accepted