Job summary

Site Reliability Engineer

Requirements and responsibilities

Details

Responsibilities

  • Design and execute a comprehensive infrastructure strategy that proactively supports evolving business requirements and operational excellence.
  • Own the predictable delivery of high-complexity technical solutions through deep automation using Kubernetes and sophisticated CI/CD pipelines.
  • Maintain superior portal availability and system health by implementing advanced observability and distributed tracing strategies.
  • Lead high-severity incident response efforts and drive systemic improvements through insightful, blameless postmortem analysis.
  • Architect failure-resilient and self-healing infrastructure systems to ensure continuous operational stability and zero data loss.
  • Serve as the internal subject matter expert to influence software architecture decisions toward maximum scalability and performance.
  • Facilitate regular knowledge-sharing and training sessions to elevate technical standards and process predictability across the entire technology department.
  • Direct security initiatives and design secure networking strategies to maintain a high-standard protection framework for all client data and assets.

Requirements

  • 4–7 years of professional experience building and managing resilient, modern infrastructure within a fast-paced environment.
  • Expert-level proficiency in managing and troubleshooting Linux-based servers across multiple distributions.
  • Advanced capability in developing modular, reusable infrastructure templates using tools such as Terraform and Ansible.
  • Proven success in managing containerized workloads at scale using Kubernetes and Helm.
  • Extensive experience configuring and optimizing high-performance database environments, specifically MySQL.
  • Demonstrated ability to build robust, secure CI/CD deployment pipelines that include automated rollback and quality gates.
  • Strong technical documentation skills, including the creation of architectural diagrams, detailed specifications, and operational playbooks.
  • Ability to lead cross-functional projects independently while mentoring junior engineers and driving team-wide initiatives.
  • Deep understanding of observability platforms such as New Relic, Datadog, or Prometheus to measure and improve system reliability.
  • Expertise in designing secure cloud networking strategies including firewalls, VPNs, and identity management best practices.
  • Advanced scripting and programming proficiency in Python or similar languages to automate complex operational workflows.
  • Strategic insight into infrastructure ROI and the ability to align technical roadmaps with broad business priorities.
  • Practical knowledge of disaster recovery planning and the execution of failure-resilient system designs.

Focus (job area): Site Reliability Engineering
Seniority signal (candidate level): Senior
Stack (main skills): CI/CD, Kubernetes, Python
Location (eligibility): 1 country accepted

Location eligibility

Candidates should apply only if their profile country is one of those accepted for this job.

Hiring process

WithMira shows the job listing, then sends candidates to the company's application page.

1. Check the job fit, stack, and location eligibility on WithMira.
2. Open the company's application page through the tracked link.
3. Save the job or subscribe to similar opportunities before leaving.