Role summary

Staff Site Reliability Engineer

Requirements and responsibilities

What your impact will be

  • Lead the development of Domino's internal AI-assisted reliability tooling, including systems that analyze tickets, logs, traces, and documentation to help teams resolve outages faster with less recurring toil
  • Improve the observability coverage and signal quality for our most critical customer-facing systems, so engineers have more to work with throughout the development and support lifecycle
  • Own incident response end-to-end, from detection to remediation, and leave each problem space better documented, better understood, and less likely to recur
  • Guide the development of customer- and user-facing observability tools within our products
  • Define and mature SLO/SLI frameworks for priority services, turning abstract reliability goals into measurable, actionable standards
  • Scale cloud operations practices for Domino’s single-tenant SaaS offering, and work with engineering teams to improve the reliability and repeatability of customer deployments and upgrades
  • Mentor other engineers and shape how SRE is practiced at Domino, including incident response workflows, operational readiness expectations, and post-incident learning culture

What we look for in this role

  • Deep experience in Site Reliability Engineering, platform engineering, or a software engineering role with genuine, hands-on operational ownership
  • Fluency with Kubernetes, Linux, cloud platforms, and observability tooling, and the ability to use them to investigate complex, real-world production problems
  • A strong ability to identify and close reliability gaps in technical products, tools, and processes
  • Strong software engineering skills in Python or Go, with a track record of building internal tools or services that people actually rely on
  • Comfort leading technically ambiguous work and influencing direction across teams without needing direct authority to get things done
  • A history of improving reliability through engineering and automation, not just putting out fires manually
  • Strong communication skills and real experience mentoring engineers or shaping technical decision-making on your team
  • Sound judgment about AI/LLM tooling: you know where it genuinely helps in operational workflows and where it adds noise instead of signal
  • Bonus: Experience with LLM-based systems, retrieval workflows, SaaS platform operations, or building tooling for support or developer teams

What we value

  • We strongly believe in the value of growing a diverse team and encourage people of all backgrounds, genders, ethnicities, abilities, and sexual orientations to apply
  • We value a growth mindset: high-performing, creative individuals who dig into problems and see the opportunities for success
  • We believe in individuals who seek and speak the truth and can be their whole selves at work
  • We value everyone who believes that improving is always possible. At Domino, everything is a work in progress – we can do better at everything
  • We emphasize an environment of teaching and learning to equip employees with the tools they need to succeed in their function and at the company

Focus (role area): Site Reliability Engineering
Seniority signal (candidate level): Senior
Stack (main skills): Kubernetes, LLM, Python
Location (eligibility): 1 country accepted
