Role summary

Principal Observability & Reliability Architect

Requirements and responsibilities

Responsibilities

  • Lead client discovery, architecture workshops, and solution design across observability, telemetry, reliability, and operational intelligence initiatives.
  • Design enterprise observability architectures spanning monitoring, logging, metrics, tracing, telemetry pipelines, alerting, event correlation, service visibility, and platform integrations.
  • Define scalable standards for telemetry onboarding, naming, tagging, RBAC, service ownership, dashboards, alert governance, runbooks, and operational handoff.
  • Advise on telemetry governance, including data quality, retention, access control, sampling, cardinality, and cost optimization.
  • Lead modernization initiatives including tool rationalization, dashboard and alert rationalization, telemetry strategy, and migration from legacy monitoring platforms.
  • Guide SRE practices including SLIs, SLOs, error budgets, production readiness, and incident response maturity.
  • Design integration patterns across ITSM, CMDB, event management, and automation platforms.
  • Support pursuits by shaping solution strategy, validating scope, informing estimates, and building client-facing technical narratives.
  • Serve as a senior escalation point and provide architecture governance during delivery.
  • Build reusable reference architectures, playbooks, and accelerators while mentoring architects, consultants, and offshore teams.

Qualifications

  • 10+ years in observability, monitoring, APM, platform operations, SRE, or related enterprise technology domains, including 5+ years leading architecture and delivery strategy for enterprise observability or reliability initiatives.
  • Deep, hands-on experience designing and implementing solutions across monitoring, logging, metrics, tracing, telemetry collection, and pipeline patterns in hybrid and multi-cloud environments.
  • Strong knowledge of telemetry governance, including routing, transformation, normalization, enrichment, retention, access control, and cost management.
  • Experience defining enterprise standards for dashboards, alerts, tagging, naming, service ownership, RBAC, and operating model adoption.
  • Strong command of incident response, event correlation, alert strategy, service health, and business-service visibility, plus applied SRE concepts including SLIs, SLOs, error budgets, and production readiness.
  • Ability to lead executive and technical workshops and translate business needs into actionable architecture and delivery plans.
  • Consulting or professional services experience with strong client-facing communication, estimation, risk management, and cross-functional leadership.

Preferred Qualifications

  • Platform experience such as Dynatrace, Splunk, Grafana, LogicMonitor, Datadog, New Relic, AppDynamics, Elastic, Prometheus, or OpenTelemetry.
  • Experience with telemetry pipeline tools such as OpenTelemetry Collector, Grafana Alloy, Fluent Bit, Kafka, Cribl, or Vector, along with familiarity with cloud, Kubernetes, CI/CD, and infrastructure as code.
  • Experience integrating with platforms such as ServiceNow, Jira Service Management, PagerDuty, Opsgenie, BigPanda, or xMatters.
  • Experience developing reusable consulting assets such as reference architectures, governance models, playbooks, POVs, and accelerators; relevant cloud, SRE, ITIL, or FinOps certifications are a plus.

Focus (role area): Observability
Seniority signal (candidate level): Senior
Stack (main skills): CI/CD, Kubernetes
Location (eligibility): 1 country accepted
