Role summary

Principal Software Engineer-SRE

Requirements and responsibilities

Details

  • Lead design, implementation, and evolution of reliability, availability, and resiliency strategies for large‑scale distributed systems written primarily in Java
  • Apply deep experience operating complex, distributed systems to guide architectural decisions, reliability strategies, and long‑term system evolution
  • Identify systemic risks in application architecture, data flows, and infrastructure, and drive architectural improvements that measurably improve availability, performance, and scalability
  • Set and evolve reliability standards, best practices, and operational principles across R&D
  • Lead efforts to prevent, detect, and mitigate incidents through technical improvements and operational maturity
  • Serve as a senior coordination point during major incidents, helping manage response and guide long‑term remediation
  • Champion blameless post-incident reviews and ensure learnings translate into durable system improvements
  • Apply advanced software engineering practices to eliminate manual work, reduce operational load, and improve system observability
  • Design and build internal platforms, automation, and tooling that support Java‑based services and their operational needs
  • Raise the bar on monitoring, alerting, and SLO/SLI adoption across systems
  • Partner deeply with product engineers, architects, and engineering leadership to ensure reliability and operability are first‑class concerns in system design
  • Review and influence designs for complex systems involving technologies such as datastores, messaging systems, and coordination services
  • Serve as a technical mentor and coach for SREs and other engineers, raising overall engineering and operational maturity
  • Contribute to longer‑term reliability and infrastructure strategy aligned with business growth
  • Stay current with industry trends in SRE, distributed systems, and the Java ecosystem, turning insights into practical improvements
  • Help define what “great reliability” looks like for the organization and how we measure it
  • US citizens or permanent residents only, due to ITAR requirements
  • Ability to work East Coast (EST) hours and be available for an on-call rotation once every 10 weeks
  • 10+ years of experience in software engineering, site reliability engineering, or systems engineering roles
  • Extremely strong proficiency with the Java programming language and its ecosystem, including building, debugging, and operating production Java services
  • Deep experience operating complex, distributed systems in production environments
  • Strong software engineering background, with a track record of delivering high‑quality, maintainable code
  • Expert understanding of incident management, service reliability, and performance engineering
  • Strong hands‑on experience with observability (metrics, logs, traces), capacity planning, and SLO‑driven reliability
  • Deep familiarity with modern cloud‑based infrastructure, CI/CD pipelines, and infrastructure‑as‑code practices
  • Ability to reason about failure modes across application, data, and infrastructure layers
  • Demonstrated ability to lead complex initiatives that span teams and organizational boundaries
  • Comfortable making high‑impact technical decisions in ambiguous environments
  • Strong communicator who can influence design and operational decisions across a wide range of stakeholders
  • Systems thinker focused on root‑cause analysis and durable fixes
  • Calm and effective under pressure, especially during high‑severity incidents
  • Curious, data‑driven, and committed to continuous improvement
  • Experience operating or supporting systems using technologies such as MongoDB, ZooKeeper, and RabbitMQ
  • Background in performance tuning and scalability optimization of Java services
  • Experience setting or influencing engineering standards at the organization level
  • Prior involvement in evolving SRE or platform practices in a growing engineering organization
  • Experience designing, operating, or scaling systems in cloud environments such as AWS (preferred), including familiarity with core services, networking models, and reliability features

Focus: Site Reliability Engineering
Seniority signal: Senior
Stack: AWS, CI/CD, Java
Location: 1 country accepted