Veeam Software
Site Reliability Engineer III
Remote Site Reliability Engineering role with a clear candidate location fit.
Published: Jul 2, 2026
Eligible countries: 1 country accepted
Seniority signal: Senior
Work model: Remote
Accepted candidate locations
United States
Requirements and responsibilities
Discovery & Documentation
- Get up to speed on the full platform — all VDC workloads, dependencies, and risk areas. Much of this will happen through code, docs, and conversations rather than direct environment access.
- Work with SMEs across the org to fill knowledge gaps and build onboarding material for the team.
- Write and maintain runbooks, architecture docs, and operational guides.
Reliability & Incident Response
- Design infrastructure for high availability and fault tolerance on Azure (including Azure Government).
- Define SLIs, SLOs, and error budgets where none exist today.
- Run incident response and blameless postmortems. Turn incidents into improvements.
- Identify reliability risks across modern and legacy workloads and build practical remediation plans that work within compliance constraints.
Observability
- Close observability gaps — define instrumentation requirements and drive implementation.
- Set alerting, telemetry, and monitoring standards with partner teams.
- Build automation to reduce toil and support fleet management.
- Participate in on-call rotations.
Infrastructure & Delivery
- Work with IaC, CI/CD, deployment automation, and config management — including in air-gapped or compliance-restricted environments.
- Build and maintain testing, canary deployment, and release validation pipelines.
- Integrate chaos engineering and monitoring tools, adapting choices to meet regulatory requirements.
Collaboration
- Work across product, platform, security, legal, compliance, and operations teams.
- Own problems end-to-end — identify gaps, drive solutions, don't wait for direction.
- Mentor other engineers and help spread SRE practices across the org.
Technologies we work with
- Microsoft TFS, Azure DevOps, Git, Bitbucket
- Azure (Entra ID, API Management, Cosmos DB, Storage services, Azure Functions, static website hosting, Azure security, etc.)
- IaC tools (Azure Resource Manager (ARM) templates, AWS CloudFormation, Terraform, the Serverless Framework, etc.)
- Observability (Azure Monitor, Application Insights, Elastic Stack)
What You'll Bring
- 7+ years in Software Engineering, with 3+ years in SRE, Platform Engineering, or similar — across multi-service platforms, not just single-service environments.
- Experience with Government or Sovereign Cloud (e.g., Azure Government, AWS GovCloud).
- Experience in regulated compliance environments — government (FedRAMP, CMMC, IL2/IL4/IL5), financial (PCI-DSS, SOX), or healthcare (HIPAA, HITRUST). You understand how compliance shapes architecture and operations.
- Strong experience building and running production services on cloud infrastructure (Azure preferred, including Azure Government).
- Able to learn large, complex platforms quickly with limited guidance — comfortable building understanding from code, docs, and architecture artifacts when direct environment access is restricted.
- Can investigate systems independently and produce clear docs, risk assessments, and improvement plans.
- Comfortable working across teams — engineering, product, security, compliance, operations.
- Programming skills in one or more of: TypeScript/JS, Go, Java, C#, or similar.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry, ELK stack).
- Experience with IaC (Terraform, Terragrunt, Pulumi) and container orchestration (Kubernetes).
- Experience with CI/CD and GitOps tooling — GitHub Actions, Azure DevOps, GitLab CI, ArgoCD, FluxCD, or Dagger.
- Solid grasp of distributed systems, networking, and cloud-native architecture.
- Clear written and verbal communication skills.
Bonus Skills
- Experience on B2B SaaS platforms in regulated or government markets.
- Background in chaos engineering, resilience testing, or performance/load testing.
- Experience building an SRE or reliability function from scratch.
- Experience across mixed environments — modern cloud-native and older legacy systems.
- Familiarity with AI-first development workflows, including LLM-powered tools for infrastructure automation, code generation, and documentation.
Why Join?
- Build the GOV reliability practice from day one — your decisions will shape how this team works.
- Help define SRE at Veeam across a globally distributed engineering org.
- Work with strong teams across product, cloud engineering, security, and compliance.
- Professional development resources including mentorship, training, and volunteer days.
- Competitive compensation and benefits.
What you'll get
- Unlimited paid time off; 12 paid holidays, including 4 global VeeaMe Days for self-care; and 24 paid volunteer hours annually through Veeam Cares
- Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
- Medical, dental, and vision coverage starting on your first day
- Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
- 401(k) retirement plan with company matching contributions
- Fertility, adoption, and surrogacy support through Maven
- AirVet: 24/7 virtual veterinary care at no cost
- Legal services, identity protection, and supplemental health insurance options
- Tax-advantaged spending accounts for healthcare, dependent care, and commuting
- Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning