Lucidya

Site Reliability Engineer

Remote Site Reliability Engineer role with a clear candidate location fit.

Published: 20 Jun 2026

Eligible countries: 1 country accepted

Seniority signal: Senior

Work model: Remote

Accepted candidate locations

Saudi Arabia

AWS Azure CI/CD Docker GCP Kubernetes Python React

Source last updated: 20 Jun 2026

Application path: Company site

Role summary

Site Reliability Engineer

Requirements and responsibilities

You’ll make reliability the default

You’ll design and maintain infrastructure that is highly available, fault-tolerant, and scalable
You’ll proactively identify and eliminate single points of failure before they become incidents
You’ll ensure our production systems remain stable, even under increasing scale and load

You’ll own and optimize our cloud environments

You’ll manage and continuously improve workloads across AWS, GCP, or Azure
You’ll use Infrastructure as Code (Terraform) to standardize and scale infrastructure
You’ll optimize resource usage to balance performance and cost

You’ll run and improve Kubernetes in production

You’ll operate and scale Kubernetes clusters (EKS, GKE, etc.) with confidence
You’ll troubleshoot issues quickly and ensure smooth deployments and upgrades
You’ll ensure our containerized workloads perform reliably at scale

You’ll build observability and lead incident response

You’ll implement and refine monitoring systems using tools like Prometheus, Grafana, Datadog, or ELK
You’ll define alerting that is meaningful, not noisy
You’ll respond to incidents, lead root cause analysis, and ensure we learn from every failure

You’ll automate repetitive operational work

You’ll write scripts and build tooling to eliminate repetitive operational work
You’ll continuously improve infrastructure efficiency through automation
You’ll promote a culture where manual work is a temporary state, not the norm

You’ll collaborate to improve the entire system

You’ll work closely with DevOps and engineering teams to solve performance bottlenecks
You’ll contribute to CI/CD improvements and deployment reliability
You’ll help shape reliability best practices across the organization

First 30 days:

You’ve built a strong understanding of our infrastructure, systems, and workflows
You’re contributing to day-to-day operations with support from the team
You’ve started identifying areas for improvement in automation and reliability

By 90 days:

You’re independently managing infrastructure tasks and troubleshooting issues
You’re actively contributing to reliability and scalability improvements
You’ve taken ownership of parts of our infrastructure and are improving them

Who You Are

You’ve spent ~3 years working in SRE, DevOps, or infrastructure engineering, and you’ve seen what breaks at scale
You’re comfortable working in cloud environments like AWS, GCP, or Azure—and you understand how distributed systems behave
You’ve worked hands-on with Kubernetes in production and know how to troubleshoot it when things go wrong
You don’t just fix issues - you ask why they happened and make sure they don’t happen again