Lucidya
Site Reliability Engineer
Remote Site Reliability Engineer role with a clear candidate location fit.
Published: 20 Jun 2026
Eligible countries: 1 country accepted
Seniority signal: Senior
Work model: Remote
Accepted candidate locations: Saudi Arabia
Requirements and responsibilities
You’ll make reliability the default
- You’ll design and maintain infrastructure that is highly available, fault-tolerant, and scalable
- You’ll proactively identify and eliminate single points of failure before they become incidents
- You’ll ensure our production systems remain stable, even under increasing scale and load
You’ll own and optimize our cloud environments
- You’ll manage and continuously improve workloads across AWS, GCP, or Azure
- You’ll use Infrastructure as Code (Terraform) to standardize and scale infrastructure
- You’ll optimize resource usage to balance performance and cost
You’ll run and improve Kubernetes in production
- You’ll operate and scale Kubernetes clusters (EKS, GKE, etc.) with confidence
- You’ll troubleshoot issues quickly and ensure smooth deployments and upgrades
- You’ll ensure our containerized workloads perform reliably at scale
You’ll build monitoring and lead incident response
- You’ll implement and refine monitoring systems using tools like Prometheus, Grafana, Datadog, or ELK
- You’ll define alerting that is meaningful, not noisy
- You’ll respond to incidents, lead root cause analysis, and ensure we learn from every failure
You’ll automate repetitive operational work
- You’ll write scripts and build tooling to eliminate repetitive operational work
- You’ll continuously improve infrastructure efficiency through automation
- You’ll promote a culture where manual work is a temporary state, not the norm
You’ll collaborate to improve the entire system
- You’ll work closely with DevOps and engineering teams to solve performance bottlenecks
- You’ll contribute to CI/CD improvements and deployment reliability
- You’ll help shape reliability best practices across the organization
First 30 days:
- You’ve built a strong understanding of our infrastructure, systems, and workflows
- You’re contributing to day-to-day operations with support from the team
- You’ve started identifying areas for improvement in automation and reliability
By 90 days:
- You’re independently managing infrastructure tasks and troubleshooting issues
- You’re actively contributing to reliability and scalability improvements
- You’ve taken ownership of parts of our infrastructure and are improving them
Who You Are
- You’ve spent ~3 years working in SRE, DevOps, or infrastructure engineering, and you’ve seen what breaks at scale
- You’re comfortable working in cloud environments like AWS, GCP, or Azure—and you understand how distributed systems behave
- You’ve worked hands-on with Kubernetes in production and know how to troubleshoot it when things go wrong
- You don’t just fix issues - you ask why they happened and make sure they don’t happen again
Technically, you likely:
- Use Terraform (or similar IaC tools) to manage infrastructure
- Work confidently with Docker and Kubernetes
- Write scripts in Python, Bash, or similar to automate workflows
- Understand CI/CD pipelines (Jenkins, GitHub Actions, Bitbucket Pipelines, etc.)
- Have a solid grasp of networking, load balancing, and high-availability design
When it comes to monitoring:
- You’ve implemented tools like Prometheus, Grafana, Datadog, or ELK
- You know the difference between useful alerts and noise
- You focus on signals that actually drive action
What sets you apart:
- You take ownership - you don’t wait to be told something is broken
- You’re calm under pressure and methodical during incidents
- You simplify complexity instead of adding to it
- You communicate clearly, even when explaining deeply technical issues
- You care about building systems that make other engineers more effective
Nice to Have (but not required)
- Experience with RabbitMQ or Redis in production
- Familiarity with Ansible or AWX
- Exposure to multi-cloud or hybrid environments
- Cloud certifications (AWS, GCP) or Linux certifications
- Background from ITI (Information Technology Institute)
What the hiring process will look like
- Screening Interview – Talent Acquisition
- Technical Interview – SRE Lead
- Technical Task
- Final Interview – SRE Lead & Cloud DevOps Director