Embrace Software Inc
Platform Engineer (AWS)
Remote Platform Engineering role with a clear candidate location fit.
Published: June 19, 2026
Eligible countries: 1 country accepted
Seniority signal: Senior
Work model: Remote
Accepted candidate locations
United States
Job summary
Platform Engineer (AWS)
Requirements and responsibilities
Job content extracted into sections for faster review.
Details
- Own and evolve the AWS infrastructure that underpins our multi-tenant SaaS platform.
- Design, provision, and manage production-grade AWS services including EC2, S3, RDS, ECR, VPC, IAM, CloudFront, Route 53, and EKS/ECS clusters.
- Implement and maintain Infrastructure as Code (IaC) using Terraform or CloudFormation to ensure repeatable, version-controlled, and auditable environments across development, staging, and production.
- Architect and optimize PostgreSQL infrastructure including automated backups, replication, failover strategies, and performance tuning for high-throughput transactional workloads.
- Drive high availability, disaster recovery planning, scalability, and cloud cost optimization initiatives across the platform.
- Contribute to infrastructure standards, platform governance, and operational best practices.
- Build and maintain delivery pipelines that enable rapid, safe, and reliable deployments.
- Design and operate CI/CD workflows for Python (Django/Flask/FastAPI) and React applications across multiple services.
- Automate build, test, deployment, and rollback workflows using GitHub Actions, GitLab CI, Jenkins, or equivalent tooling.
- Implement deployment strategies including blue-green, canary, and rolling deployments to reduce production risk.
- Manage artifact repositories, container registries (ECR), and deployment manifests with full traceability and rollback support.
- Improve developer workflows and deployment automation to increase engineering velocity and platform reliability.
- Design, operate, and optimize our container orchestration platform for scalability, reliability, and tenant isolation.
- Manage Docker-based development and production environments, including image hardening and registry governance.
- Implement and maintain Kubernetes (EKS) or ECS infrastructure for scalable application deployments.
- Define and maintain Helm charts, Kubernetes manifests, and environment-specific deployment configurations.
- Enforce networking policies, namespace isolation, resource quotas, and workload security standards.
- Support platform scalability, cluster health, autoscaling, and operational resilience.
- Build and maintain monitoring, alerting, and observability systems using CloudWatch, Datadog, Prometheus, Grafana, or similar tooling.
- Implement centralized logging and audit trail solutions across application and infrastructure layers.
- Define operational standards for incident response, alerting, reliability, and system health monitoring.
- Enforce infrastructure security best practices including secrets management, IAM least-privilege access, network segmentation, and certificate management.
- Support compliance initiatives including SOC 2 and HIPAA through infrastructure controls, audit readiness, and vulnerability management.
- Lead incident response, root cause analysis, and blameless postmortem reviews.
- Partner with engineering teams to improve deployment reliability, operational efficiency, and developer experience.
- Troubleshoot infrastructure, deployment, networking, and performance issues across environments.
- Author and maintain infrastructure documentation, architecture diagrams, operational runbooks, and deployment playbooks.
- Mentor team members on platform engineering, infrastructure-as-code practices, operational excellence, and cloud-native tooling.
- Contribute to long-term platform scalability, automation, and engineering enablement initiatives.
- 5+ years of progressive DevOps/SRE experience in SaaS or enterprise environments.
- Infrastructure as Code using Terraform (AWS provider, modules, multi-environment state management).
- AWS core services: EKS, ECR, RDS, VPC, IAM, CloudWatch, ALB, EFS, S3, CloudFront, Route 53.
- Kubernetes administration: Helm charts, pods, deployments, services, kubectl, autoscaling.
- Docker containerization including multi-stage builds and registry operations.
- CI/CD pipelines: AWS CodeBuild, GitHub Actions, GitLab CI, or Jenkins.
- PostgreSQL production management: backup automation, replication, monitoring, performance tuning.
- Linux systems administration (Ubuntu/Amazon Linux) and shell scripting proficiency.
- Networking fundamentals: DNS, load balancing, TLS/SSL, firewall rules, VPN configurations.
- Monitoring and observability: Datadog, Fluent Bit, CloudWatch Logs.
- Security: AWS Secrets Manager, ACM certificates, security groups, IAM policies.
- Application stack: Django, Celery, Redis, PostgreSQL, Nginx.
- Git workflows, branching strategies, and pull request review processes.
- Strong problem-solving skills with a proactive, ownership-driven approach.
Good-to-Have Skills
- Advanced AWS services: AWS Backup, Lambda, SNS, EventBridge.
- Advanced Kubernetes: EFS CSI driver, AWS Load Balancer Controller, Cluster Autoscaler.
- Python scripting for infrastructure automation and operational workflows.
- Multi-tenant SaaS architecture, tenant isolation strategies, and data partitioning.
- Third-party service integration (SendGrid, Twilio) at the infrastructure level.
- FinOps practices: cloud cost management, reserved/spot instance optimization.
- Compliance frameworks (SOC 2 Type II, HIPAA) and required infrastructure controls.
- Service mesh technologies (Istio, Linkerd) or API gateway solutions.
- Cluster management tools like Rancher.
- Database disaster recovery: snapshots, cloning, multi-region considerations.
- Container security scanning and ClamAV integration.
- Infrastructure documentation and multi-environment workflows (dev → stg → prod).
- AWS certifications (Solutions Architect, DevOps Engineer Professional).
Benefits
- Competitive salary commensurate with experience.
- Opportunities for career advancement and professional development.
- Experience collaborating with a diverse, global team within a remote work setting.
Location eligibility
Candidates should apply only if their profile country is among the accepted locations (United States).
Hiring flow
WithMira displays the job listing and then sends candidates to the company's own application page.
1. Check the job fit, stack, and location eligibility on WithMira.
2. Open the company's application page through the tracked link.
3. Save the job or subscribe to similar opportunities before leaving.