Baseten
Engineering Manager, Runtime Fabric
Remote Runtime Fabric role with a clear candidate location fit.
Published: Jun 9, 2026
Eligible countries: 1 country accepted
Seniority signal: Lead
Work model: Remote
Accepted candidate locations
United States
Role summary
Engineering Manager, Runtime Fabric
Requirements and responsibilities
Role content extracted into sections for faster review.
Details
- Recruit, hire, and develop a high-performing team of systems engineers with deep container and Linux expertise.
- Foster a culture of technical rigor, open-source contribution, and continuous improvement.
- Provide regular coaching, feedback, and career development support to your direct reports.
- Partner with engineering leadership to define the long-term vision and roadmap for container runtime and storage infrastructure.
- Guide the team in extending and hardening containerd, runc, and related OCI ecosystem projects to meet the GPU-specific requirements of production AI inference, including startup performance, GPU device access, and multi-tenant isolation.
- Oversee the architecture and evolution of the Baseten Delivery Network: the tiered caching and weight delivery system that makes cold starts 2–3x faster and eliminates thundering herd failures during burst scaling events.
- Drive the expansion of BDN's architecture, currently focused on model weights, to container images, training checkpoints, and deployment artifacts.
- Provide technical oversight on GPU-aware isolation mechanisms for multi-tenant inference, including secure container runtimes, Linux namespace hardening, and longer-term micro-VM integration.
- Ensure the team maintains end-to-end ownership of the container startup performance path, from snapshotter initialization through weight delivery to first inference request.
- Champion the team's contributions back to the open-source containerd ecosystem alongside a team of core maintainers.
- Act as the primary advocate for Runtime Fabric across the organization, ensuring upstream and downstream teams have the integration support they need.
- Collaborate with product and engineering stakeholders to prioritize investments based on business impact and infrastructure reliability.
- Communicate team progress, technical trade-offs, and architectural decisions clearly to leadership.
- Proven experience managing and growing engineering teams in a systems, infrastructure, or low-level runtime context.
- Deep familiarity with the Linux container ecosystem: containerd, runc, OCI Runtime Spec, Linux namespaces, and cgroups, with the ability to engage credibly in code reviews and architectural discussions.
- Contributions to containerd/containerd, opencontainers/runc, google/gvisor, kata-containers/kata-containers, or closely related open-source projects.
- Strong systems programming background in Go and/or C/C++.
- Experience with distributed storage systems, content-addressable storage, or large-scale caching infrastructure.
- Understanding of how container images are structured, stored, and delivered at scale.
- Strong written and verbal communication skills, with the ability to influence without authority across teams.
- Experience with GPU device access in containers: NVIDIA Container Toolkit, CDI (Container Device Interface), or GPU-aware scheduling.
- Familiarity with lazy-loading snapshotters (stargz, soci, EROFS/Nydus) or peer-to-peer image distribution.
- Experience with secure container runtimes (gVisor, Sysbox) or micro-VM technologies (Firecracker, Cloud Hypervisor).
- Understanding of containerd's shim API (v2) and experience building custom shim implementations.
- Background in multi-tenant infrastructure or security-sensitive serving environments.
- Competitive compensation, including meaningful equity.
- 100% coverage of medical, dental, and vision insurance for employees and dependents.
- Flexible PTO policy, including a company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
- Paid parental leave.
- Fertility and family-building stipend through Carrot.
- Company-facilitated 401(k).
- Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.