Railway
Senior Product Engineer, Scalability
Remote Engineering role with a clear candidate location fit.
Posted: June 11, 2026
Eligible countries: 1 country accepted
Seniority signal: Senior
Work model: Remote
Accepted candidate locations
United States
Job summary
Senior Product Engineer, Scalability
Requirements and responsibilities
About the role
- Architect and scale the pipelines that turn raw usage into accurate, real-time billing — metering, aggregation, rating, and invoicing across millions of events, from ingestion in ClickHouse to the rating engine.
- Build payment flows that are correct under concurrency and partial failure: idempotent charges, retries, reconciliation, and clean handling of provider edge cases (Stripe and beyond).
- Develop fraud and abuse detection — signal collection, real-time scoring, automated mitigation — that protects platform margin without getting in legitimate users' way.
- Scale the systems everything else depends on: Postgres under heavy write load, Node.js services under pressure, and long-running workflows orchestrated with Temporal where exactly-once semantics and durability actually matter.
- Build TypeScript + GraphQL APIs where correctness and auditability are non-negotiable.
- Write Engineering Requirement Documents to take something from idea, to defined tasks, to implementation, to monitoring its success and scaling it further.
- Contribute to our open-source repositories (CLI, TypeScript SDK, Railpack, etc.); Rust experience, or the desire to learn it, helps here.
- Be on call from time to time.
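To make the "metering, aggregation, rating" step above concrete, here is a minimal sketch of deduplicating and rating usage events. It assumes at-least-once delivery, so the same event can arrive twice. The names `UsageEvent`, `aggregateUsage`, and `rateUsage`, and the price used in the example, are hypothetical and are not taken from Railway's codebase.

```typescript
// Hypothetical event shape: one record per metering sample.
type UsageEvent = { eventId: string; workloadId: string; cpuSeconds: number };

// Sum CPU seconds per workload, skipping events whose ID was already seen.
// Deduplicating on eventId keeps redelivered events from inflating the bill.
function aggregateUsage(events: UsageEvent[]): Map<string, number> {
  const seen = new Set<string>();
  const totals = new Map<string, number>();
  for (const event of events) {
    if (seen.has(event.eventId)) continue; // redelivery: ignore
    seen.add(event.eventId);
    const current = totals.get(event.workloadId) ?? 0;
    totals.set(event.workloadId, current + event.cpuSeconds);
  }
  return totals;
}

// Rate usage in integer micro-cents to avoid floating-point money errors.
function rateUsage(cpuSeconds: number, microCentsPerCpuSecond: number): number {
  return cpuSeconds * microCentsPerCpuSecond;
}

const sampleEvents: UsageEvent[] = [
  { eventId: "e1", workloadId: "w1", cpuSeconds: 30 },
  { eventId: "e2", workloadId: "w1", cpuSeconds: 30 },
  { eventId: "e1", workloadId: "w1", cpuSeconds: 30 }, // duplicate delivery
  { eventId: "e3", workloadId: "w2", cpuSeconds: 10 },
];
const sampleTotals = aggregateUsage(sampleEvents);
console.log(sampleTotals.get("w1")); // 60
console.log(rateUsage(sampleTotals.get("w1") ?? 0, 5)); // 300
```

An in-memory `Set` only works for a single batch. At the scale the posting describes, the deduplication key would more plausibly live in the storage layer, which is a design choice this sketch does not cover.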
About the role
- Re-architect billing end-to-end: per-second usage metering at platform scale, idempotent payment processing that survives provider outages without double-charging, and credit, prepayment, and enterprise-invoicing models that hold up under audit.
- Stand up a fraud-detection service that scores signups and deployments in real time and automatically throttles abuse (crypto mining, free-tier farming, stolen cards).
- Scale our Temporal workloads to orchestrate workflows across millions of deployments.
- Build internal tooling that gives teams across Railway a trustworthy, real-time view into the systems they depend on.
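As an illustration of "idempotent payment processing that survives provider outages without double-charging", the sketch below retries a failing provider call under one idempotency key and replays the stored result on repeat requests. It is a simplified, in-memory model under stated assumptions: `IdempotentCharger`, `ChargeRequest`, and the `provider` callback are hypothetical names, and the real provider is assumed to honor the same key on its side (Stripe, for example, accepts an idempotency key on requests).

```typescript
// Hypothetical request and result shapes, not any provider's actual API.
type ChargeRequest = { idempotencyKey: string; customerId: string; amountCents: number };
type ChargeResult = { chargeId: string; amountCents: number };
type Provider = (req: ChargeRequest) => ChargeResult;

class IdempotentCharger {
  // Maps idempotency key to the result of the first successful attempt.
  private readonly completed = new Map<string, ChargeResult>();
  private readonly provider: Provider;

  constructor(provider: Provider) {
    this.provider = provider;
  }

  charge(req: ChargeRequest, maxAttempts = 3): ChargeResult {
    const prior = this.completed.get(req.idempotencyKey);
    if (prior) return prior; // replay: return the stored result, no new charge

    let lastError: unknown = new Error("no attempts made");
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        // Every retry reuses the same key, so the provider can deduplicate too.
        const result = this.provider(req);
        this.completed.set(req.idempotencyKey, result);
        return result;
      } catch (err) {
        lastError = err; // treated as transient here; retry
      }
    }
    throw lastError;
  }
}

// Simulated provider that fails once, then succeeds.
let providerCalls = 0;
const flakyProvider: Provider = (req) => {
  providerCalls++;
  if (providerCalls === 1) throw new Error("simulated outage");
  return { chargeId: `ch_${providerCalls}`, amountCents: req.amountCents };
};

const charger = new IdempotentCharger(flakyProvider);
const invoiceCharge = { idempotencyKey: "inv-1", customerId: "c1", amountCents: 500 };
const firstResult = charger.charge(invoiceCharge);
const replayResult = charger.charge(invoiceCharge);
console.log(firstResult.chargeId === replayResult.chargeId); // true
console.log(providerCalls); // 2: one failed attempt, one success, no call on replay
```

A production version would persist the key-to-result mapping durably and distinguish retryable errors from permanent ones. Both are left out here to keep the sketch short.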
About you
- An ability to autonomously lead, design, and implement backend systems where correctness, consistency, and auditability are first-class requirements.
- A track record of scaling systems — you've taken a pipeline, service, or database that was falling over and made it handle 10x, and you know which tools to reach for (and when polling stops being enough).
- Deep expertise in Postgres and relational data modeling — you reach for the right consistency guarantees, understand the cost of getting them wrong, and know how Postgres itself behaves at scale.
- Strong working knowledge of Node.js internals — the event loop, memory behavior, and what to do when a service degrades under load.
- Experience managing complex asynchronous and long-running backend jobs, ideally with a workflow engine like Temporal, for things like billing runs or payment reconciliation.
- Familiarity with the realities of money movement: payment providers, idempotency, retries, reconciliation, and their failure modes. Direct billing, payments, or fraud experience is a strong plus.
- A security and abuse-aware mindset — you instinctively think about how a system can be gamed, and you design accordingly.
- A desire to be a part of the entire project development process, from research gathering and planning, to implementation and monitoring.
- Great written and verbal communication skills for expressing ideas, designs, and potential solutions in a mostly-asynchronous manner.
Things to know
- We're globally distributed—and getting more so. Stuff is always happening somewhere.
- We don't expect you to be online all the time, but you'll need to be diligent about your boundaries — your end of day will overlap with someone else's start.
- We're a small, high-ownership team that cares deeply about doing exceptional work. We're scaling quickly, which means we rely on leverage—systems over coordination, judgment over process. Expect ambiguity and a fast-moving environment.
- You'll own real outcomes. That means making decisions, not just executing—and owning the success, or failure, that comes with them.
Benefits and perks
- Autonomy: We have very few meetings. Just a Monday and a Friday to go over the Company Board. We think your time is sacred, whether it's at work, or outside of work.
- Ownership: We're a company with a high ownership, high autonomy culture. We hope that you'll come in, help us, and over the course of many years do the best work of your life. When we bring you onboard, we expect you to change the company.
- Novel problems/solutions: We're a well-funded startup with cool problems, which lets us implement novel solutions! We abhor "busywork" and think that, whether it's community, engineering, operations, etc., there's always an opportunity for creative, high-leverage solutions.
- Growth: We want you to grow with us, but we know that talent is loaned, so when you figure out what area you want to grow in next, whether it's at Railway or outside, we'll make sure you land there.
How we hire
- A usage metering and billing pipeline that meters CPU/RAM for millions of workloads and bills accurately (you may depend on third parties such as Stripe), or
- A stream-processing system that ingests high-cardinality observability events in real time, or
How we hire
- Polling vs. stream processing, and how you avoid losing data when streaming
- Correctness under concurrency and partial failure: idempotency, retries, reconciliation, what happens when a step fails halfway through
- How you handle cardinality, and which tools you lean on and why
- The scalability of the things you depend on — what happens when Postgres becomes the bottleneck
- Interview structure to expect (60 minutes):
- Prework (submitted before your interview): Your design
- 0–5 minutes: Introductions
- 5–35 minutes: Walking through the design and how you'd extend it — new failure modes, 10x load, a fraud signal
- 35–50 minutes: Noodling on technology, data modeling, and how you think about scale, money-movement, and abuse
- 50–60 minutes: Time for you to ask your interviewers questions