Research lab
Distributed Training Researcher
We run a small in-house lab and publish on multi-thousand-GPU pre-training, long-context, and inference acceleration. You'll lead one to two papers per year with a co-author or two and full compute backing on B200 / B300.
The team
About the team
The research lab sits between the runtime team and the customer-facing teams. It exists to make the platform faster, cheaper, and longer-context, in writing. Every paper has at least one engineer co-author and at least one production-impact follow-up.
Reports to the lab director. Light on-call presence (one shadow rotation per year, learning-only).
The role
What you'll do
Lead one to two papers per calendar year as first or co-first author. Pick the question; we will fund the compute.
Run experiments at the 256-to-2048-GPU scale on B200 / B300 and on long-context regimes up to 1M tokens.
Translate findings into runtime PRs with the inference-acceleration team. The bar is that every paper has a follow-up that lands in customer production.
Mentor one to two early-career researchers per year, including PhD students from partner universities (Carnegie Mellon, ETH Zurich, EPFL, Princeton).
Represent the lab at major venues — NeurIPS, ICML, ICLR, MLSys, ASPLOS — and on at least one open benchmark per quarter.
Hold publication veto: we will not delay or edit your paper. You publish whenever and wherever you want.
The bar
What we're looking for
PhD in CS / ECE / applied math, or equivalent published track record (3+ first-author papers at top ML / systems venues).
Demonstrated experience designing and running experiments at thousand-GPU scale.
Strong PyTorch; comfort with one of FSDP, DeepSpeed, or Megatron-LM at production scale.
Empirical, write-up-first style. You publish negative results when they're useful.
Willingness to ship code that runs in customer production. The lab is not a sandbox.
Bonus
Nice to have, not required
Existing co-appointment or close ties with a research university.
Hands-on with long-context architectures (ring / paged attention, retrieval).
Open-source maintainership on a relevant repo.
Experience supervising PhD students.
Compensation
In writing, like everything else
We publish bands. We meet them. The number you see on the offer is the same number your future peers got at the same level. We do not negotiate; we level.
$240,000 – $380,000 USD (US) / equivalent in EU.
Research-track equity grant, comparable to a senior staff engineer's at the same level.
Co-appointments with partner universities are supported up to 20% time, paid pro-rata.
How to apply
One email is enough
Send a short note to careers@iframe.ai with the role title in the subject line. Include your CV or LinkedIn, one or two links to work you're proud of, and a sentence on why this role specifically. Hiring managers reply within five business days, regardless of outcome.
- 01
Application
A hiring manager reads every email. We reply within five business days.
- 02
Manager call
30–45 minutes. Scope, role, mutual fit. We share the comp band on this call.
- 03
Technical loop
3–4 sessions on the same day. Real problems, no homework, no whiteboard riddles.
- 04
Offer
Same-week offer at the published band for your level. Start dates are flexible.
Also open
Other roles you might consider
- Runtime
Inference Acceleration Engineer
Ship the next 2× on the open-source model catalog. Triton, CUDA, ROCm. You will publish what you ship.
- Cluster & SRE
Cluster Site Reliability Engineer
Bring up B300 racks, drive InfiniBand fabric to spec, run capacity planning across seven regions. Pager included.
- Customer engineering
Customer Cluster Engineer
Embedded with a small portfolio of reserved-tier accounts. Distributed training perf, NCCL, kernel tuning.
One last thing
If this role isn't quite right but you'd be a fit at iframe.ai, write anyway.
Senior engineers and researchers can apply outside the listed roles. The bar is the same. The reply window is the same.