Team

An R&D-led organization.

The research lab and the runtime engineering team share a manager. Backgrounds span PhD-level acceleration research, distributed systems, hardware co-design, and enterprise security. The same engineers who ship the runtime answer the on-call page.

Composition

Where the headcount lives.

Research lab

Inference acceleration, long-context engines, mixed-precision, tiered KV-cache. Publishes at NeurIPS, ICML, MLSys. Several members are co-appointed faculty at partner universities.

~30% of headcount

Runtime engineering

The scheduler, the inference engine, the multi-tenant control plane. CUDA, ROCm, NCCL, Triton, vLLM internals. The team that ships papers as products.

~25% of headcount

Cluster & SRE

Bare-metal provisioning, InfiniBand fabric, BGP at the edge, capacity planning across seven regions. Built and now operates the production fleet on NVIDIA / Dell / Supermicro hardware in Cologix facilities.

~20% of headcount

Customer engineering

Named cluster engineers paired to reserved-tier accounts. Distributed training, fine-tuning, inference performance tuning. Lives in customer Slacks and is paged on the same on-call rotation as runtime.

~15% of headcount

Trust & compliance

SOC 2 Type II, ISO 27001, HIPAA. Maintains the trust center, runs audits, owns the security posture. Backgrounds in cloud security and regulated-industry infrastructure.

~5% of headcount

GTM & operations

Reserved-capacity sales, partner program, finance, legal. Built around the assumption that customers buy clouds for years: every contract is written to be renewed, not renegotiated.

~5% of headcount

Operating principle

Two rules of the org chart.

Researchers and runtime engineers share a manager.

Every paper has a runtime owner from week one. Every shipped optimization has a researcher who co-authored the underlying work. The two functions read each other's pull requests.

The on-call rotation is one rotation.

Customer engineering, runtime, cluster, and SRE share a single P1 page. Customers reach the people who shipped the code that broke. There is no support tier gated from engineering.

Hiring

Open roles.

We hire deliberately. Most roles are senior. Most are research-adjacent. We post a small number of openings, for a long time, until we find the right person — not the next person.

See open positions

Working with us

Want to work here, or with us?

Customer engagements start with a single scoping call. Career applications get a same-week reply from a hiring manager, not an inbox bot.