Case studies
Real teams. Real architectures. Real numbers.
Each case study below was written with the customer's engineering lead. Architectures are real, metrics were measured before and after deployment, and each customer reviewed the page before publication.
By sector
Browse the catalog.
- Robotics: Northstar Robotics
Pre-training a 70B world model on 4,096 B200s with sub-1% scaling loss.
A multi-week pre-training run on bare metal with InfiniBand NDR and atomic checkpointing. The cluster ran cleanly across two failure-domain reschedules.
  - B200 GPUs: 4,096
  - Scaling loss: 0.7%
  - Failed checkpoints: 0
- Financial: LedgerMind
Compliance-grade long-context retrieval for trillion-token discovery.
Long-context endpoints replaced a brittle RAG stack, with auditable citations back to source pages and no embedding pipeline to maintain.
  - Token context: 1.0M
  - Citation fidelity: 97%
  - Faster review: 5×
- Public sector: Civic Signals
FedRAMP-aligned isolated region for state-government workloads.
A dedicated single-tenant region with documented controls, data-residency guarantees, and a procurement path that fit a state budget cycle.
  - Single-tenant: 100%
  - Controls inherited: 9
  - Procurement fit: FY1
- Consumer AI: Frame Studio
20× throughput on Llama 3.1 70B at consumer-app price points.
A consumer creative app moved off a closed-model API onto a managed open-source endpoint, holding p99 latency under 50 ms at 11× lower cost.
  - Throughput: 20×
  - p99 latency: <50 ms
  - Lower cost: 11×
- Research: MIT-SAIL
Two papers, one shared cluster, $40K of credits.
An academic group used granted compute on B200s to run two ablation campaigns over a semester. Results were published with reproducible scripts in our open repo.
  - Papers: 2
  - Granted compute: $40K
  - Reproducible: 100%
Tell us what success looks like.
Most engagements start with a one-call scoping. We'll send a written plan with milestones within the same week.