Blog

Engineering, methodology, and the work behind the numbers.

Long-form posts from the team that builds the platform. Each piece links to the source data, the repository, or the paper. We don't write thought-leadership.

Featured · Engineering

How we hit 20× throughput on DeepSeek V3 — kernel by kernel

A walk through the runtime work that took DeepSeek V3 serving from 14 tok/s to 290 tok/s on the same eight H200s, covering speculative decoding, MoE expert routing, and KV-cache eviction.

Ana Roy · Apr 18, 2026 · 18 min

Latest