Infrastructure
Open-source models have closed the capability gap — and the compression is accelerating.
Figure: MMLU benchmark score (5-shot), closed vs. open frontier models. Plotted values: 88%, 85%, 91%, 93% (model labels not preserved). Open models in rust, closed models in charcoal.
The margin that justified closed-model API pricing has effectively collapsed. Qwen2.5-Max's February release, scoring 93.0 on MMLU, isn't a near-miss; it's a tie with the closed frontier, achieved at inference costs that undercut OpenAI's o1 by a reported 97%. The practical implication for anyone building on closed APIs is not theoretical: the switching cost just became the only remaining moat, and switching costs erode.
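To make the switching-cost argument concrete, here is a minimal back-of-the-envelope sketch of how quickly a one-time migration pays for itself. The function name and the dollar figures are hypothetical, and it assumes the reported 97% inference-cost reduction carries over directly to the monthly bill, which ignores hosting, evaluation, and engineering overhead.

```python
def payback_months(switching_cost: float,
                   monthly_api_spend: float,
                   cost_reduction: float = 0.97) -> float:
    """Months until a one-time switching cost is repaid by monthly savings.

    cost_reduction is the fractional drop in inference spend. The 0.97
    default is the reported figure cited above, not a measured value.
    """
    if not 0 < cost_reduction <= 1:
        raise ValueError("cost_reduction must be in (0, 1]")
    if monthly_api_spend <= 0:
        raise ValueError("monthly_api_spend must be positive")
    monthly_savings = monthly_api_spend * cost_reduction
    return switching_cost / monthly_savings


# Hypothetical team: $20,000/month on a closed API, $120,000 to migrate.
months = payback_months(switching_cost=120_000, monthly_api_spend=20_000)
print(f"Payback period: {months:.1f} months")  # prints "Payback period: 6.2 months"
```

Under these illustrative numbers the migration pays for itself in about six months. A smaller real-world saving lengthens that period proportionally, so the result is only as good as the cost-reduction figure fed into it.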
What's less discussed is the compression dynamic underneath the headline numbers. Each open-source release now arrives with full weight files, fine-tuning recipes, and distillation pipelines, which means the next generation of smaller, faster, cheaper models is already underway. The capability gap isn't closing; it's inverting. The question for infrastructure teams in Q2 is not whether to evaluate open weights, but which evaluation framework to trust.
Source: Qwen2.5 Technical Report, arXiv:2501.12599 · Feb 2026
