Signal — The AI Briefing That Shapes What Happens Next

Brief 01

Infrastructure

Open-source models have closed the capability gap — and the compression is accelerating.

MMLU Benchmark Score (5-shot): Closed vs. Open Frontier Models

GPT-4 class (closed), Jan 2024: 88%
Llama 3.1 405B (open), Jul 2024: 85%
DeepSeek-V3 (open), Dec 2024: 91%
Qwen2.5-Max (open), Feb 2025: 93%

The margin that justified closed-model API pricing has effectively collapsed. Qwen2.5-Max's February release, scoring 93.0 on MMLU, isn't a near-miss: it clears the 88% GPT-4-class baseline by five points, at inference costs that undercut OpenAI's o1 by a reported 97%. The practical implication for anyone building on closed APIs is not theoretical: the switching cost just became the only remaining moat, and switching costs erode.

What's less discussed is the compression dynamic underneath the headline numbers. Each open-source release is now arriving with full weight files, fine-tuning recipes, and distillation pipelines — meaning the next generation of smaller, faster, cheaper models is already baking in the oven. The capability gap isn't closing; it's inverting. The question for infrastructure teams in Q2 is not whether to evaluate open weights, but which evaluation framework to trust.

"The switching cost just became the only remaining moat — and switching costs erode."

Source: Qwen2.5 Technical Report, arXiv:2501.12599 · Feb 2026

Brief 02

Economics

Inference costs have fallen 850× in three years. The unit economics of AI products are being rewritten in real time.

Cost per 1M tokens: Closed (GPT-4 class) / Best Open Alternative (USD, approximate)

Q1 2023: $60 (closed only)
Q3 2023: $30 / $22
Q1 2024: $15 / $8
Q3 2024: $5 / $2.1
Q1 2025: $2.5 / $0.4
Feb 2026: $1.1 / $0.07
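The 850× headline can be checked against the first and last points of the series above. The snippet below is a minimal sketch of that arithmetic, assuming the chart values are taken at face value; the names `COST_PER_1M` and `fold_reduction` are illustrative and do not come from the cited source.

```python
# Chart values from this brief: (period, closed $/1M tokens, open $/1M tokens).
# None marks the quarter where the chart lists no open alternative.
COST_PER_1M = [
    ("Q1 2023", 60.0, None),
    ("Q3 2023", 30.0, 22.0),
    ("Q1 2024", 15.0, 8.0),
    ("Q3 2024", 5.0, 2.1),
    ("Q1 2025", 2.5, 0.4),
    ("Feb 2026", 1.1, 0.07),
]


def fold_reduction(start: float, end: float) -> float:
    """Return how many times cheaper `end` is than `start`."""
    return start / end


first_closed = COST_PER_1M[0][1]
last_closed = COST_PER_1M[-1][1]
last_open = COST_PER_1M[-1][2]

# Closed price in Q1 2023 versus closed price in Feb 2026.
closed_fold = fold_reduction(first_closed, last_closed)

# Closed price in Q1 2023 versus the best open alternative in Feb 2026.
headline_fold = fold_reduction(first_closed, last_open)

print(f"closed to closed: about {closed_fold:.0f}x")
print(f"closed (Q1 2023) to open (Feb 2026): about {headline_fold:.0f}x")
```

On these figures, the headline multiple only appears when the Q1 2023 closed price is compared with the Feb 2026 open price. The closed-to-closed decline over the same period is far smaller.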

When DeepSeek-R1 landed in January 2025 at $0.55 per million output tokens, it wasn't a pricing anomaly — it was the announcement that the inference market had structurally bifurcated. By February 2026, Cerebras-hosted open models are processing at $0.07 per million tokens for GPT-4-class quality. The startup that built its margin model on a 70% gross margin at $0.50/1K tokens is now looking at a world where the commodity floor is 40× lower.

The downstream effect isn't just on AI-native startups. Enterprise software companies embedding LLMs into existing products (the "AI wrapper" cohort) face a strange problem: their differentiation was partly the cost of switching, and the cost of switching just fell through the floor. The companies surviving this compression are the ones that locked in data flywheel advantages eighteen months ago. Everyone else is repricing their roadmap.

"The companies surviving this compression locked in data flywheel advantages eighteen months ago."

Source: SemiAnalysis, "The Inference Cost Collapse" · Jan 2026

Brief 03

Policy

The EU and US are now operating under incompatible AI legal frameworks. Compliance teams have 90 days before the first enforcement actions land.

The EU AI Act's prohibited practices provisions entered full force in February 2026. Simultaneously, the US Executive Order on AI Safety — signed under the previous administration — was partially rescinded, replacing binding compute-threshold reporting with voluntary commitments. The result: any company deploying AI systems in both jurisdictions now operates under directly contradictory obligations on transparency, explainability, and prohibited use cases. The GPAI Code of Practice deadline is March 15th. If your legal team hasn't mapped your model deployments to Article 6 risk classifications, the clock is already running.

Feb 2: EU AI Act enforcement begins
Mar 15: GPAI Code of Practice deadline
€35M: Max fine, prohibited practices

Source: EU AI Act, Article 5 — Prohibited Practices · Official Journal of the EU

The briefing your competitors already read.
