Intelligence Is Deflating: 938× Cheaper in Three Years

GPT-4-class intelligence got 938× cheaper in three years. Intelligence is deflating faster than any technology input in history.

I built a pipeline on Bonacci Studio that snapshots live AI pricing, capability benchmarks, and GPU rental costs, then asks a simple question: if you hold capability fixed, what does it cost over time?

Hold Capability Fixed

Hold it at GPT-4's intelligence, its 2023 benchmark score, and track the cheapest model that still clears that bar:

Mar 2023 GPT-4 $37.50 / M tokens

May 2024 GPT-4o $7.50

Sep 2024 Qwen2.5 72B $0.37

Aug 2025 gpt-oss-20B $0.088

Mar 2026 Qwen3.5 2B $0.04

The capability that defined the frontier in 2023 is now served by a 2-billion-parameter model at four cents per million tokens.

938×

cost reduction in 3 years

13×

price spread, same open model

54×

API cheaper than self-hosting

Three Things the Data Made Sharper

The collapse is uneven. The frontier holds its price ($3.40/M median); the efficient tier is where the floor falls out ($0.45/M). Buying the best model has never been cheap. Buying last year's best has never been cheaper.
Price is now a property of the seller, not the model. The same open model (DeepSeek V3.2) is served across 12 providers at a 13× price spread. That's commoditization happening in real time.
The API price war has out-competed your own hardware. Convert Vast.ai GPU rents into $/token and self-hosting open weights costs 9× to 54× the cheapest API. You'd need impossible (>100%) GPU utilization to break even. Inference is now cheaper than the silicon.

The Pipeline That Keeps Running

The interesting part isn't the snapshot, it's that the pipeline keeps running. Scheduled ingestion, change detection that flags the moment a provider cuts a price, an append-only store that becomes a proprietary dataset the longer it runs. History you can't buy after the fact.

Intelligence is deflating ~10× a year at constant capability. The question for anyone building on it isn't "is it getting cheaper", it's "am I capturing the deflation, or pricing as if it stopped."

What you're seeing below: the full pricing pipeline walkthrough , live model snapshot ingestion, capability crosswalk, cost-per-unit analysis, and provider spread charts. Built and run entirely in Bonacci Studio.

studio.bonacci.thinkingdbx.com, Economics of Intelligence

#AI #LLM #DataEngineering #BonacciStudio #MLOps #thinkingdbx

Intelligence Is Deflating:938× Cheaper in Three Years

Hold Capability Fixed

Three Things the Data Made Sharper

The Pipeline That Keeps Running

Intelligence Is Deflating:
938× Cheaper in Three Years