Token Efficiency: The Real ROI

51 releases in 14 days, optimized for token savings rather than speed

| Metric | Value |
|---|---|
| Total dev tokens | 22.6M |
| Savings vs baseline | 94% |
| Memory observations | 2,788 |
| API calls total | 9,100 |
Token Consumption by Model

🟢 Free Tier (Tokens)

| Model | Tokens | Share |
|---|---|---|
| Ollama (local) | 2.1M | 9% |
| Codex (prepaid) | 4.5M | 20% |
| Claude (subscription) | 1.2M | 5% |
| **Free total** | **7.8M** | **34%** |

💰 Paid Tier (Tokens)

| Model | Tokens | Share |
|---|---|---|
| Gemini Flash | 5.2M | 23% |
| GPT-4o-mini | 2.8M | 12% |
| GPT-4o | 1.1M | 5% |
| **Paid total** | **14.8M** | **66%** |
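
The tier split above is just bookkeeping over per-call token counts. A minimal sketch of that tally in Python, assuming call logs shaped like the dicts below; the model names, log fields, and `FREE_MODELS` set are hypothetical, not the project's actual schema:

```python
from collections import defaultdict

# Hypothetical tier assignment: models with zero marginal cost vs. metered ones.
FREE_MODELS = {"ollama-local", "codex-prepaid", "claude-sub"}

def tally_by_tier(calls):
    """Sum prompt + completion tokens per model, then bucket into tiers."""
    per_model = defaultdict(int)
    for call in calls:
        per_model[call["model"]] += call["prompt_tokens"] + call["completion_tokens"]

    total = sum(per_model.values()) or 1  # avoid division by zero on empty logs
    tiers = {"free": 0, "paid": 0}
    for model, tokens in per_model.items():
        tiers["free" if model in FREE_MODELS else "paid"] += tokens

    return {
        "per_model": dict(per_model),
        "free_share": tiers["free"] / total,
        "paid_share": tiers["paid"] / total,
    }

# Example: one free call and one paid call.
calls = [
    {"model": "ollama-local", "prompt_tokens": 900, "completion_tokens": 100},
    {"model": "gemini-flash", "prompt_tokens": 1500, "completion_tokens": 500},
]
print(tally_by_tier(calls)["free_share"])  # 1000 / 3000 ≈ 0.33
```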

📊 What This Means

34% of tokens were free. Ollama, Codex prepaid credits, and a Claude subscription handled over a third of the workload at zero marginal cost. The remaining 66% went mostly to cheap models (Gemini Flash, GPT-4o-mini) chosen for their low per-token pricing. No tokens were spent on expensive models when a cheaper alternative could do the job.
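
A minimal sketch of what that free → cheap → expensive ladder can look like; the tier lists, model names, and the `can_handle` capability check are illustrative assumptions, not the project's actual router:

```python
# Hypothetical three-tier ladder, cheapest first.
TIERS = [
    ["ollama-local", "codex-prepaid", "claude-sub"],  # free: zero marginal cost
    ["gemini-flash", "gpt-4o-mini"],                  # cheap: low per-token price
    ["gpt-4o"],                                       # expensive: last resort
]

def route(prompt: str, can_handle) -> str:
    """Walk the ladder cheapest-first; escalate only when a tier can't do the job.

    `can_handle(model, prompt)` is an assumed predicate, e.g. a heuristic on
    task complexity or a quick validation of the model's draft answer.
    """
    for tier in TIERS:
        for model in tier:
            if can_handle(model, prompt):
                return model      # stop at the cheapest adequate model
    return TIERS[-1][-1]          # fall through to the most capable model
```

The design point is that escalation is explicit: a call only reaches a metered model after every zero-cost option has been ruled out.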

94% savings vs baseline: If this project had defaulted to Claude Opus or GPT-4o for every call, token consumption would have exceeded 300M tokens. Routing intelligently (free → cheap → expensive) kept it at 22.6M. That's architectural discipline, not luck.
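
The arithmetic behind that figure: a 94% reduction on 22.6M actual tokens implies a baseline of roughly 22.6M / (1 − 0.94) ≈ 377M tokens, which is consistent with the "300M+" estimate above. A quick check:

```python
actual = 22.6e6
baseline = 377e6          # implied by the stated 94% savings; "300M+" in the text
savings = 1 - actual / baseline
print(f"{savings:.0%}")   # -> 94%
```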