51 releases in 14 days, optimized for token savings, not speed
34% of tokens were free. Ollama, Codex, and the Claude subscription handled over a third of the workload at zero marginal cost. The remaining 66% went to cheap models (Gemini Flash, GPT-4o-mini) chosen for their low per-token pricing. No tokens were wasted on expensive models when a cheaper alternative could do the job.
94% savings vs baseline: If this project had defaulted to Claude Opus or GPT-4o for every call, token consumption would have exceeded 300M tokens. By routing intelligently (free → cheap → expensive), we kept it at 22.6M. That's architectural discipline, not luck.
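The routing policy itself is simple to express. Below is a minimal Python sketch of the free → cheap → expensive cascade; the tier assignments, per-token prices, and the complexity heuristic are illustrative assumptions, not the project's actual code.

```python
# Hypothetical sketch of a tiered model router (free -> cheap -> expensive).
# Tier contents, prices, and thresholds are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    models: list[str]
    cost_per_mtok: float  # marginal USD per million tokens (0 = local/subscription)

TIERS = [
    Tier("free", ["ollama/llama3", "claude (subscription)", "codex"], 0.0),
    Tier("cheap", ["gemini-flash", "gpt-4o-mini"], 0.15),       # assumed price
    Tier("expensive", ["claude-opus", "gpt-4o"], 5.00),          # assumed price
]

def route(task_complexity: float) -> Tier:
    """Pick the cheapest tier believed capable of the task.

    task_complexity is a 0..1 score from whatever heuristic you trust
    (prompt length, tool use, need for long-range reasoning, ...).
    """
    if task_complexity < 0.4:
        return TIERS[0]   # local/subscription models handle routine work
    if task_complexity < 0.8:
        return TIERS[1]   # cheap hosted models for mid-weight tasks
    return TIERS[2]       # only hard tasks escalate to expensive models

if __name__ == "__main__":
    for score in (0.2, 0.6, 0.9):
        tier = route(score)
        print(f"complexity={score:.1f} -> {tier.name}: {tier.models[0]}")
```

The design choice is that escalation is the exception, not the default: every request starts at the free tier and only moves up when the cheaper tier is judged insufficient, which is what keeps the expensive models' share of total tokens near zero.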