Guide
How to reduce your Claude Code cost
Most Claude Code spend is context you pay to re-send. A few habits cut it sharply.
Lean on the cache
A cache read costs 0.1× the input rate. Keeping the context prefix stable so it can be cached, rather than reshuffling it every turn, is the single biggest lever. Reused context is nearly free; re-sent context is full price.
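To put a rough number on that, here is a minimal sketch that compares re-sending a prefix on every turn with caching it once and reading it back. It uses the 0.1× read rate above and the 1.25× five-minute write rate quoted later in this guide. The function name, the 50,000-token prefix, and the 20-turn session are made up for illustration, and the model is simplified: the prefix does not grow between turns, and every read is assumed to arrive before the cache expires.

```python
# Multipliers relative to the base input-token rate, as quoted in this guide.
CACHE_READ = 0.10       # reading a cached prefix
CACHE_WRITE_5M = 1.25   # writing a 5-minute cache entry


def session_cost(prefix_tokens, turns, cached):
    """Cost of sending the same prefix on every turn, in input-token equivalents.

    Hypothetical helper: assumes a fixed-size prefix and, when cached,
    that every later turn hits the cache before it expires.
    """
    if turns < 1:
        raise ValueError("turns must be at least 1")
    if not cached:
        # Every turn pays the full input rate for the whole prefix.
        return prefix_tokens * turns
    # The first turn writes the cache; the remaining turns read it.
    return prefix_tokens * (CACHE_WRITE_5M + CACHE_READ * (turns - 1))


full = session_cost(50_000, 20, cached=False)
warm = session_cost(50_000, 20, cached=True)
print(f"re-sent: {full:,.0f}  cached: {warm:,.0f}  ratio: {warm / full:.1%}")
```

Under these assumptions the cached session costs about 16% of the re-sent one. Real sessions will differ, since the context usually grows and some reads miss the cache.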
Right-size the model
Haiku costs a fraction of what Opus does per token. For routine edits and searches, a smaller model often does the job at roughly a tenth of the cost. Reserve the largest model for hard reasoning.
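As a back-of-the-envelope check, the sketch below estimates relative spend when part of the work moves to a smaller model. The 0.10 ratio is this guide's rough "a tenth" figure, not a quoted price, and the function name and the traffic shares are hypothetical.

```python
def blended_cost(small_share, small_ratio=0.10):
    """Relative spend compared with running everything on the largest model.

    small_share: fraction of tokens handled by the smaller model (0 to 1).
    small_ratio: smaller model's price as a fraction of the largest model's
                 (0.10 is an assumed round figure, not a published rate).
    """
    if not 0.0 <= small_share <= 1.0:
        raise ValueError("small_share must be between 0 and 1")
    return (1.0 - small_share) + small_share * small_ratio


for share in (0.0, 0.5, 0.8):
    print(f"{share:.0%} on the smaller model -> {blended_cost(share):.2f}x")
```

With that assumed ratio, routing half the tokens to the smaller model brings spend to about 0.55× and routing 80% brings it to about 0.28×. Check current per-model pricing before relying on the exact ratio.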
Watch the cache-write duration
A 1-hour cache write costs 2× input; a 5-minute write costs 1.25×. A long-lived cache you never reuse is a wasted 2× write. JP the Cat shows the 5-minute vs 1-hour split so you can see where your cache-write spend is going.
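Which duration is cheaper depends on how far apart your requests land. The sketch below is a simplified model, not a description of how any tool bills: it assumes a cache hit refreshes the expiry timer and a miss triggers a fresh write at the same duration. The function name and the two gap patterns are invented for illustration; only the 0.1×, 1.25×, and 2× multipliers come from this guide.

```python
CACHE_READ = 0.10
WRITE_MULT = {5: 1.25, 60: 2.00}  # cache duration in minutes -> write multiplier


def prefix_cost(gaps_min, ttl_min):
    """Total cost multiplier for one prefix across a series of requests.

    gaps_min: minutes elapsed between consecutive requests.
    ttl_min:  cache duration, 5 or 60.
    Assumes a hit refreshes the timer and a miss pays the write again.
    """
    cost = WRITE_MULT[ttl_min]            # the first request writes the cache
    for gap in gaps_min:
        if gap <= ttl_min:
            cost += CACHE_READ            # hit: cheap read
        else:
            cost += WRITE_MULT[ttl_min]   # miss: the cache expired, write again
    return cost


busy = [1, 2, 1, 3]         # rapid-fire turns
bursty = [10, 20, 15, 30]   # long pauses between turns
for name, gaps in (("busy", busy), ("bursty", bursty)):
    five, hour = prefix_cost(gaps, 5), prefix_cost(gaps, 60)
    print(f"{name}: 5-minute {five:.2f}x, 1-hour {hour:.2f}x")
```

In this model the rapid-fire session is cheaper with 5-minute writes (1.65× vs 2.40×), while the session with long pauses is cheaper with 1-hour writes (2.40× vs 6.25×), because every pause longer than five minutes forces another 1.25× write.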
Common questions
- What is the cheapest way to run Claude Code?
- Cache aggressively, keep context tight, and use the smallest model that works — cache reads at 0.1× input dominate a well-tuned session.
See your real numbers with JP the Cat
A light, local menu-bar meter for Claude Code and Codex — real cost, honest limits.