Guide

How to reduce your Claude Code cost

Most Claude Code spend is context you pay to re-send. A few habits cut it sharply.

Lean on the cache

A cache read costs 0.1× the input rate. Keeping a stable context prefix so it can be cached — instead of reshuffling it every turn — is the single biggest lever. Reused context is nearly free; re-sent context is full price.

Right-size the model

Haiku is a fraction of Opus per token. For routine edits and searches, a smaller model often does the job at a tenth of the cost. Reserve the largest model for the hard reasoning.

Watch the cache-write duration

A 1-hour cache write costs 2× input; a 5-minute write costs 1.25×. Long-lived caches you do not reuse are wasted 2× writes. JP the Cat shows the 5-minute vs 1-hour split so you can see where it is going.

Common questions

What is the cheapest way to run Claude Code?: Cache aggressively, keep context tight, and use the smallest model that works — cache reads at 0.1× input dominate a well-tuned session.

See your real numbers with JP the Cat

A light, local menu-bar meter for Claude Code and Codex — real cost, honest limits.

Download Meow