Skip to content

Guide

How to reduce your Claude Code cost

Most Claude Code spend is context you pay to re-send. A few habits cut it sharply.

Lean on the cache

A cache read costs 0.1× the input rate. Keeping a stable context prefix so it can be cached — instead of reshuffling it every turn — is the single biggest lever. Reused context is nearly free; re-sent context is full price.

Right-size the model

Haiku is a fraction of Opus per token. For routine edits and searches, a smaller model often does the job at a tenth of the cost. Reserve the largest model for the hard reasoning.

Watch the cache-write duration

A 1-hour cache write costs 2× input; a 5-minute write costs 1.25×. Long-lived caches you do not reuse are wasted 2× writes. JP the Cat shows the 5-minute vs 1-hour split so you can see where it is going.

Common questions

What is the cheapest way to run Claude Code?
Cache aggressively, keep context tight, and use the smallest model that works — cache reads at 0.1× input dominate a well-tuned session.

See your real numbers with JP the Cat

A light, local menu-bar meter for Claude Code and Codex — real cost, honest limits.

Download Meow

One tiny note from JP

Want JP email updates?

Get release notes, launch notes, and the occasional useful purr. No spam, no fluff, unsubscribe anytime.

Email is optional. Enjoy!