Setting a hard spending limit for Claude Code sessions

The problem is pretty simple: Claude Code does not have a built-in session budget. Anthropic gives you usage controls, but those are account-wide and monthly. That is not the same thing as a claude code spending limit for one coding session, one task, or one afternoon of messing around with a repo.

What I wanted was much narrower. I wanted to say: this session gets $10, and after that it stops. Not "warn me later." Not "show me the monthly bill." Just stop the requests when the budget is gone. Basically a real claude code budget, enforced locally, right where the requests happen.

Important: Anthropic account limits are not per-session limits. A monthly or account-level cap does not protect you from a single Claude Code task running longer than you expected and burning through money before you notice.

This got annoying fast. You kick off a task, Claude Code keeps churning, you switch tabs or walk away, and you come back to a bill that is way higher than you meant to spend. Maybe it is $12. Maybe it is $30. Either way, the whole issue is that the spending control is too far away from the session itself.

So I built agent-bill-guard. It is a tiny local proxy, written in Python with the standard library only, that sits between Claude Code and the Anthropic API. It tracks token costs per request, shows a running cost counter in the terminal, and enforces a hard cap. When the session hits the cap, it returns an error response and blocks further API calls.

What a Claude Code spending limit actually means

When people search for something like claude code spending limit or claude code budget, I do not think they are asking how to manage their total account bill. Usually they want one of these:

Cap a single debugging session. Cap one autonomous task. Cap a risky refactor. Cap a weekend experiment. In other words, limit the blast radius of one run of the tool, not the entire billing account.

That distinction matters because the failure mode is local. The thing that hurts is not "my monthly spend is high." The thing that hurts is "this one session kept going and I did not notice in time."

How agent-bill-guard works

The setup is intentionally simple and local. Claude Code already talks to an API endpoint. Instead of pointing it straight at Anthropic, you point it at the proxy. The proxy forwards requests, reads the usage fields in responses, estimates cost, keeps a running total, and once the configured budget is exceeded, it starts rejecting further requests.

The command looks like this:

agent-bill-guard --listen 127.0.0.1:8787 --budget-usd 10

And the terminal output looks something like this while Claude Code is running:

[agent-bill-guard] listening on http://127.0.0.1:8787
[agent-bill-guard] session budget: $10.00

[request 1] input: 18234 tokens  output: 913 tokens
[request 1] estimated cost: $0.41  running total: $0.41 / $10.00

[request 2] input: 44102 tokens  output: 2207 tokens
[request 2] estimated cost: $0.97  running total: $1.38 / $10.00

[request 3] input: 128901 tokens  output: 6401 tokens
[request 3] estimated cost: $2.84  running total: $4.22 / $10.00

[request 4] input: 201338 tokens  output: 11092 tokens
[request 4] estimated cost: $4.46  running total: $8.68 / $10.00

[request 5] input: 73102 tokens  output: 3988 tokens
[request 5] estimated cost: $1.61  running total: $10.29 / $10.00

[agent-bill-guard] budget exceeded
[agent-bill-guard] blocking further API calls for this session

Tip: Set the budget a bit lower than your actual comfort limit. If you genuinely do not want to spend more than $10 on a task, configure the cap at $8 or $9 — cost estimation from token counts can drift slightly, so give yourself some headroom.

Setup

The flow is straightforward:

git clone https://github.com/paprika-org/agent-bill-guard
cd agent-bill-guard
python3 -m pip install .

agent-bill-guard --listen 127.0.0.1:8787 --budget-usd 10

Then point Claude Code at the proxy by setting ANTHROPIC_BASE_URL:

export ANTHROPIC_BASE_URL=http://127.0.0.1:8787
claude

That is the whole trick. Claude Code talks to the proxy, the proxy talks to Anthropic, and you get a hard stop when the budget runs out.

Limitations

A few honest caveats.

This does not integrate with Anthropic's billing API. It is not reading some canonical invoice from the source of truth. It estimates cost from the usage fields in API responses and the model pricing table baked into the proxy. That means it can be slightly off — usually close enough to be useful, but not accurate to the cent.

It is also per session, because that is the problem it was built to solve. If you want org-wide spend analytics, historical dashboards, or policy controls across multiple machines, this is not that. It is a local choke point that says "enough" when a single Claude Code run goes over budget.

Why this approach

Mostly because I do not trust myself to babysit cost in real time. If I am using an agentic coding tool, I am paying attention to the code, not watching token counters. A hard claude code budget is simpler. I set the number, start the session, and if the task wants more than that, it fails loudly rather than quietly spending more.

Repo: https://github.com/paprika-org/agent-bill-guard