My Claude Code session burned $47 in 12 minutes (and how I stopped it)

I opened my laptop after lunch, saw an Anthropic billing email, and had that specific sinking feeling you get when you already know it’s bad before you read the number. One Claude Code session had chewed through $47 in about 12 minutes. I wasn’t doing anything exotic. I’d pointed it at a large monorepo, asked it to trace a nasty issue across services, and somewhere along the way it got stuck in a loop with a huge context window. That was it.

The setup was painfully normal. Big repo. Lots of files. A prompt that sounded reasonable. Claude kept revisiting the same areas, re-reading code, trying slightly different approaches, and dragging around something like 200k tokens of context while doing it. I had another terminal open. I wasn’t watching usage in real time because, honestly, who stares at a token meter while debugging? I noticed when the bill showed up.

If you searched for claude code cost or claude code runaway because this happened to you too, yes, this is that post. Not a theoretical concern. Not budgeting advice from someone who never got burned. I got burned.

Why the built-in limits didn’t save me

The obvious response is: shouldn’t Claude Code already have some limits? Sort of. It has usage visibility. It has some controls. What it does not give me is the thing I actually wanted in that moment: a hard stop before more money leaves my account.

That distinction matters. A warning is not a stop. A dashboard is not a stop. Telling me usage after the expensive calls already happened is definitely not a stop.

When an agent gets into a bad loop, cost ramps faster than you expect. Especially on a large codebase. A few heavy requests turn into a dozen. The context grows. Retries happen. Tool calls stack up. You tell yourself it’s probably making progress. Then it isn’t. Then your inbox tells you what that assumption cost.

I’m not the only one. There’s a widely shared thread about a $350 Claude Code bill, which is exactly the kind of story that made me stop treating this as user error and start treating it as an engineering problem. See: github.com/anthropics/claude-code/issues/50589.

That thread was weirdly comforting and also infuriating. Comforting because it meant I wasn’t uniquely careless. Infuriating because it confirmed the failure mode is real: agentic coding tools can keep spending while you’re busy doing actual work.

So I built the thing I wanted

I ended up building agent-bill-guard because I wanted the control point outside the CLI. Not another dashboard tab. Not another promise that I’d check usage more often. I wanted a dumb, reliable gate in front of model traffic with an actual budget limit.

That’s what it is: a local proxy that sits between your coding agent and the model API. It tracks spend and cuts off requests when the configured budget is hit. If a session starts running hot because the agent is looping on a monorepo or ballooning context, it stops there instead of politely continuing to bill you.

I built it for Claude Code, but the reason I care also applies to the broader category. If you’re looking for a codex cli budget control, the same principle holds: don’t rely on the client to be the only thing enforcing cost discipline. Put a meter and a kill switch in the path.

# Clone and set up (Python stdlib only, no dependencies)
git clone https://github.com/paprika-org/agent-bill-guard
cd agent-bill-guard
cp agentbillguard.yaml.example agentbillguard.yaml
# Edit agentbillguard.yaml: set session_budget_usd to whatever hurts

# Start the local budget proxy
python abg.py proxy
# [abg] listening on http://127.0.0.1:8788
# [abg] session budget: $5.00  daily budget: $20.00

# In another terminal, route Claude Code through it
ANTHROPIC_BASE_URL=http://127.0.0.1:8788 claude

That snippet is the core idea. Start the proxy. Set a budget. Route Claude Code through it. If the session tries to blow past the limit, it gets blocked. Annoying? Maybe, in the same way a circuit breaker is annoying when it trips. Expensive surprises are more annoying.

What it does not magically solve

I don’t want to oversell this, because the limitations are real.

First, it’s a local proxy, not a hosted service. That means you run it yourself. If you want something fully managed with centralized dashboards and policy controls across a team, this isn’t that.

Second, you have to remember to route traffic through it. If you launch Claude Code directly against the API without the proxy in the middle, obviously it can’t protect you. I built this because I wanted a simple guardrail on my own machine, not because I solved every enterprise spend problem.

Third, it can stop runaway cost, but it can’t make a bad prompt good. If you send an agent into a giant repo with vague instructions and it thrashes for ten minutes before the budget cap hits, you still lost ten minutes. The point is that you didn’t also lose another $80.

That’s the tradeoff I care about. I can live with interrupted sessions. I can’t justify invisible spend spikes as the normal cost of using coding agents.

My Claude Code session burned $47 in 12 minutes (and how I stopped it)

Why the built-in limits didn’t save me

So I built the thing I wanted

What it does not magically solve

Related posts

Related posts