Claude Code Cost Management: Hard Session Budgets for AI Coding Agents

Hard budget limits for AI coding agent sessions

A local proxy that sits in front of Claude Code and Codex CLI. When a session hits your spend cap, the next request is blocked — not warned about after the fact.

Python 3.8+ · no dependencies · Claude Code · Codex CLI · MIT license · v0.1


The problem

Workspace spend limits apply monthly, at the org level. They don't stop one developer from burning $40 in a runaway session. They don't give you per-session attribution. They don't let you set different caps for different agents or tasks.

The Anthropic console and Claude.ai subscription limits tell you what you spent. They don't stop you while it's happening.

agent-bill-guard fills that gap: a per-session circuit breaker you run locally, in front of your agent. No infrastructure. No accounts. No monthly cost.

What it looks like

# Start the proxy
$ python abg.py proxy
[abg] listening on http://127.0.0.1:8788
[abg] session budget: $5.00  daily budget: $20.00

# In another terminal: point Claude Code at the proxy
$ ANTHROPIC_BASE_URL=http://127.0.0.1:8788 claude

# As your session runs...
[abg] ALLOW session=default model=claude-sonnet-4-6 cost=$0.0184 session_total=$1.23
[abg] ALLOW session=default model=claude-sonnet-4-6 cost=$0.0219 session_total=$2.67
[abg] WARN session=default cost=$0.0291 session_total=$4.01 | Budget warning: session=$4.01/5.00
[abg] BLOCKED session=default | Session budget exhausted: $5.14 >= $5.00

5-minute setup

$ git clone https://github.com/paprika-org/agent-bill-guard
$ cd agent-bill-guard
$ cp agentbillguard.yaml.example agentbillguard.yaml
# edit: set session_budget_usd and daily_budget_usd
$ python abg.py proxy

How it works

The proxy intercepts every request to Anthropic or OpenAI, checks the running spend total for the session, and either allows or blocks the request before it hits the upstream API.
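As a rough illustration of that pre-request check, the sketch below picks allow, warn, or block from the running totals. The `Budget` class and `decide` function are hypothetical names, not taken from abg.py. The field names mirror the keys in agentbillguard.yaml, and applying the warn threshold to the daily budget as well as the session budget is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    # Field names mirror agentbillguard.yaml; the class itself is hypothetical.
    session_budget_usd: float = 5.0
    daily_budget_usd: float = 20.0
    warn_at: float = 0.8
    block_on_limit: bool = True


def decide(session_total, today_total, budget):
    """Return 'block', 'warn', or 'allow' for the next request."""
    exhausted = (
        session_total >= budget.session_budget_usd
        or today_total >= budget.daily_budget_usd
    )
    if exhausted:
        # In warn-only mode the request still goes through.
        return "block" if budget.block_on_limit else "warn"

    # Assumption: the warn threshold applies to both budgets.
    near_limit = (
        session_total >= budget.warn_at * budget.session_budget_usd
        or today_total >= budget.warn_at * budget.daily_budget_usd
    )
    return "warn" if near_limit else "allow"
```

Because a check like this only sees the cost of completed requests, the request that crosses the cap still goes through and the following one is blocked. That would be consistent with the "$5.14 >= $5.00" line in the sample log.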

After each request, it parses the response's usage field (input_tokens, output_tokens), estimates cost against a built-in model pricing table, and appends one line to ledger.jsonl.

# ledger.jsonl: one line per request (wrapped here for readability)
{"ts":"2026-04-19T14:22:01Z","session_id":"default","model":"claude-sonnet-4-6",
 "input_tokens":4821,"output_tokens":312,"cost_usd":0.019125,"session_total_usd":4.12,
 "action":"allow"}
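The accounting step could be sketched roughly as follows. The function names are hypothetical, and the prices are placeholders rather than values from the project's built-in pricing table. The record keys match the ledger example above.

```python
import json
from datetime import datetime, timezone

# Placeholder prices in USD per million tokens. These are illustrative,
# not the values from the project's built-in pricing table.
PRICES_PER_MTOK = {
    "example-model": {"input": 3.00, "output": 15.00},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    price = PRICES_PER_MTOK[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def build_ledger_entry(session_id, model, usage, session_total_before, action="allow"):
    """Build one ledger record from a response's usage field."""
    cost = estimate_cost(model, usage["input_tokens"], usage["output_tokens"])
    return {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "session_id": session_id,
        "model": model,
        "input_tokens": usage["input_tokens"],
        "output_tokens": usage["output_tokens"],
        "cost_usd": round(cost, 6),
        "session_total_usd": round(session_total_before + cost, 6),
        "action": action,
    }


def append_entry(path, entry):
    """Append the record to the ledger as a single JSON line."""
    with open(path, "a", encoding="utf-8") as ledger:
        ledger.write(json.dumps(entry, separators=(",", ":")) + "\n")
```

An append-only JSONL file keeps each write independent, so the ledger can be tailed or grepped while the proxy is still running.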

Check spend while running

$ curl http://127.0.0.1:8788/abg/status
{
  "sessions": {"default": 4.12, "bugfix-login": 1.22},
  "today_total_usd": 5.34
}
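A script could poll the same endpoint and report how much budget each session has left. This is a minimal sketch that assumes the response shape shown above. `fetch_status` and `remaining_by_session` are hypothetical helper names, and the session budget is passed in by the caller because the status payload shown here does not include it.

```python
import json
import urllib.request


def fetch_status(base_url="http://127.0.0.1:8788"):
    """GET /abg/status from a running proxy and return the raw JSON text."""
    with urllib.request.urlopen(base_url + "/abg/status", timeout=2) as resp:
        return resp.read().decode("utf-8")


def remaining_by_session(status_json, session_budget_usd):
    """Map each session id to its remaining budget in USD, floored at zero."""
    status = json.loads(status_json)
    return {
        session_id: round(max(session_budget_usd - spent, 0.0), 2)
        for session_id, spent in status["sessions"].items()
    }
```

With the proxy running, `remaining_by_session(fetch_status(), 5.0)` would give the per-session headroom against a $5.00 cap.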

Comparison

Feature	agent-bill-guard	LiteLLM	Portkey	Provider console
Per-session hard cap	✓	✗ (per-key)	✗ (per-key)	✗ (monthly)
Local, zero infra	✓	✗ requires server	✗ requires server	N/A
Per-request JSONL log	✓	✗ dashboard only	✗ dashboard only	✗
Setup time	~2 min	~30 min	~30 min	immediate
Multi-model routing	✗	✓	✓	✗
Team auth / virtual keys	✗	✓	✓	✓

Use LiteLLM or Portkey if you need org-wide routing, virtual keys, or a shared team proxy. Use agent-bill-guard if you need a per-session kill switch you can run locally in under 2 minutes.

Configuration

# agentbillguard.yaml
session_budget_usd: 5.0   # cap per coding session
daily_budget_usd: 20.0    # cap across all sessions per day
warn_at: 0.8              # warn at 80% of budget
block_on_limit: true      # false = warn-only mode
port: 8788
ledger_file: ledger.jsonl
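Since the tool advertises no dependencies, a flat file like this can be read without a YAML library. The sketch below is one way to do that, not necessarily how abg.py does it. `load_config` and `parse_scalar` are hypothetical names, the parser handles only single-level `key: value` lines with `#` comments, and the validation rules are assumptions.

```python
DEFAULTS = {
    "session_budget_usd": 5.0,
    "daily_budget_usd": 20.0,
    "warn_at": 0.8,
    "block_on_limit": True,
    "port": 8788,
    "ledger_file": "ledger.jsonl",
}


def parse_scalar(text):
    """Convert a raw value to bool, int, or float, falling back to str."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def load_config(text):
    """Parse flat 'key: value' lines over the defaults, then validate."""
    config = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("expected 'key: value', got: " + raw)
        config[key.strip()] = parse_scalar(value.strip())

    # Assumed sanity checks; the real tool may validate differently.
    if config["session_budget_usd"] <= 0 or config["daily_budget_usd"] <= 0:
        raise ValueError("budgets must be positive")
    if not 0 < config["warn_at"] <= 1:
        raise ValueError("warn_at must be in (0, 1]")
    return config
```

This simple comment stripping would also cut a value that contains a literal `#`, which is acceptable for the keys shown here but not for general YAML.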
