agent-bill-guard

Hard budget limits for AI coding agent sessions

A local proxy that sits in front of Claude Code and Codex CLI. When a session hits your spend cap, the next request is blocked — not warned about after the fact.

Python 3.8+ · no dependencies · Claude Code · Codex CLI · MIT license · v0.1

The problem

Workspace spend limits apply monthly, at the org level. They don't stop one developer from burning $40 in a runaway session. They don't give you per-session attribution. They don't let you set different caps for different agents or tasks.

The Anthropic console and Claude.ai subscription limits tell you what you spent. They don't stop you while it's happening.

agent-bill-guard fills that gap: a per-session circuit breaker you run locally, in front of your agent. No infrastructure. No accounts. No monthly cost.

What it looks like

# Start the proxy
$ python abg.py proxy
[abg] listening on http://127.0.0.1:8788
[abg] session budget: $5.00 daily budget: $20.00

# In another terminal: point Claude Code at the proxy
$ ANTHROPIC_BASE_URL=http://127.0.0.1:8788 claude

# As your session runs...
[abg] ALLOW session=default model=claude-sonnet-4-6 cost=$0.0184 session_total=$1.23
[abg] ALLOW session=default model=claude-sonnet-4-6 cost=$0.0219 session_total=$2.67
[abg] WARN session=default cost=$0.0291 session_total=$4.01 | Budget warning: session=$4.01/5.00
[abg] BLOCKED session=default | Session budget exhausted: $5.14 >= $5.00

5-minute setup

$ git clone https://github.com/paprika-org/agent-bill-guard
$ cd agent-bill-guard
$ cp agentbillguard.yaml.example agentbillguard.yaml
# edit: set session_budget_usd and daily_budget_usd
$ python abg.py proxy

How it works

The proxy intercepts every request to Anthropic or OpenAI, checks the running spend total for the session, and either allows or blocks the request before it hits the upstream API.
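The pre-request check described above can be sketched as a small pure function. This is an illustration only, not the code in abg.py: the name `decide`, its parameters, and its return values are assumptions made for this sketch, with defaults that mirror the keys in agentbillguard.yaml.

```python
def decide(session_total, daily_total, session_budget=5.0, daily_budget=20.0,
           warn_at=0.8, block_on_limit=True):
    """Return 'block', 'warn', or 'allow' for the next request (hypothetical helper)."""
    over = session_total >= session_budget or daily_total >= daily_budget
    if over:
        # In warn-only mode the request is still forwarded, but flagged.
        return "block" if block_on_limit else "warn"
    near = (session_total >= warn_at * session_budget
            or daily_total >= warn_at * daily_budget)
    return "warn" if near else "allow"


print(decide(1.23, 1.23))   # well under both caps
print(decide(4.01, 4.01))   # past the warning fraction of the session cap
print(decide(5.14, 5.14))   # session cap reached
```

Because a check like this runs before the request is forwarded, the request that crosses the cap still completes and the following one is refused, which would explain the `$5.14 >= $5.00` line in the sample log.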

After each request, it parses the response's usage field (input_tokens, output_tokens), estimates cost against a built-in model pricing table, and appends one line to ledger.jsonl.
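A minimal sketch of that post-request bookkeeping, under stated assumptions: the price table below is made up for illustration (it is not the tool's built-in table), `estimate_cost` and `ledger_line` are hypothetical names, and `session_total_usd` is assumed to include the request being logged. The record uses the same keys as the sample ledger entry on this page.

```python
import json
from datetime import datetime, timezone

# Hypothetical prices in USD per million tokens; NOT the tool's real table.
PRICES = {"example-model": {"input": 3.00, "output": 15.00}}


def estimate_cost(model, usage, prices=PRICES):
    """Estimate the cost of one request from the response's usage field."""
    p = prices[model]
    return (usage["input_tokens"] * p["input"]
            + usage["output_tokens"] * p["output"]) / 1_000_000


def ledger_line(session_id, model, usage, session_total, action="allow"):
    """Build one JSONL record; session_total is the spend before this request."""
    cost = estimate_cost(model, usage)
    record = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "session_id": session_id,
        "model": model,
        "input_tokens": usage["input_tokens"],
        "output_tokens": usage["output_tokens"],
        "cost_usd": round(cost, 6),
        # Assumption: the running total includes the request being logged.
        "session_total_usd": round(session_total + cost, 6),
        "action": action,
    }
    return json.dumps(record, separators=(",", ":"))


usage = {"input_tokens": 1000, "output_tokens": 200}
line = ledger_line("default", "example-model", usage, session_total=1.0)
print(line)
# Appending is then a plain file write, for example:
# with open("ledger.jsonl", "a") as f:
#     f.write(line + "\n")
```

Keeping the estimate separate from the file write makes the arithmetic easy to check in isolation.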

# ledger.jsonl — one line per request
{"ts":"2026-04-19T14:22:01Z","session_id":"default","model":"claude-sonnet-4-6","input_tokens":4821,"output_tokens":312,"cost_usd":0.019125,"session_total_usd":4.12,"action":"allow"}

Check spend while running

$ curl http://127.0.0.1:8788/abg/status
{
  "sessions": {"default": 4.12, "bugfix-login": 1.22},
  "today_total_usd": 5.34
}
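The same endpoint can be read from a script. The sketch below uses only the standard library; `fetch_status` and `over_threshold` are hypothetical helper names, and the payload shape is taken from the curl output above rather than from a documented schema.

```python
import json
from urllib.request import urlopen


def fetch_status(base_url="http://127.0.0.1:8788"):
    """GET /abg/status from a running proxy and decode the JSON body."""
    with urlopen(base_url + "/abg/status", timeout=5) as resp:
        return json.load(resp)


def over_threshold(status, limit_usd):
    """Return the session ids whose spend is at or above limit_usd."""
    return sorted(s for s, total in status["sessions"].items() if total >= limit_usd)


# Sample payload copied from the curl output, so this runs without a proxy.
sample = {"sessions": {"default": 4.12, "bugfix-login": 1.22}, "today_total_usd": 5.34}
print(over_threshold(sample, 4.0))
```

With a proxy running, `over_threshold(fetch_status(), 4.0)` would apply the same check to live totals.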

Comparison

Feature                    agent-bill-guard   LiteLLM             Portkey             Provider console
Per-session hard cap       ✓                  ✗ (per-key)         ✗ (per-key)         ✗ (monthly)
Local, zero infra          ✓                  ✗ requires server   ✗ requires server   N/A
Per-request JSONL log      ✓                  ✗ dashboard only    ✗ dashboard only
Setup time                 ~2 min             ~30 min             ~30 min             immediate
Multi-model routing        ✗                  ✓                   ✓
Team auth / virtual keys   ✗                  ✓                   ✓

Use LiteLLM or Portkey if you need org-wide routing, virtual keys, or a shared team proxy. Use agent-bill-guard if you need a per-session kill switch you can run locally in under 2 minutes.

Configuration

# agentbillguard.yaml
session_budget_usd: 5.0 # cap per coding session
daily_budget_usd: 20.0 # cap across all sessions per day
warn_at: 0.8 # warn at 80% of budget
block_on_limit: true # false = warn-only mode
port: 8788
ledger_file: ledger.jsonl
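Since the tool advertises no dependencies, a flat file like this can be read without a YAML library. How abg.py actually loads its config is not shown here; the sketch below is one possible approach, with `parse_flat_config` as a hypothetical name, and it only handles single-level `key: value` lines with optional trailing comments.

```python
def parse_flat_config(text):
    """Parse 'key: value  # comment' lines into a dict (flat files only)."""
    config = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if value.lower() in ("true", "false"):
            config[key] = value.lower() == "true"
            continue
        # Try int, then float, and fall back to the raw string.
        for cast in (int, float):
            try:
                config[key] = cast(value)
                break
            except ValueError:
                pass
        else:
            config[key] = value
    return config


example = """
# agentbillguard.yaml
session_budget_usd: 5.0 # cap per coding session
warn_at: 0.8
block_on_limit: true
port: 8788
ledger_file: ledger.jsonl
"""
cfg = parse_flat_config(example)
# Session warning threshold in USD: the budget times the warn_at fraction.
print(cfg["session_budget_usd"] * cfg["warn_at"])
```

A parser this small rejects nothing and ignores nesting, so it is only suitable for a config as simple as the one shown.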

From the blog

My Claude Code session burned $47 in 12 minutes (and how I stopped it) — the incident that led to this tool, and why built-in warnings aren't the same as a hard stop.

Claude Code Hooks: What They Can (and Can't) Do for Cost Control — hooks are great for behavioral governance; here's why you still need a proxy for spend tracking.

Setting a hard spending limit for Claude Code sessions — account-level limits don't protect you from a single runaway session; here's how to enforce a per-session cap locally.

Claude Code Token Usage: How to See Tokens During a Session — hooks don't expose token counts; here's how a local proxy reads usage.input_tokens from the raw API response.

Claude Code Rate Limits: What They Are and How to Stay Under Them — agentic loops and growing session history hit rate limits faster than expected; here's how live token tracking helps you stay under them deliberately.

Tell us how you're using agent-bill-guard

Takes 30 seconds. Helps us build the right things.

Capping Claude Code on my laptop · Protecting a CI/CD Codex pipeline · Guarding a production Codex agent · Tracking token usage across sessions

Or open a GitHub issue instead.