Developer Blog

Claude Code audit log: how to see what your agent actually did

Published April 20, 2026

I let Claude Code run for about 20 minutes on a refactor, came back, and realized I had no clean way to answer the basic question I actually cared about: what did it do? Not the summary. Not the vibe. The real sequence. Which files did it read, which commands did it run, what tools did it call, and in what order? That is the point where I started wanting a real claude code audit log, not just a chat transcript and a Git diff after the fact.

Git helps, but only at the end. It tells you what changed, not what the agent tried before it got there. Claude Code's session view helps while you are watching it live, but once the session is over the trail is thin. If you are doing solo work, maybe that is fine for a while. If you are using agents on a team, touching production scripts, or working in a repo with secrets, it stops being fine pretty fast.

I built agentcheck because I wanted something dead simple: hook into Claude Code's runtime events, log every meaningful action to a file, and keep that record around after the session disappears. No mystery. No hand waving. Just an audit trail I can grep, archive, review, and feed into other tooling later.

Why a claude code audit log is missing by default

This is not really a knock on Claude Code. It is mostly a product boundary issue. Claude Code is optimized for the interactive loop between developer and agent. You prompt, it thinks, it asks for tools, it edits files, you keep moving. That loop is fast because it is focused on the current session, not on persistent governance or historical review.

The problem is that session visibility is not the same thing as an audit trail. A real audit log needs to survive after the terminal closes. It needs structure. It needs to be machine-readable. It needs enough detail to answer cross-session questions like, "what commands did we let agents run this week?" or "when did an agent start touching files outside src/?" Claude Code does not try to be that system.

In practice, the built-in experience has three gaps if you care about review and compliance:

First, the logs are not a durable record you can depend on across sessions.

Second, there is no clean cross-session review flow for agent behavior over time.

Third, the output is not structured in a way that makes downstream analysis easy.

That last part matters more than people think. If all you have is conversational text, you can read it. You cannot reliably query it. You cannot filter by tool name, path, exit code, or timestamp without doing messy parsing work later.

What agentcheck adds to the claude code audit log story

agentcheck sits one layer above Claude Code's built-in settings.json rules. Those rules are still useful. You should keep using them. But rules alone are not observability. Blocking a command is one thing. Knowing what the agent actually attempted to do over time is a different job.
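For context, those built-in rules live in Claude Code's settings.json under a permissions block. A minimal deny-list might look something like this (the rule patterns here are illustrative; check the Claude Code settings documentation for the exact matcher syntax your version supports):

```json
{
  "permissions": {
    "deny": [
      "Bash(curl:*)",
      "Read(./.env)",
      "Read(./secrets/**)"
    ]
  }
}
```

That blocks actions, but it records nothing about what was attempted, which is exactly the gap an audit log fills.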

agentcheck hooks into Claude Code's PreToolUse and PostToolUse events. When the agent is about to run a command, read a file, write a file, or call another tool, agentcheck captures that event and appends a structured JSONL record to ~/.agentcheck/audit.log: one line per event. Easy to tail in another terminal. Easy to archive. Easy to ship somewhere else if you want.
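If you want a feel for the mechanism, a stripped-down hook in this shape is roughly the idea. This is a standalone sketch, not agentcheck's actual code: it assumes Claude Code's documented hook behavior of piping the event to the hook command as JSON on stdin, and field names like tool_name and tool_input should be verified against the hooks documentation for your version:

```python
#!/usr/bin/env python3
"""Minimal audit hook: append one JSONL record per tool event."""
import json
import os
import sys
import time

LOG_PATH = os.path.expanduser("~/.agentcheck/audit.log")

def make_record(event: dict) -> dict:
    """Flatten a hook event into a single audit-log record."""
    return {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "event": event.get("hook_event_name", "unknown"),
        "tool": event.get("tool_name"),
        "session_id": event.get("session_id"),
        "input": event.get("tool_input", {}),
        "cwd": event.get("cwd"),
    }

def main() -> None:
    raw = sys.stdin.read()          # Claude Code pipes the event JSON in
    if not raw.strip():
        return                      # no payload (e.g. run by hand); do nothing
    event = json.loads(raw)
    os.makedirs(os.path.dirname(LOG_PATH), exist_ok=True)
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(make_record(event)) + "\n")

if __name__ == "__main__":
    main()
```

You would register a script like this as a PreToolUse and PostToolUse hook command in settings.json; the matcher syntax is in the Claude Code hooks documentation.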

It can also flag anomalies in real time. I am being deliberate with that wording. This is not some all-seeing policy engine that understands your entire organization. Right now it is more practical than fancy. You can define checks and get alerted when the agent starts doing something that looks off, like touching sensitive paths or issuing unexpected shell commands. That alone is enough to catch a lot of the "wait, why is it doing that?" moments.
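A check like that sensitive-path rule can be a few lines of pattern matching over the command before it runs. A hypothetical sketch (the patterns and rule names here are illustrative, not agentcheck's actual rule set):

```python
import re

# Illustrative patterns; a real tool would make these configurable.
SENSITIVE_PATTERNS = [
    (r"\.env\b", "sensitive-file-pattern"),
    (r"(?:^|[\s/])secrets?/", "sensitive-file-pattern"),
    (r"\bcurl\b.*\|\s*(?:ba)?sh\b", "pipe-to-shell"),
]

def check_command(command: str) -> list[dict]:
    """Return one anomaly record per rule the command matches."""
    findings = []
    for pattern, rule in SENSITIVE_PATTERNS:
        if re.search(pattern, command):
            findings.append({
                "event": "anomaly",
                "rule": rule,
                "severity": "warn",
                "message": f"command matched rule {rule!r}",
            })
    return findings
```

Each finding gets written to the same JSONL log, so alerts and raw events end up in one place.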

{"ts":"2026-04-20T14:03:11.482Z","event":"PreToolUse","tool":"Bash","session_id":"cc_8d17","input":{"command":"find . -name '*.env'"},"cwd":"/workspace/app","status":"pending"}
{"ts":"2026-04-20T14:03:11.641Z","event":"anomaly","rule":"sensitive-file-pattern","severity":"warn","message":"Command targets possible secret files","session_id":"cc_8d17"}
{"ts":"2026-04-20T14:03:12.209Z","event":"PostToolUse","tool":"Bash","session_id":"cc_8d17","result":{"exit_code":0,"duration_ms":727},"cwd":"/workspace/app"}
{"ts":"2026-04-20T14:03:14.004Z","event":"PostToolUse","tool":"Write","session_id":"cc_8d17","input":{"path":"src/refactor/api.ts"},"result":{"bytes_written":1842}}

That is the core value. I do not want to reconstruct the story from memory. I want the story written down as it happens.
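And because every line is standalone JSON, cross-session questions become short scripts instead of parsing projects. A sketch, assuming the record shape shown above (in practice you would read the lines from ~/.agentcheck/audit.log rather than an inline sample):

```python
import json

def failed_commands(log_lines):
    """Yield PostToolUse Bash events that exited nonzero."""
    for line in log_lines:
        rec = json.loads(line)
        if (rec.get("event") == "PostToolUse"
                and rec.get("tool") == "Bash"
                and rec.get("result", {}).get("exit_code", 0) != 0):
            yield rec

# Inline sample standing in for lines read from the audit log.
sample = [
    '{"event":"PostToolUse","tool":"Bash","result":{"exit_code":1}}',
    '{"event":"PostToolUse","tool":"Bash","result":{"exit_code":0}}',
    '{"event":"PostToolUse","tool":"Write","result":{"bytes_written":10}}',
]
print(len(list(failed_commands(sample))))  # one failing Bash call in the sample
```

The same pattern answers the earlier questions: filter by tool, by path prefix, by timestamp window, by session_id.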

What this helps with in real developer workflows

The obvious use case is forensics after a weird run. The agent touched something it should not have, or a refactor went sideways, or a test fixture got rewritten in a way nobody expected. With a proper claude code audit log, I can trace the behavior without guessing.

It also helps with basic team trust. If multiple developers are using Claude Code in the same codebase, "what happened?" should not require screenshots and vague recollections. A JSONL log gives you a common source of truth. Not perfect truth. But good enough to review real behavior instead of debating impressions.

Then there is the compliance angle. Some teams need a record of agent activity because they are touching regulated code, internal tooling, or systems with stricter change controls. agentcheck does not solve compliance by itself. I do not want to pretend it does. But it gives you the raw runtime evidence that those workflows usually need.

What agentcheck does not do yet

I want to be straight about this. agentcheck is not a complete security boundary. It is not a sandbox. It is not magical prevention. If you need airtight isolation, you still need OS-level controls, repository permissions, network restrictions, and sane Claude Code configuration.

It also does not yet give you every possible analysis view out of the box. Right now the foundation is the persistent log and the runtime checks. That is intentional. I would rather have a boring log file that is always there than a glossy dashboard built on top of incomplete data.

Still, for the original problem, it does the job. I wanted to know what Claude Code actually did. Now I can look at ~/.agentcheck/audit.log and answer that without playing detective.

If you are looking for a practical way to add a claude code audit log to your workflow, start with agentcheck here: https://github.com/paprika-org/agentcheck.