tools/ — runnable measurement instruments

The dossier's measurements were made with the real Anthropic tokenizer and this machine's real session billing. These scripts make every such number reproducible without copy-pasting snippets out of the report prose.

Script	What it measures	Example
`count_tokens.py`	Real token count of any text / file / labeled sample set	`python3 count_tokens.py samples reg.json`
`image_tokens.py`	Visual-token cost of images by size, across model families	`python3 image_tokens.py 280x280 2000x2000`
`session_cost.py`	Token-class + dollar decomposition of a session transcript	`python3 session_cost.py`

Auth (read-only, secret-safe)

All three call the free POST /v1/messages/count_tokens endpoint, authenticated with the Claude Code OAuth credential already on the machine (~/.claude/.credentials.json → claudeAiOauth.accessToken, scope user:inference). The token is read at runtime and never printed. count_tokens bills no inference, so re-running these is free. No ANTHROPIC_API_KEY is required or used.

Note: count_tokens rejects claude-fable-5 (HTTP 404, "use Opus 4.8"); Fable 5 and Opus 4.8 share a tokenizer, so measure the Fable family on claude-opus-4-8.

Two traps these encode

Tokenizer envelope ≈ 6–7 tokens per message (a 1-char message counts 7). Subtract it when comparing tiny strings; negligible for files.
Transcript usage must be deduplicated by message.id. Claude Code repeats the same usage object on every JSONL line of one API response (up to ~6 lines), so naively summing lines overcounts spend ~3×. session_cost.py dedups first.

count_tokens.py modes

count_tokens.py text &lt;label&gt; "&lt;string&gt;" # one string
count_tokens.py file &lt;label&gt; &lt;path&gt; # a file's contents
count_tokens.py samples &lt;file.json&gt; # [{"label","text"},...] -> TSV: label, tokens, chars, bytes, tok/100char

tools/ — runnable measurement instruments

tools/ — runnable measurement instruments

Auth (read-only, secret-safe)

Two traps these encode

count_tokens.py modes

On this page