ResearchToken Optimization ResearchTools
tools/ — runnable measurement instruments
tools/ — runnable measurement instruments
The dossier's measurements were made with the real Anthropic tokenizer and this machine's real session billing. These scripts make every such number reproducible without copy-pasting snippets out of the report prose.
| Script | What it measures | Example |
|---|---|---|
count_tokens.py | Real token count of any text / file / labeled sample set | python3 count_tokens.py samples reg.json |
image_tokens.py | Visual-token cost of images by size, across model families | python3 image_tokens.py 280x280 2000x2000 |
session_cost.py | Token-class + dollar decomposition of a session transcript | python3 session_cost.py |
Auth (read-only, secret-safe)
All three call the free POST /v1/messages/count_tokens endpoint, authenticated with the Claude
Code OAuth credential already on the machine (~/.claude/.credentials.json → claudeAiOauth.accessToken,
scope user:inference). The token is read at runtime and never printed. count_tokens bills no
inference, so re-running these is free. No ANTHROPIC_API_KEY is required or used.
Note: count_tokens rejects claude-fable-5 (HTTP 404, "use Opus 4.8"); Fable 5 and Opus 4.8 share
a tokenizer, so measure the Fable family on claude-opus-4-8.
Two traps these encode
- Tokenizer envelope ≈ 6–7 tokens per message (a 1-char message counts 7). Subtract it when comparing tiny strings; negligible for files.
- Transcript usage must be deduplicated by
message.id. Claude Code repeats the sameusageobject on every JSONL line of one API response (up to ~6 lines), so naively summing lines overcounts spend ~3×.session_cost.pydedups first.
count_tokens.py modes
count_tokens.py text <label> "<string>" # one string
count_tokens.py file <label> <path> # a file's contents
count_tokens.py samples <file.json> # [{"label","text"},...] -> TSV: label, tokens, chars, bytes, tok/100char