# tools/ — runnable measurement instruments (https://jackin.tailrocks.com/research/token-optimization/tools/)


# tools/ — runnable measurement instruments [#tools--runnable-measurement-instruments]

The dossier's measurements were made with the real Anthropic tokenizer and this machine's real
session billing. These scripts make every such number reproducible without copy-pasting snippets out
of the report prose.

| Script            | What it measures                                           | Example                                     |
| ----------------- | ---------------------------------------------------------- | ------------------------------------------- |
| `count_tokens.py` | Real token count of any text / file / labeled sample set   | `python3 count_tokens.py samples reg.json`  |
| `image_tokens.py` | Visual-token cost of images by size, across model families | `python3 image_tokens.py 280x280 2000x2000` |
| `session_cost.py` | Token-class + dollar decomposition of a session transcript | `python3 session_cost.py`                   |

## Auth (read-only, secret-safe) [#auth-read-only-secret-safe]

All three call the free `POST /v1/messages/count_tokens` endpoint, authenticated with the Claude
Code OAuth credential already on the machine (`~/.claude/.credentials.json` → `claudeAiOauth.accessToken`,
scope `user:inference`). &#x2A;*The token is read at runtime and never printed.** `count_tokens` bills no
inference, so re-running these is free. No `ANTHROPIC_API_KEY` is required or used.

Note: `count_tokens` rejects `claude-fable-5` (HTTP 404, "use Opus 4.8"); Fable 5 and Opus 4.8 share
a tokenizer, so measure the Fable family on `claude-opus-4-8`.

## Two traps these encode [#two-traps-these-encode]

* **Tokenizer envelope ≈ 6–7 tokens** per message (a 1-char message counts 7). Subtract it when
  comparing tiny strings; negligible for files.
* **Transcript usage must be deduplicated by `message.id`.** Claude Code repeats the same
  `usage` object on every JSONL line of one API response (up to \~6 lines), so naively summing lines
  overcounts spend \~3×. `session_cost.py` dedups first.

## count\_tokens.py modes [#count_tokenspy-modes]

```
count_tokens.py text &lt;label&gt; "&lt;string&gt;" # one string
count_tokens.py file &lt;label&gt; &lt;path&gt; # a file's contents
count_tokens.py samples &lt;file.json&gt; # [{"label","text"},...] -> TSV: label, tokens, chars, bytes, tok/100char
```