Month 1

Sequenced for one operator; the governing split is automatic beats disciplined — anything that depends on remembering loses to anything that is a default. Each item names its automation vehicle (hook / skill / plugin / settings / jackin') or is flagged DISCIPLINE with its decay risk.

TL;DR

Day 1 is configuration, not behavior: fix the double-registered hooks, pin subagent models, add two CLAUDE.md guard-lines, stop mid-session model switches. ~1 hour, no quality risk.
Week 1 instruments before it optimizes: OTel token dashboard + the harness screening run, then effort tiering and register choice with canaries, because those two carry the quality risk.
Month 1 moves the savings into infrastructure: context editing via SDK paths, routing flips validated per task class, and the jackin' token-pack so every launched container is born optimized — discipline converted to defaults at launch.rs / RoleManifest / CapsuleConfig.
Standing rule: re-audit after every Claude Code release — the fixed prefix once grew ~70k tokens in 5 days (GH #45188); platform drift dwarfs micro-optimizations.

Day 1 (≈1 hour, all CLAUDE-CODE-TODAY, zero quality risk)

Action	Vehicle	Effect (file)
Deduplicate caveman hooks (plugin registers them; remove the manual copies in `~/.claude/settings.json`)	settings	−966 tok/session-start, −118 tok/prompt (02 §3)
Pin exploration subagents: `model: haiku` (or `sonnet`) + `effort: low` in agent frontmatter	agent files	fan-out ÷13 effective (16)
Add two CLAUDE.md lines: "Prefer Edit over Write on existing files; never rewrite a file to change a few lines" and "Reference `file:line` instead of restating code; no post-edit summaries beyond one line"	CLAUDE.md (cached at 0.1x)	−20% visible output (15 §2–3)
Adopt session habits: model/effort chosen before the session, not mid-way; `/clear` between unrelated tasks; let auto-compact work	DISCIPLINE (decays — move to jackin' pack in month 1)	avoided cache busts ≈ $0.43/each (16, 13)
If you pinned effort max: drop to high	settings	overthinking ↓, cost ↓ (15 §1, T1)
Verify MCP servers stay deferred (ToolSearch) and remove unused servers entirely	settings	keeps 95.8% schema cut (12, 02 §4)

Explicitly not Day 1: register compression beyond what you already run, effort below high, any context-clearing automation — those need Week 1's instrumentation first.

Week 1 (instrument → screen → tier)

Stand up measurement (01 §4): CLAUDE_CODE_ENABLE_TELEMETRY=1, OTLP →ny collector, panel on claude_code.token.usage by type and agent.name, plus claude_code.cost.usage. Without this, every later claim about your own savings is folklore. (Hook/infra; hours.)
Run the harness screening arm (31): n=12 tasks + 6 canaries on your current setup — this is the baseline every later change is judged against. (Script; ~an evening of wall-clock, mostly unattended.)
Effort tiering with canaries (15 §1): /effort medium for routine work, high for hard tasks; re-run canaries; record the thinking-share delta from your dashboard (closes the biggest open measurement in the dossier). DISCIPLINE until month 1 bakes per-task defaults.
Register decision (10/02): caveman-ultra ≈58.5% of visible prose at real caveat risk — keep, or swap to plain terse-instruction (15 §3) which captures most of the win with less misread risk. Decide on canary results, not vibes. Skip wenyan: no token advantage over ultra (02 §5).
Structured-output sidecars (15 §5): any scripted extraction/classification moves to schema-constrained SDK calls — one stable schema per thread (cache pitfall).

Month 1 (SDK + infrastructure tier)

Context editing on long-running SDK/headless flows (18/14): clear_tool_uses_20250919 with clear_at_least sized above the re-write cost; assert via applied_edits and cache fields. Claude Code interactive already auto-compacts; don't double-manage.
Routing flip, validated: advisor pattern first (T1-backed, NEGATIVE-COST), then Sonnet-main + escalation per task class that passes n=30 confirmation (30 §3 — the 5–6.6x lives or dies here).
Batch lane for nightly sweeps (audits, doc regen, triage): Batch API + caching stack (reads at 0.05x effective; 13/18).
The jackin' token-pack — discipline becomes infrastructure (20 §jackin; Explore-mapped insertion points, this repo):

env defaults into the container env assembly at crates/jackin-runtime/src/runtime/launch.rs:590-717 (effort tier, telemetry on, attribution header policy);
[token_policy] block in jackin.role.toml (RoleManifest, crates/jackin-core/src/manifest.rs) → per-role model/effort/register defaults, hooks installed via the existing runtime_setup.rs symlink pattern;
CapsuleConfig (crates/jackin-protocol/src/lib.rs:25-45) → /jackin/run/agent.toml → build_agent_command so every spawned agent gets model/effort flags without operator action;
OTLP plumbing already propagates (launch.rs:694-723) — extend the existing traces with the token dashboard so each launched capsule reports its class-split spend.
Add the compression-hygiene linter (20): CI fails when always-loaded context (root AGENTS.md chain, role files) grows past a token budget — measured with the free count_tokens endpoint in CI.

Re-audit cadence: pin a monthly (and post-release) re-run of 02's baseline audit script — prefix size, MCP schema mass, hook payloads. Platform drift is the biggest unmanaged variable (12 §TL;DR, GH #45188).

Automatic vs disciplined — the full split

Automatic (set once)	Vehicle
Hook dedup, MCP defer, subagent model/effort pins, CLAUDE.md guards, output register, OTel, batch lane, token-pack defaults, CI token-budget linter	settings / frontmatter / plugin / jackin'

Disciplined (decays without automation)	Decay risk → mitigation
No mid-session model/effort switches	high → jackin' picks per-task model at launch
`/clear` between tasks; task-boundary routing	medium → orchestrator session-per-task
Edit-not-Write adherence	low (model follows CLAUDE.md) → keep the guard-line
Effort tier per task	high → role-file defaults + advisor escalation
Canary re-runs after each stack change	high → schedule it (cron) or it will not happen

Verification ledger

Claim	Basis
Hook duplication numbers	02 §3
Fan-out ÷13, advisor numbers, 9-turn switch break-even	16, accessed/derived
Edit/restatement savings	15
Context-editing mechanics + cache trade	18/14, platform docs
jackin' insertion points	local Explore agent code-map of this repo
GH #45188 prefix growth	github.com/anthropics/claude-code/issues/45188
Stack multiples referenced	30-composed-stacks.md arithmetic

32 — Adoption Roadmap: Day 1 / Week 1 / Month 1

32 — Adoption Roadmap: Day 1 / Week 1 / Month 1

TL;DR

Day 1 (≈1 hour, all CLAUDE-CODE-TODAY, zero quality risk)

Week 1 (instrument → screen → tier)

Month 1 (SDK + infrastructure tier)

Automatic vs disciplined — the full split

Verification ledger

On this page