Claude Token Orchestrator
Under-the-hood mechanics of `jackin workspace claude-token` — PTY capture, 1Password write, config wiring, validation
This page is for contributors. Operators do not need to read it to use jackin workspace claude-token; the command reference and the authentication guide cover the operator-visible surface.
What lives here: the load-bearing details and deliberate hacks that keep the operator's view simple. Read it before changing the orchestrator's pipeline, the PTY redactor, the 1Password write contract, the post-write validation, or the config-persistence ordering — each of those decisions encodes a non-obvious safety property.
What the orchestrator does
jackin workspace claude-token glues five primitives together to take a workspace from "no token configured" to "OAuth-token mode active and end-to-end validated", with the operator never seeing the token value:
- Probe: verify the upstream `claude` CLI is on `PATH` and capture its version. See `host_claude::probe_claude_cli` in `crates/jackin-env/src/host_claude.rs`.
- Capture: drive `claude setup-token` interactively under a PTY. The operator completes the OAuth flow in their browser; the long-lived token is captured into `secrecy::SecretString`. Every line written to the operator's stderr passes through a redactor that swaps the token span for `<redacted>` so the URL and prompts still display, but the token never echoes. See `host_claude::capture_setup_token` in `crates/jackin-env/src/host_claude.rs`.
- Write: push the token into a new 1Password item via `op item create -`. The JSON template (item title, category, tags, `notesPlain` provenance stamp, and the field value) lands on the child's stdin; the secret never crosses argv. See `OpWriteRunner::item_create` in `crates/jackin-env/src/resolve.rs`.
- Validate: re-read the just-written value through the same `op://` reference and compare its SHA-256 prefix against what was captured. A vault-routing surprise would otherwise leave a wired-but-broken slot pointing at an item the operator never minted. On mismatch the orphan is best-effort deleted and the run aborts with no on-disk config changed.
- Persist: only after validation succeeds, the workspace's `[claude].auth_forward = "oauth_token"` and `[env].CLAUDE_CODE_OAUTH_TOKEN = "op://..."` are written through `ConfigEditor` (comment-preserving). An expiry stamp is then cached locally so the launch banner can surface "expires in N days".
This file documents each step's load-bearing details, the failure modes, and the test seams.
Module map
| What | File |
|---|---|
| PTY capture, ANSI redactor, version probe | `crates/jackin-env/src/host_claude.rs` |
| Orchestrator state machines (`run_setup` / `run_revoke` / `run_doctor`) | `crates/jackin-env/src/token_setup.rs` |
| `OpWriteRunner` trait + `OpCli` impl (1P CLI driver) | `crates/jackin-env/src/resolve.rs` |
| CLI dispatch (`handle_claude_token`, rotate cleanup) | `crates/jackin/src/app.rs` |
| clap subcommand surface | `crates/jackin/src/cli/workspace.rs` |
| Container provisioning of the OAuth-token onboarding skeleton | `crates/jackin-runtime/src/instance/auth.rs` |
| Launch-time mount selection per `forward_auth` | `crates/jackin-runtime/src/runtime/launch.rs` |
| Expiry banner formatter | `crates/jackin/src/tui.rs` |
PTY capture
claude setup-token is interactive: it opens a browser for OAuth consent and reads keystrokes (paste codes, ENTER, Ctrl-C). It also queries the parent terminal via DA1 / XTVERSION escape sequences, expecting cooked-mode line buffering to be off so the responses flow back into its stdin. Capturing its output through a plain Command::new(...).output() pipe breaks every one of those contracts.
The orchestrator therefore runs claude setup-token under a pseudo-terminal:
- A PTY pair is allocated via `portable-pty`. The child sees a real terminal on stdin / stdout.
- The host's terminal is switched into raw mode for the lifetime of the capture so individual keystrokes reach the child byte-for-byte. A `RawModeGuard` RAII type restores cooked mode on drop, including on panic via stack unwind. The operator's terminal must not be left in raw mode after a crash, so the `Drop` impl logs a recovery hint (`stty sane`) on failure rather than swallowing the error.
- A worker thread pumps the operator's stdin into the PTY master. A real read error mid-flow (BrokenPipe, EIO, terminal detach) surfaces a `[jackin] warning: stdin pump terminated mid-flow: …` before the thread exits; without that notice, `claude` appears to hang while the operator's keystrokes are dropped on the floor.
- The parent reads from the PTY master and forwards each `\n`-terminated line through a redactor (next section). Tail bytes without a trailing newline are flushed when the child exits.
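The RAII restore pattern from the list above can be sketched with the standard library alone. This is a minimal illustration, not the project's `RawModeGuard`: the `RestoreGuard` name, the closure-based `restore` hook, and the warning wording are all assumptions made for the example. The real guard would call into the terminal API instead of a closure.

```rust
use std::io;

/// Hypothetical stand-in for a raw-mode guard. `restore` is whatever puts the
/// terminal back into cooked mode (a termios call in real code).
pub struct RestoreGuard<F: FnMut() -> io::Result<()>> {
    restore: F,
}

impl<F: FnMut() -> io::Result<()>> RestoreGuard<F> {
    pub fn new(restore: F) -> Self {
        Self { restore }
    }
}

impl<F: FnMut() -> io::Result<()>> Drop for RestoreGuard<F> {
    fn drop(&mut self) {
        // Drop cannot return an error, so the failure is reported together
        // with a recovery hint instead of being swallowed.
        if let Err(err) = (self.restore)() {
            eprintln!("[jackin] warning: terminal restore failed: {err}; run `stty sane`");
        }
    }
}
```

Because `Drop` runs during stack unwinding, the restore hook fires on a panic as well as on a normal return, which is the property the capture path relies on.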
A PTY read error mid-capture kills the child, drains stderr, and bails with "any captured token must be considered compromised; re-run setup". The token may have been partially emitted into the operator's view; treating it as live would be unsafe.
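The read loop described above (forward complete lines, flush the tail at exit, surface read errors to the caller) might look roughly like the following. The function name `pump_lines` and the callback shape are hypothetical; the sketch reads from any `Read` rather than a real PTY master.

```rust
use std::io::{self, Read};

/// Reads `source` to EOF, calling `on_line` for each complete `\n`-terminated
/// line and once more for any unterminated tail. A read error is returned to
/// the caller, which decides how to treat a partially emitted token.
pub fn pump_lines<R: Read>(mut source: R, mut on_line: impl FnMut(&[u8])) -> io::Result<()> {
    let mut pending: Vec<u8> = Vec::new();
    let mut chunk = [0u8; 4096];
    loop {
        let n = match source.read(&mut chunk) {
            Ok(0) => break, // EOF
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        pending.extend_from_slice(&chunk[..n]);
        // Emit every complete line currently buffered; keep the remainder.
        while let Some(pos) = pending.iter().position(|&b| b == b'\n') {
            let line: Vec<u8> = pending.drain(..=pos).collect();
            on_line(&line);
        }
    }
    if !pending.is_empty() {
        on_line(&pending); // tail bytes without a trailing newline
    }
    Ok(())
}
```

Buffering bytes rather than strings means a chunk boundary that falls inside a multi-byte UTF-8 codepoint is harmless: the line is only handed on once its terminating newline has arrived.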
Token redaction on the way to stderr
The redactor in forward_redacted_line scans each line for the sk-ant-oat01- prefix (TOKEN_PREFIX). On match it captures the token bytes — alphanumerics or hyphens — into Option<String>, then writes the line to stderr with the matched span replaced by the literal text <redacted>. The redactor:
- Only captures the first token. Subsequent matches are still redacted in the operator-visible output but do not overwrite the captured value. This prevents a "header line announces the token + body line repeats it" upstream re-design from silently swapping which value gets stored.
- Walks past ANSI / VT escape sequences inside the token. The upstream CLI splits the 108-character token across two visual rows using cursor-down / cursor-position CSIs; a naive stop-at-first-control redactor captured 79 characters and produced "API Error: 401" at next launch. The hand-rolled `skip_ansi_escape` handles CSI (`\x1b[`), OSC (`\x1b]`), and bare two-byte escapes, which is enough for the cursor-movement sequences upstream emits between the prefix and the token body. DCS / SOS / PM / APC are not handled; if upstream ever uses them, swap to `vte` (the file's hand-roll comment names it as the canonical alternative).
- Operates on bytes, not strings. PTY chunks may arrive mid-UTF-8-codepoint; we line-buffer into a `Vec<u8>` and only forward complete `\n`-terminated lines, so a partial codepoint can never reach the prefix scanner mid-byte.
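A byte-level redactor with the three properties listed above could be sketched as follows. This is an illustration under assumptions, not the code in `host_claude.rs`: the `redact_line` name and signature are hypothetical, and this version drops any escape sequences that sit inside or directly after the token span along with the token itself.

```rust
const TOKEN_PREFIX: &[u8] = b"sk-ant-oat01-";
const REDACTED: &[u8] = b"<redacted>";

/// Length of the escape sequence starting at `bytes[0]`, or 0 if there is none.
/// Covers CSI, OSC, and bare two-byte escapes only.
fn skip_ansi_escape(bytes: &[u8]) -> usize {
    if bytes.first() != Some(&0x1b) || bytes.len() < 2 {
        return 0;
    }
    match bytes[1] {
        // CSI: ESC [ parameters, ended by a final byte in 0x40..=0x7e.
        b'[' => bytes
            .iter()
            .enumerate()
            .skip(2)
            .find(|(_, b)| (0x40..=0x7e).contains(*b))
            .map_or(bytes.len(), |(i, _)| i + 1),
        // OSC: ESC ] payload, ended by BEL or ESC \.
        b']' => {
            let mut i = 2;
            while i < bytes.len() {
                if bytes[i] == 0x07 {
                    return i + 1;
                }
                if bytes[i] == 0x1b && bytes.get(i + 1) == Some(&b'\\') {
                    return i + 2;
                }
                i += 1;
            }
            bytes.len()
        }
        _ => 2, // bare two-byte escape
    }
}

/// Returns `line` with every token span replaced by `<redacted>`.
/// Only the first token seen is stored in `captured`.
pub fn redact_line(line: &[u8], captured: &mut Option<String>) -> Vec<u8> {
    let mut out = Vec::with_capacity(line.len());
    let mut i = 0;
    while i < line.len() {
        if !line[i..].starts_with(TOKEN_PREFIX) {
            out.push(line[i]);
            i += 1;
            continue;
        }
        let mut token = String::from_utf8_lossy(TOKEN_PREFIX).into_owned();
        i += TOKEN_PREFIX.len();
        loop {
            let skipped = skip_ansi_escape(&line[i..]);
            if skipped > 0 {
                i += skipped; // walk past the escape, keep collecting
            } else if i < line.len() && (line[i].is_ascii_alphanumeric() || line[i] == b'-') {
                token.push(char::from(line[i]));
                i += 1;
            } else {
                break;
            }
        }
        out.extend_from_slice(REDACTED);
        if captured.is_none() {
            *captured = Some(token);
        }
    }
    out
}
```

Feeding it a line where cursor-movement CSIs split the token shows the escape skipper at work: the stored value is the joined token, while the forwarded line carries only the placeholder.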
The captured token is wrapped in secrecy::SecretString before returning. SecretString's Debug impl prints "[REDACTED]", so even an inadvertent tracing::debug!("{secret:?}") cannot leak the value.
1Password write contract
The token never crosses argv. OpWriteRunner::item_create serialises the item template — title, category, tags, notesPlain provenance stamp, and the field value — to a single JSON payload, then spawns op item create --vault <id> --format json - and writes the payload on the child's stdin. The trailing - tells upstream op to read the template from stdin.
Reasoning: argv is visible via /proc/<pid>/cmdline to any local unprivileged reader; the op item create --field value=<token> form would expose the token for the lifetime of the op process. Stdin is not visible the same way.
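The stdin-only contract can be illustrated with `std::process::Command`. The argv below mirrors the invocation described in this section; the helper names `build_create_command` and `run_with_stdin` are hypothetical, and the payload is treated as opaque bytes rather than a real item template.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Builds the `op item create` invocation. Only non-secret values go on argv;
/// the trailing `-` asks `op` to read the template from stdin.
pub fn build_create_command(vault_id: &str) -> Command {
    let mut cmd = Command::new("op");
    cmd.args(["item", "create", "--vault", vault_id, "--format", "json", "-"])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped());
    cmd
}

/// Spawns `cmd`, writes `payload` to the child's stdin, and returns its stdout.
pub fn run_with_stdin(mut cmd: Command, payload: &[u8]) -> io::Result<Vec<u8>> {
    let mut child = cmd.spawn()?;
    {
        let mut stdin = child
            .stdin
            .take()
            .ok_or_else(|| io::Error::new(io::ErrorKind::Other, "child stdin was not piped"))?;
        stdin.write_all(payload)?;
    } // stdin is dropped here, so the child sees EOF
    let output = child.wait_with_output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "child exited with failure"));
    }
    Ok(output.stdout)
}
```

A test can assert on `Command::get_args` to check that no secret ever lands on the argument list, without spawning `op` at all.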
In-place field edit — item_field_set
OpWriteRunner::item_field_set(item_id, vault_id, field_id, field_label, value, section) writes a new token value into a specific field of an existing 1Password item without touching any other fields or metadata. The pure JSON transform lives in the testable apply_field_edit helper. It follows the same never-on-argv contract as item_create via a two-step GET → piped-template-EDIT sequence:
- GET: `op item get <id> --vault <vault> --format json` fetches the full item as a `serde_json::Value`. The raw bytes are decoded as generic JSON (not into a typed struct) so all unrecognised properties are preserved verbatim.
- Modify in-process: when `field_id` is `Some` (overwriting a field the picker resolved), the `fields` array is matched on that exact op id, the field's `value` is replaced, its `type` is set to `CONCEALED`, and its existing `section` is left untouched. Overwriting a value must never re-parent the field, and a same-labeled field in another section is never clobbered. When `field_id` is `None` (appending), the array is matched by `field_label`; if no match a new field object (`{ id, label, type: "CONCEALED", value }`) is appended, placed in `section` when one is supplied (its slug is added to the item's top-level `sections` array if absent). Section placement and registration happen only on the append path.
- piped-template-EDIT: the modified JSON is serialised and piped to stdin of `op item edit <item_id> --vault <vault> --format json`. `op item edit` documents this `cat updated.json | op item edit <item>` form: the item is named by the positional id, and the full JSON template is read from stdin (the `--template` flag is the file-based alternative and is mutually exclusive with piped input, so it is not used). The token value therefore rides in stdin, never in `argv` or `/proc/<pid>/cmdline`. The positional MUST be the real item id: passing `-` makes `op` treat `-` as the item name (the stdin sentinel is an `op item create` convention, not `op item edit`).
- Reference extraction: the JSON returned by `op item edit` is scanned for the target field by exact `field_id` (overwrite) or by label (case-insensitive, append); the resulting `op://` reference is built from UUIDs (vault `id`, item `id`, field `id`) so it stays stable across renames.
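The overwrite-versus-append rule from the modify step can be modelled on plain structs. This is a simplified sketch: the real transform works on `serde_json::Value`, while the `Field` struct, the `edit_fields` name, the error on an unknown id, the exact-match label comparison, and reusing the label as the new field's id are all assumptions made for illustration. Section registration on the item is left out.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub id: String,
    pub label: String,
    pub kind: String, // stands in for the JSON "type" property
    pub value: String,
    pub section: Option<String>,
}

pub fn edit_fields(
    fields: &mut Vec<Field>,
    field_id: Option<&str>,
    field_label: &str,
    value: &str,
    section: Option<&str>,
) -> Result<(), String> {
    match field_id {
        // Overwrite path: match on the exact id, never touch `section`.
        Some(id) => {
            let field = fields
                .iter_mut()
                .find(|f| f.id == id)
                .ok_or_else(|| format!("no field with id {id}"))?;
            field.value = value.to_string();
            field.kind = "CONCEALED".to_string();
        }
        // Append path: reuse a field with the same label, else push a new one.
        None => match fields.iter_mut().find(|f| f.label == field_label) {
            Some(field) => {
                field.value = value.to_string();
                field.kind = "CONCEALED".to_string();
            }
            None => fields.push(Field {
                id: field_label.to_string(), // assumption: id derived from label
                label: field_label.to_string(),
                kind: "CONCEALED".to_string(),
                value: value.to_string(),
                section: section.map(str::to_string),
            }),
        },
    }
    Ok(())
}
```

Matching by id on the overwrite path is what keeps a same-labeled field in another section from being clobbered: two fields may share a label, but only one carries the id the picker resolved.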
This method is used by the --interactive setup path when the operator selects an existing 1Password item: the same OAuth token capture flow runs, but instead of creating a fresh item via item_create, the captured value is written in-place via item_field_set, leaving all other fields and item metadata untouched.
Rotation (the day-to-day jackin workspace claude-token rotate path) continues to use item_create + item_delete — a new item is always created, validated, wired, and only then the prior item is deleted. item_field_set is only for the interactive adopt-existing-item case where the operator explicitly chose a specific field to overwrite.
Stdout safety on parse
op item create echoes the created item back as JSON, including a fields[*].value for every field. Stripping the secret from a free-form JSON walk is fragile, so the orchestrator deserialises into RawCreatedItem, a struct that deliberately omits the value field. Because serde ignores unknown fields by default, the secret is discarded at the deserialisation boundary; the rest of the code path only ever sees ids, labels, and references.
If the JSON shape ever drifts (upstream renames a field, returns empty fields), the error message lists labels / ids only and points the operator at "the item may have been created but its layout is unrecognised; inspect or delete by hand". The fallback never names the value.
Account pinning
OpCli::with_account(Some(id)) pins every subprocess invocation to op --account <id>. The orchestrator constructs one OpCli per run via the op_cli_for_scope(config, scope, explicit) -> OpCli helper which folds the rule "explicit --op-account flag wins over the account recorded on the scope's stored op:// reference (OpRef::account)". The same helper is used by every run_* entry point, so account resolution lives in one place. Its interactive flag picks the timeout: the write paths (run_setup, mint_token_value, rotate) pass true for OpCli::new_interactive() (a 5-minute budget) because the op item create / op item edit write may block on a biometric or SSO unlock the operator completes in a browser; read-only run_doctor and run_revoke pass false for the default 30-second ceiling so a locked or stalled op fails fast instead of hanging a quick health check. item_create / item_field_set stamp the run's pinned account onto the OpRef they return, so the wired env value records the account it was created under; reads later rebind to that per-ref account via OpRunner::read_with_account.
OpWriteRunner::item_delete accepts a per-call account override that wins over the pinned OpCli::account — used by the orphan-cleanup path so a stale account context never causes a delete to land in the wrong vault.
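The two rules in this section (explicit flag beats stored account; interactive runs get the longer timeout) reduce to a few lines. The function names below are hypothetical and the sketch works on plain strings rather than `OpCli` or `OpRef`.

```rust
use std::time::Duration;

/// An explicit `--op-account` value wins over the account recorded on the
/// scope's stored `op://` reference.
pub fn resolve_account(explicit: Option<&str>, stored_ref_account: Option<&str>) -> Option<String> {
    explicit.or(stored_ref_account).map(str::to_owned)
}

/// Write paths may wait on a biometric or SSO unlock, so they get a 5-minute
/// budget; read-only checks fail fast after 30 seconds.
pub fn op_timeout(interactive: bool) -> Duration {
    if interactive {
        Duration::from_secs(5 * 60)
    } else {
        Duration::from_secs(30)
    }
}
```

Keeping both rules in one helper means each `run_*` entry point only has to state whether it is interactive.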
Post-write validation and OrphanCleanup
This is the load-bearing safety net. After a successful item_create, the orchestrator does not persist any config. Instead it re-reads the item through the same op:// reference the writer returned, computes a SHA-256 prefix of the resolved value, and compares it against the prefix of the captured token.
If the comparison succeeds, persistence proceeds. If it fails (or the read itself errors), the orchestrator must do two things:
- Leave config untouched. A wired slot pointing at an item the operator never minted would silently inject a mystery token at the next launch. The persistence step is gated behind the validation result for exactly this reason.
- Best-effort delete the orphan. The just-created 1P item is live; abandoning it would accumulate dangling secrets in the operator's vault.
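The gate described above (read back, compare fingerprints, persist only on a match, otherwise try to delete the orphan) can be sketched against a small trait. Everything named here is hypothetical: the `Vault` trait, `FakeVault`, and `validate_then_persist` are illustration only. The standard library has no SHA-256, so `fingerprint` substitutes `DefaultHasher`; the orchestrator itself compares SHA-256 prefixes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in fingerprint (NOT SHA-256; swapped in to stay std-only).
fn fingerprint(value: &str) -> String {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

pub trait Vault {
    fn read(&self, op_ref: &str) -> Result<String, String>;
    fn delete(&mut self, op_ref: &str) -> Result<(), String>;
}

/// Runs `persist` only if the value behind `op_ref` matches `captured`.
/// On mismatch or read failure, attempts a best-effort orphan delete.
pub fn validate_then_persist<V: Vault>(
    vault: &mut V,
    op_ref: &str,
    captured: &str,
    persist: impl FnOnce(&str),
) -> Result<(), String> {
    let matches = match vault.read(op_ref) {
        Ok(resolved) => fingerprint(&resolved) == fingerprint(captured),
        Err(_) => false,
    };
    if matches {
        persist(op_ref);
        return Ok(());
    }
    let cleanup = match vault.delete(op_ref) {
        Ok(()) => "orphan deleted".to_string(),
        Err(err) => format!("orphan NOT deleted ({err})"),
    };
    Err(format!("post-write validation failed. {cleanup}"))
}

/// In-memory fake that records delete calls.
pub struct FakeVault {
    pub stored: Option<String>,
    pub deleted: Vec<String>,
}

impl Vault for FakeVault {
    fn read(&self, _op_ref: &str) -> Result<String, String> {
        self.stored.clone().ok_or_else(|| "item not found".to_string())
    }
    fn delete(&mut self, op_ref: &str) -> Result<(), String> {
        self.deleted.push(op_ref.to_string());
        Ok(())
    }
}
```

Passing persistence in as a closure makes the ordering visible in the types: there is no code path that reaches `persist` without first passing the comparison.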
The cleanup attempt's outcome is folded into the bail message via the OrphanCleanup enum:
```rust
enum OrphanCleanup {
    Deleted,
    UnparseableRef { op: String },
    DeleteFailed { err: String, hint: String },
}
```

`OrphanCleanup` implements `Display`. The bail message templates use a single `". {cleanup}"` join; the enum's `Display` impl emits a self-contained sentence per variant:

- `Deleted`: "The just-created 1P item was deleted."
- `UnparseableRef`: "Orphan was NOT deleted: op-ref `<op>` did not parse into vault/item ids; remove the freshly-created item by hand from 1Password."
- `DeleteFailed`: "The just-created 1P item was NOT deleted (`<err>`); remove by hand: `<hint>`." where `<hint>` is the exact `op item delete <id> --vault <vault>` recovery command produced by `OpReferenceParts::manual_delete_hint`.
The only constructor is OrphanCleanup::run(op_writer, &op_ref, account). Parse failure short-circuits before any delete attempt, so DeleteFailed is structurally unreachable when UnparseableRef would also be.
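A `Display` impl along the lines described in this section might look like the sketch below. The sentences follow the per-variant wording quoted above, but the exact punctuation around the interpolated values and the "post-write validation failed" template used in the usage note are assumptions.

```rust
use std::fmt;

pub enum OrphanCleanup {
    Deleted,
    UnparseableRef { op: String },
    DeleteFailed { err: String, hint: String },
}

impl fmt::Display for OrphanCleanup {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Each arm is a complete sentence, so callers can join it with a
        // plain ". {cleanup}" and need no per-arm spacing convention.
        match self {
            Self::Deleted => write!(f, "The just-created 1P item was deleted."),
            Self::UnparseableRef { op } => write!(
                f,
                "Orphan was NOT deleted: op-ref {op} did not parse into vault/item ids; \
                 remove the freshly-created item by hand from 1Password."
            ),
            Self::DeleteFailed { err, hint } => write!(
                f,
                "The just-created 1P item was NOT deleted ({err}); remove by hand: {hint}."
            ),
        }
    }
}
```

A caller would then write something like `format!("post-write validation failed. {cleanup}")`, with the enum value carrying the outcome instead of a hand-built string.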
Config persistence ordering
Once validation has succeeded, the orchestrator opens a ConfigEditor and applies, in order:
- `set_workspace_auth_forward(workspace, Agent::Claude, AuthForwardMode::OAuthToken)`.
- `set_env_var(EnvScope::Workspace(ws), CLAUDE_OAUTH_TOKEN_ENV, EnvValue::OpRef(op_ref))`: the `op_ref` already carries its `account`, so no separate workspace-level account write is needed.
- `editor.save()`: single atomic write back to disk.
ConfigEditor preserves the surrounding TOML's comments and key ordering, so the operator's hand-edits survive a setup run.
The editor is opened after validation succeeds. A failure between item-create and editor-open leaves the 1P item live but no config wired; re-running setup is safe because the re-run creates a fresh item rather than mutating the orphan.
Mint without persist (console generate path)
Everything up to and including validation — capture, item create, post-write read-back, and the expiry stamp — lives in mint_token_value_with_runner, which returns the wired EnvValue (an OpRef for the op paths, a Plain literal for --plain) without opening ConfigEditor. run_setup_with_runner calls it and then does the persist block above, so the CLI path is mint + persist in one step. The console G generate flow instead calls the production mint_token_value entry, which mints + validates but writes no config: the minted value is staged into the open Edit-auth dialog (re-mounted with focus on Save by the same helpers the provide flow uses) and persisted only when the operator Saves through the editor's normal save path. The op-item create and read-back validation deliberately live in the mint-only function — that safety is not tied to the config write — and the expiry stamp is written there too (the token was minted, so issuance is known; the stamp is a harmless cache even if the operator later cancels the Save).
The CLI's config env unset and the TUI's auth panel both refuse to delete CLAUDE_CODE_OAUTH_TOKEN while auth_forward = "oauth_token" is active; the only supported clear path is jackin workspace claude-token revoke, which switches both keys atomically.
OAuthToken provisioning inside the container
When the launcher prepares the role-state directory for an agent whose effective auth_forward is oauth_token, provision_claude_auth (in crates/jackin-runtime/src/instance/auth.rs) takes a different shape from the other modes:
- `Sync`: copy host `~/.claude.json` to `account.json`, write host credentials to `credentials.json`, `forward_auth = true`.
- `OAuthToken`: remove any prior `credentials.json` (revokes forwarded creds from a previous Sync run) and write `{"hasCompletedOnboarding":true}` to `account.json`, `forward_auth = true`. Without that skeleton, the in-container Claude CLI shows its "Select login method" prompt even when `CLAUDE_CODE_OAUTH_TOKEN` is set in env.
- `ApiKey` / `Ignore`: wipe both files, `forward_auth = false`.
agent_mounts then bind-mounts account.json (and credentials.json when present) into the container under /jackin/claude/. The per-file exists() guard keeps a stale credentials.json out of the container if a prior provision-step removal failed silently — defence in depth against a credential file surviving the mode switch.
Expiry stamp cache
The orchestrator stamps a YYYY-MM-DD file under <cache_dir>/claude-token-expiry/<workspace> after a successful setup or rotate. The launch banner reads the stamp via expiry_days_for_launch and renders an "expires in N days" suffix on the auth-mode notice; the suffix's colour follows the days-remaining count (red ≤ 7, yellow ≤ 30, dim otherwise).
The function returns Result<Option<i64>> precisely so a malformed stamp surfaces a one-shot warning to the operator instead of silently degrading to "no expiry known". The launch site explicitly matches the Err arm rather than .ok().flatten()-ing it — collapsing the error variant defeats the design.
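The three-way split (absent stamp, valid stamp, malformed stamp) and the colour thresholds can be sketched without a date library. The names `expiry_days` and `suffix_colour` are hypothetical, the sketch assumes the stamp holds the expiry date, and `days_from_civil` is the well-known civil-date-to-day-count conversion rather than anything taken from the project.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let yoe = y - era * 400;
    let doy = (153 * (if m > 2 { m - 3 } else { m + 9 }) + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

fn parse_stamp(raw: &str) -> Result<(i64, i64, i64), String> {
    let bad = || format!("malformed expiry stamp: {raw:?}");
    let parts: Vec<&str> = raw.trim().split('-').collect();
    if parts.len() != 3 {
        return Err(bad());
    }
    let num = |s: &str| s.parse::<i64>().map_err(|_| bad());
    let (y, m, d) = (num(parts[0])?, num(parts[1])?, num(parts[2])?);
    if !(1..=12).contains(&m) || !(1..=31).contains(&d) {
        return Err(bad());
    }
    Ok((y, m, d))
}

/// `Ok(None)`: no stamp (normal). `Ok(Some(n))`: n days left. `Err`: corrupt stamp.
pub fn expiry_days(stamp: Option<&str>, today: (i64, i64, i64)) -> Result<Option<i64>, String> {
    let raw = match stamp {
        Some(raw) => raw,
        None => return Ok(None),
    };
    let (y, m, d) = parse_stamp(raw)?;
    Ok(Some(days_from_civil(y, m, d) - days_from_civil(today.0, today.1, today.2)))
}

/// Colour bucket for the banner suffix.
pub fn suffix_colour(days: i64) -> &'static str {
    if days <= 7 {
        "red"
    } else if days <= 30 {
        "yellow"
    } else {
        "dim"
    }
}
```

A launch site that matches on the `Err` arm can print its one-shot warning there; calling `.ok().flatten()` would fold the corrupt-stamp case into `None` and lose it.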
The --reuse setup path does not write a stamp. jackin did not mint the token in that flow, so the issuance date is unknown and any stamp would mislead the operator.
revoke removes the stamp so the launch banner stops showing a countdown for a workspace whose managed token source is gone.
Revoke
run_revoke(paths, config, workspace, delete_op_item):
- Read the prior `CLAUDE_CODE_OAUTH_TOKEN` slot from the workspace.
- If `delete_op_item == true`, the prior slot must hold a parseable `op://` reference. If it holds a literal token or an unparseable URI, bail with an explicit error: the operator asked for a 1P-side delete and a silent no-op would let the secret survive in the vault. The `--delete-op-item` flag is never honoured implicitly.
- Issue `op item delete <item> --vault <vault>` via the pinned `OpCli`.
- Open `ConfigEditor`, remove `CLAUDE_CODE_OAUTH_TOKEN` from the workspace's env block, set the workspace's Claude `auth_forward = ignore`, `save()`.
- Clear the cached expiry stamp.
item_delete failure propagates before editor.save runs, so the workspace config is unchanged. A re-run of revoke is safe once the underlying issue (auth, permission) is fixed.
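The hard-error rule for `--delete-op-item` can be sketched as a planning step that runs before any side effect. The `RevokePlan` enum, `plan_revoke`, and the simplified `parse_op_ref` (which only accepts `op://<vault>/<item>/<field...>`) are hypothetical names and shapes for illustration.

```rust
#[derive(Debug, PartialEq)]
pub enum RevokePlan {
    /// Only the workspace config is changed.
    ConfigOnly,
    /// Delete the 1P item first, then change the config.
    DeleteThenConfig { vault: String, item: String },
}

/// Minimal `op://vault/item/field` parser; returns the vault and item parts.
pub fn parse_op_ref(raw: &str) -> Option<(String, String)> {
    let rest = raw.strip_prefix("op://")?;
    let mut parts = rest.split('/');
    let vault = parts.next().filter(|s| !s.is_empty())?;
    let item = parts.next().filter(|s| !s.is_empty())?;
    parts.next().filter(|s| !s.is_empty())?; // a field segment must follow
    Some((vault.to_string(), item.to_string()))
}

pub fn plan_revoke(prior_slot: Option<&str>, delete_op_item: bool) -> Result<RevokePlan, String> {
    if !delete_op_item {
        return Ok(RevokePlan::ConfigOnly);
    }
    match prior_slot.and_then(parse_op_ref) {
        Some((vault, item)) => Ok(RevokePlan::DeleteThenConfig { vault, item }),
        // A literal token or unparseable URI must not degrade into a no-op.
        None => Err("--delete-op-item requires a parseable op:// reference in the slot".to_string()),
    }
}
```

Because the plan is computed up front, a later `item_delete` failure can propagate before the config editor is ever opened, which is what keeps a failed revoke safe to re-run.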
Rotate
rotate is setup + delete_prior_op_item:
- Read the prior slot for the scope being rotated (workspace-level, or the per-role slot when `--role` is passed) via `prior_token_slot`, so a role-scoped token wired by `setup --role` is found and its item is the one reused/deleted. A flag-supplied `--role` is validated against the workspace's allowed roles first. If the slot holds an `op://` reference, default `--vault` to the prior item's vault; without this, the documented `rotate <ws>` form would hard-error inside `create_op_item` after the operator completes the PTY token capture. See `vault_for_rotate` for the precedence rule.
- Run `run_setup` end-to-end. Validation, config persistence, and the new expiry stamp all complete first.
- `delete_prior_op_item(prior, &report.op_ref, account)`: parses the prior `op://`, then only deletes an item jackin created. It calls `item_tags` on the prior item and proceeds with `item_delete` only when the item carries the `JACKIN_TAG`. An item the operator adopted (`--reuse` or interactive edit-in-place) has no jackin tag and may hold the operator's other fields, so it is left untouched with a printed note; a tag-read error also fails safe (skip the delete) rather than risk destroying a shared item.
- If the delete fails, the rotate exits non-zero with a copy-pasteable `op item delete <id> --vault <vault>` recovery command. The new item is wired and live; the orphan needs hand-removal.
- The same-ref guard (`prior_ref.op == new_ref.op`) prevents rotate from deleting the new item it just created if a deeper bug ever causes them to match. The guard's `eprintln` tells the operator the situation is unexpected and to run `doctor` to verify.
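The delete-or-skip decision for the prior item can be written as a pure function, which makes the fail-safe arms easy to test. The `PriorItemAction` enum and `decide_prior_item` are hypothetical; the tag value is passed in as a parameter because this page does not state what `JACKIN_TAG` contains, and checking the same-ref guard before the tags is an assumed ordering.

```rust
#[derive(Debug, PartialEq)]
pub enum PriorItemAction {
    Delete,
    SkipSameRef,
    SkipNotJackinOwned,
    SkipTagReadFailed,
}

pub fn decide_prior_item(
    prior_ref: &str,
    new_ref: &str,
    tags: Result<Vec<String>, String>,
    jackin_tag: &str,
) -> PriorItemAction {
    // Never delete the item that was just created.
    if prior_ref == new_ref {
        return PriorItemAction::SkipSameRef;
    }
    match tags {
        // A failed tag read is treated as "not ours": skip rather than risk
        // destroying an item the operator shares with other fields.
        Err(_) => PriorItemAction::SkipTagReadFailed,
        Ok(tags) if tags.iter().any(|t| t == jackin_tag) => PriorItemAction::Delete,
        Ok(_) => PriorItemAction::SkipNotJackinOwned,
    }
}
```

Only one of the four outcomes leads to a delete, so every uncertain case resolves toward leaving the operator's vault untouched.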
Doctor
run_doctor is a structural / connectivity check — it does not contact Claude's API. The cheapest reliable way to confirm an OAuth token is valid upstream is to launch a workspace and observe the auth banner; doctor's job is to confirm the managed workspace env slot resolves without errors:
- Read the workspace's `CLAUDE_CODE_OAUTH_TOKEN`. Missing slot → actionable "run setup first" error.
- If the slot holds an `op://` reference, resolve through `op read`. The resolution failure is wrapped with the resolved path so the operator's terminal output matches what they see in 1P.
- SHA-256-prefix the resolved value and emit it in the report so the operator can confirm the slot points at the item they expect.
Doctor is the right tool to run after the launch banner says "API Error: 401 Unauthorized". If doctor returns Ok, the slot plumbs cleanly, which means the token itself is invalid upstream (rotated externally or manually revoked); if doctor returns Err, jackin's wiring is the problem.
Test injection seams
Every entry point that talks to the world (claude setup-token, op, the host filesystem) is split into a thin entry point and a _with_runner variant that takes injected runners:
| Entry point | `_with_runner` variant | Injected dependencies |
|---|---|---|
| `run_setup` (CLI: mint + persist) | `run_setup_with_runner` | `Option<&ClaudeProbe>`, capture closure, `&dyn OpRunner`, `&dyn OpWriteRunner` |
| `mint_token_value` (console: mint only) | `mint_token_value_with_runner` | `Option<&ClaudeProbe>`, capture closure, `&dyn OpRunner`, `&dyn OpWriteRunner` |
| `run_revoke` | `run_revoke_with_runner` | `&dyn OpWriteRunner` |
| `run_doctor` | `run_doctor_with_runner` | `&dyn OpRunner` |
| `delete_prior_op_item` (rotate cleanup) | `delete_prior_op_item_with_runner` | `&dyn OpWriteRunner` |
The unit tests inside mod tests of crates/jackin-env/src/token_setup.rs spawn no claude, no op, no real PTY. They use FakeOpReader and FakeOpWriter (records every item_create / item_delete call, optional with_failing_delete() to exercise the DeleteFailed arm). Pre-resolved ClaudeProbe fixtures stand in for the host CLI probe.
The post-write SHA-mismatch and read-failure paths are the most load-bearing safety net in the orchestrator and are covered with the strictest assertions: error-message substring, no on-disk config change, exactly one cleanup-delete fired against the canonical UUIDs, no expiry stamp written.
When extending the orchestrator, prefer to extend an existing _with_runner shim and add a fake-injected test rather than plumbing a new global. The runners and their fakes are the project's main lever for keeping the unit-test suite hermetic.
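The thin-entry-point plus `_with_runner` pattern, paired with a recording fake, can be sketched as follows. None of these names come from the codebase: `DeleteRunner`, `RecordingWriter`, and `cleanup_with_runner` are hypothetical stand-ins that only mirror the shape of the seams described above.

```rust
use std::cell::RefCell;

/// Hypothetical slice of a write-runner trait: just the delete call.
pub trait DeleteRunner {
    fn item_delete(&self, item_id: &str, vault_id: &str) -> Result<(), String>;
}

/// Fake that records every call and can be told to fail.
pub struct RecordingWriter {
    calls: RefCell<Vec<(String, String)>>,
    fail_delete: bool,
}

impl RecordingWriter {
    pub fn new() -> Self {
        Self { calls: RefCell::new(Vec::new()), fail_delete: false }
    }
    pub fn with_failing_delete(mut self) -> Self {
        self.fail_delete = true;
        self
    }
    pub fn delete_calls(&self) -> Vec<(String, String)> {
        self.calls.borrow().clone()
    }
}

impl DeleteRunner for RecordingWriter {
    fn item_delete(&self, item_id: &str, vault_id: &str) -> Result<(), String> {
        self.calls.borrow_mut().push((item_id.to_string(), vault_id.to_string()));
        if self.fail_delete {
            Err("simulated delete failure".to_string())
        } else {
            Ok(())
        }
    }
}

/// Logic under test takes the runner as a trait object, so tests never
/// spawn a real process.
pub fn cleanup_with_runner(writer: &dyn DeleteRunner, item: &str, vault: &str) -> String {
    match writer.item_delete(item, vault) {
        Ok(()) => "deleted".to_string(),
        Err(err) => format!("delete failed: {err}"),
    }
}
```

A test can then assert both on the returned outcome and on the exact calls the fake recorded, which is how "exactly one cleanup-delete fired" style assertions become possible.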
Hacks and load-bearing details, summarised
- PTY raw mode + RAII guard: `claude setup-token` needs byte-for-byte stdin and unbuffered stdout for DA1/XTVERSION responses; cooked mode breaks the contract. `RawModeGuard` restores cooked mode on `Drop` and surfaces a recovery hint on restore-failure.
- ANSI escape skipper inside the token: upstream splits the 108-char token across two visual rows with cursor-position CSIs. A naive control-stop redactor captures 79 chars and produces 401s at next launch.
- Stdin-only secret pass to `op item create -`: argv is visible via `/proc/<pid>/cmdline` to local unprivileged readers; stdin is not.
- `RawCreatedItem` deliberately omits `value`: the JSON echo from `op item create` carries the secret back. Discarding it at the deserialisation boundary is more robust than scrubbing later.
- Post-write SHA round-trip: vault-routing surprises (item landed in the wrong vault, upstream `op` schema drift) must never leave a wired-but-broken slot. The validation read + prefix comparison is the safety net.
- `OrphanCleanup` enum + `Display`: the cleanup-attempt outcome rides into the bail message as a structured value, not string concatenation. The "every arm starts with a leading space" implicit contract from the prior closure form is gone.
- Config persisted last: a partial failure earlier must never leave a wired-but-broken slot. The editor open + save sequence is gated behind the validation result.
- OAuthToken onboarding skeleton: without `{"hasCompletedOnboarding":true}` in `account.json`, the in-container Claude CLI ignores `CLAUDE_CODE_OAUTH_TOKEN` and shows the login wizard. The skeleton is jackin-managed and bind-mounted; it is the only file mounted into the container under OAuth-token mode.
- `expiry_days_for_launch` returns `Result<Option<i64>>`: splitting "absent stamp" (the normal case) from "malformed stamp" (the should-warn case) so a corrupt cache surfaces once on the next launch instead of silently disappearing the countdown.
- Revoke `--delete-op-item` is hard-error on literals: the operator opted into a 1P-side delete; a silent fall-through when the managed env slot can't be deleted would let the secret survive.
- Same-ref guard in rotate: protects against deleting the freshly-created item if the new and prior `op://` references ever match.
When changing any of the above, update this page in the same PR.