jackin'
RoadmapCodebase health

Workspace-Native PR Verification

Status: Open — current implementation target is the jackin-dev local PR sync CLI. The older console-native review-mode design remains useful product direction, but it is no longer part of the near-term implementation. This page is the source of truth for the immediate work: ship a developer binary that prepares a pull request checkout, isolated configuration, isolated runtime state, local binaries, and an environment file in one reclaimable bundle so PR bodies stop carrying copy-pasted setup blocks.

Problem

jackin' pull requests carry a hand-written Verify-locally block: clone or update the repo, fetch the PR branch, build the right binaries, point a run at isolated state, run a smoke command. The block is copy-paste boilerplate, drifts between PRs, and every review re-derives the same steps by hand. With one human contributor and no second reviewer, this manual loop runs on every PR. And the loop is not unique to jackin': every project developed inside a jackin' workspace has the same "check out this PR, build what it changed, run the smoke that proves it" ritual. jackin' verifying its own PRs is just the first case — the design is generic, and the per-project recipe that drives it lives in each project, not in jackin'.

Three things make the console the right home for this, anchored to the instance the operator is already working in:

  • An instance already knows its pull request. When the operator is working on a PR inside an instance, jackin-capsule — the in-container daemon — already tracks the working directory's branch and resolves it to the matching open pull request, with title and check status. The console can show those details the moment the instance is selected, so "verify this" is a natural next step from where the operator already is.
  • A PR sometimes needs a companion role branch. A change in the jackin' repo can require a matching change in a role repo. Verifying that PR means running it with a specific role pinned to a specific branch — something jackin' already knows how to do.
  • Testing jackin' itself needs clean state. Verifying a jackin' PR by running the freshly built binary must not read or mutate the operator's real configuration and state, and must not disturb the live instance they are working in. jackin' can already redirect its configuration and state to alternate locations, so an isolated verification environment is a matter of wiring, not new machinery.

Scope

First focus: jackin-dev pr sync <PR_NUMBER> and the companion helper commands. Nothing else. The command is Homebrew-installed and owns everything outside the repository checkout — choosing the PR test root, cloning or refreshing the repository, fetching the PR head, checking out a disposable local branch, preparing isolated configuration and state, building the local binary, writing the environment file, and reporting the next shell commands. Console-native review mode, recipe execution, documentation preview automation, companion role pinning, browsing pull requests from a workspace, branch targets, role-repo PRs, merge preview, coordinated multi-repo verification, and generic scripted command execution are deferred.

Phase 0 — jackin-dev local PR sync

The first implementation creates a dedicated crates/jackin-dev crate that builds a jackin-dev binary. This binary is for contributor and maintainer workflows around testing jackin' locally. It is not a replacement for the product CLI; it is the host-side setup tool that turns the current PR-body shell recipe into a stable command. The crate starts at version 0.1.0, and every future change that adds or changes shipped jackin-dev functionality bumps that crate version.

The default checkout block becomes:

jackin-dev pr sync <PR_NUMBER>
cd "$(jackin-dev pr path <PR_NUMBER>)/jackin"
source "$(jackin-dev pr path <PR_NUMBER>)/env.sh"
which jackin

jackin-dev pr sync <PR_NUMBER> replaces the repeated shell sequence that exported JACKIN_PR_TEST_DIR, created the directory, cloned the repository, fetched the pull request ref, checked out pr-<PR_NUMBER>, and ran cargo xtask pr prepare. It also fixes the current state-location bug: no generated PR verification state belongs directly under the operator's home directory or live config directory.

Each pull request gets one self-contained bundle:

~/Projects/jackin-project/test/pr-<PR_NUMBER>/
  jackin/
  env.sh
  state/
    config/
    home/

env.sh exports PATH so the freshly built checkout binary wins, JACKIN_CONFIG_DIR pointing at state/config, and JACKIN_HOME_DIR pointing at state/home. That makes the whole verification root removable with one directory delete and prevents stale ~/.config/jackin-pr-* or ~/.jackin-pr-* folders from accumulating.

sync defaults to the safe real-world profile: copy the operator's real jackin' configuration into the bundle, replace the prior copied config on each sync, and keep runtime state empty and bundle-local. If no real configuration exists, it creates an empty config directory. Clean-room verification stays available as an explicit --config blank override, but normal PR bodies stay flagless because --config copy is the default.

sync also decides expensive prep from the pull request diff. If the pull request touches crates/jackin-capsule/ or the shared protocol surface the capsule consumes, it builds and exports the PR's jackin-capsule binary into env.sh. If the pull request touches construct image inputs, it builds the local construct image and exports JACKIN_CONSTRUCT_IMAGE. Routine PR instructions stay flagless; the command explains what it auto-enabled and why.

The first command set is:

jackin-dev pr sync <PR_NUMBER>
jackin-dev pr clean <PR_NUMBER>
jackin-dev pr env <PR_NUMBER>
jackin-dev pr path <PR_NUMBER>
jackin-dev pr status <PR_NUMBER>

clean removes the PR bundle. env prints the shell commands needed to enter the synced checkout and source its environment. path prints the bundle root for scripts and docs snippets. status reports whether the local checkout exists, which head SHA it has, which PR head SHA GitHub currently reports, whether the bundle is fresh or stale, and whether env.sh plus the state directories exist.

Responsibility stays split cleanly: jackin-dev owns outside-the-repo orchestration, while cargo xtask remains for tasks that only make sense once a checkout already exists, such as construct build/publish helpers, docs/sidebar maintenance, schema checks, PTY fixture extraction, and PR body skeleton generation.

Packaging is part of this work. jackin-dev is distributed through the jackin-project/homebrew-tap repository by a separate workflow named jackin-dev, dedicated to this binary, not by jackin's normal stable release workflow or preview workflow. When source inputs for this crate change, that dedicated workflow builds a new jackin-dev artifact for the crate's current version and updates the Homebrew tap formula so operators can install or upgrade it from the tap. If a PR does not touch jackin-dev inputs, the tap entry does not need to move. Operators reviewing jackin' PRs should be able to install or update from the tap and immediately run jackin-dev pr sync without a prior source checkout.

Deferred console-native flow

The rest of this page records the richer product direction for later. It should not be treated as part of the jackin-dev implementation PR unless that work is explicitly brought back into scope.

What you'll see and do later

The console already shows a left sidebar of workspaces, each expandable to its instances, with a right detail block for whatever row is selected. Verification reuses that layout — it is not a separate screen — and adds two things: pull request details in the detail block, and a review mode for the sidebar.

Step 1 — Select an instance that's on a PR

jackin-capsule reports the branch each instance's working directory is on and resolves it to the matching open pull request. Selecting such an instance shows its pull request details in the right block: number, title, base and head branches, open/draft/merged state, and CI check status. An instance not on a PR branch simply shows no pull-request section — the review action is offered only when there is a PR to review.

┌─ jackin ───────────────────────┬─ the-architect · running ──────────────────┐
│ ▾ jackin    ~/Projects/jackin   │ branch   feature/jackin-dev-roadmap          │
│   ▸ ● the-architect  running    │                                              │
│     ○ agent-smith    stopped    │ Pull request                                 │
│ ▸ agent-smith                   │   #480  docs(roadmap): workspace-native PR…  │
│ ▸ docs-site                     │   main ← feature/jackin-dev-roadmap          │
│                                 │   OPEN · checks passing · +135 −173          │
├─────────────────────────────────┴──────────────────────────────────────────┤
│ enter attach · R review pull request · s shell · q quit                       │
└──────────────────────────────────────────────────────────────────────────┘

Step 2 — Switch to PR review

Pressing the review key on a PR-bearing instance switches the left sidebar into review mode: the workspace/instance tree is replaced by the list of verification commands available for that PR, with the commands the PR's changes touch listed first. The right block becomes the output of the running or selected command, plus the PR context. A key returns to normal navigation.

┌─ Review · #480 docs(roadmap)… ─────────────┬─ docs-build · bun run build ──────┐
│ Relevant to this PR (docs)                  │ $ bun run build                    │
│  ▸ docs-build   [build]  ● passed           │ ✓ 142 pages built in 3.1s          │
│    docs         serve    ○ idle             │ ✓ link check ok                    │
│                                             │                                    │
│ Other commands                              │ ── verifying against ──            │
│    cli          [build]  ○ idle             │ isolated checkout of #480 head     │
│    console               ○ idle             │ config: your real config (copy)    │
│    test                  ○ idle             │ role: none · cache: shared         │
├──────────────────────────────────────────────┴───────────────────────────────┤
│ enter run · a all-relevant · A all · r refresh · s shell · o open · c settings · esc back │
└──────────────────────────────────────────────────────────────────────────────┘

Step 3 — Run and verify

Select a command and run it: build, test, start the docs server, or any command the repo's recipe defines. The operator runs the focused command, every relevant command in order, or all of them; opens a shell in the checkout to poke around or run something by hand; refreshes the PR; opens any served pages (below); and adjusts the verification settings (isolation, companion role, build cache — all with sensible defaults). Run-state badges show idle, running, passed, or failed; finishing shows a pass/fail summary. Commands the diff did not touch are still listed (never silently hidden), just not pre-selected.

Shell into the checkout. A shell action drops the operator into an interactive shell at the root of the isolated checkout — the same throwaway worktree the recipe commands run in, and with the same environment those commands get: the redirected configuration and state, the freshly built binary ahead on PATH (so jackin resolves to the binary under test), the companion-role selection, and every other variable the verification defines. So anything run by hand — git log, cargo build, jackin console, a one-off script — resolves the same binary, sees the same config and state, and behaves exactly like a recipe command, and still cannot touch the operator's real setup. On exit the terminal returns to review mode.

Refresh to rebuild. Refresh re-fetches the PR's latest pushed commits and hard-resets the local checkout to the new head — the right response to a force-pushed branch — so the next build runs against the updated code. The PR details (head commit, checks) update with it. Re-running the build set after a refresh is the local rebuild loop: pull the new commits in, rebuild, re-verify, without leaving the surface.

The live instance is never touched. Selecting the instance only identifies the pull request. Running review checks that PR's head out into a separate, throwaway host-side checkout and runs the recipe there — including the shell — against an isolated configuration and state. The operator's running session keeps going undisturbed, and jackin' can verify a PR against its own code without risking the real setup. Because the review surface is rich, debug mode does not paint over it: the full firehose of every command and its output is captured to a diagnostics run file, per Debug output never reaches a rich full-screen TUI. To triage a failed command, the operator shares the run id rather than pasting scrollback.

Deferred repo recipe

What jackin' can build, run, and serve for a repo is declared in a verification recipe committed at the root of the project repo, so it versions with the code — a PR that changes how the project is built or verified updates the recipe in the same diff. There is one command shape; optional fields select behaviour rather than forking into separate kinds:

  • mark a command as part of the build set (run in order, stop on first failure);
  • give it path globs, so jackin' can tell whether a PR's changes make it relevant;
  • mark it as a long-running server that opens the PR's changed docs pages (below).

The review actions are views over this one list: "build" runs every build-set command; "run" runs any command by name; trailing arguments are appended verbatim.

# jackin.tasks.toml — verification recipe for this repo
version = 1

[[task]]
name = "cli"
run = "cargo build --bin jackin"
build = true                         # part of the build set
paths = ["src/**", "Cargo.*"]        # relevance: lit up by code changes

[[task]]
name = "docs-build"
run = "bun run build"
cwd = "docs"                         # relative to the checkout root
build = true
paths = ["docs/**"]

[[task]]
name = "console"
run = "jackin console"               # resolves to the freshly built binary, not the installed one

[[task]]
name = "test"
run = "cargo nextest run"

[[task]]
name = "docs"
run = "bun run dev"                  # long-running; kept up until the operator quits
cwd = "docs"
# marking a command as a server tells jackin' to wait for it and open the PR's changed pages:
serve = { ready = "http://localhost:3000/", route_from = { dir = "docs/content/docs", strip_suffix = ".mdx", index = "index", base = "/" } }

Diff-driven relevance. jackin' reads the PR's changed file list once and matches it against each command's path globs. The matching commands are the PR's relevant set — surfaced first and run by "all relevant". Globs choose which command is relevant (the code build vs the docs build vs the smoke), not which source files a build recompiles — the compiler and the docs bundler already detect file-level changes inside a command. A manifest or lockfile edit lights up every code command; a docs edit lights up the docs build and the server. Non-matching commands are shown but not auto-selected; "all" runs everything; an empty or unknown change set falls back to treating everything as relevant. Nothing is silently skipped — what was de-selected, and why, is logged.

When jackin' verifies itself, a command that launches jackin' (a console smoke, a load) must run the freshly built binary, not the operator's installed one; the runner arranges that, and the launched binary's own runs are sandboxed by the chosen isolation profile.

This recipe is self-contained — each command carries its own invocation, so a repo is verifiable without any other tooling. It is related to other task-shaped surfaces on the roadmap (operator hotkeys, a future task-source abstraction) but deliberately stays its own thing: a maintainer verification recipe committed in the repo under test, with a different audience and different fields from an operator's in-console hotkeys. A shared resolver is worth revisiting only once task source abstraction defines a concrete shape. It is not a versioned jackin' schema: it lives in the repo under test, jackin' is pre-release, and a breaking shape change simply fails to parse.

Isolation: verifying jackin' without risking your setup

jackin' keeps two things separate: its configuration (workspaces, mounts, auth, trust) and its state (running instances, caches, role checkouts, agent homes). A verification run can redirect both to throwaway locations, chosen in the review-mode settings:

  • Against my real config (default) — verify against the operator's real configuration, with no state. jackin' takes a copy of the real configuration as read-only input and starts with empty state. Every write the built binary makes — schema migrations, "last role", trust-on-first-use, new instance state — lands in the throwaway copy and is discarded when the verification's disk is reclaimed. The real configuration is never written; the real state is never read.
  • Clean room — both configuration and state start empty. jackin' from zero, for first-run behaviour and defaults.

The configuration is always copied, never referenced: a PR that bumps a configuration or workspace schema migrates files in place on load, and pointing a verification run at the live configuration would let it rewrite the operator's real workspace files to the PR's schema version. The copy is what makes a schema-bumping PR safe to verify.

Deferred companion role pinning

A PR that pairs a CLI change with a role change needs jackin's inner launches to use a specific role at a specific branch. The companion-role setting records that pairing: any command that launches jackin' is invoked with the chosen role and, when a branch is given, that role checked out at that branch. With no companion role chosen, commands use whatever the isolated configuration would pick by default.

Deferred documentation preview

Some PRs change documentation, and the review need is "spin up the changed docs locally and open the pages this PR touched" — the PR template's documentation walk, automated. A command marked as a server is run like any other, but jackin' treats it specially: it starts the server, waits until it responds, then derives the URLs for the pages this PR changed (from the same changed-file list used for relevance) and opens each one. If none match, it opens the site root. The server stays up until the operator quits, then is shut down cleanly. The server command, port, and URL mapping are declared in the recipe, never hardcoded in jackin' — so a change of docs engine is a one-line edit to the recipe, not a jackin' code change. (This site's own move between docs frameworks changed exactly those values — the dev-server command and port — which is what the recipe absorbs.)

What jackin' keeps per pull request

For each PR it verifies, jackin' keeps a self-contained, throwaway bundle, isolated from the operator's working repos, the live instance, and the real setup:

  • a checkout of the PR at its head commit;
  • the isolated configuration (a copy of the real one, or empty) and the empty state home for the run;
  • enough metadata (number, title, branch, base, head commit, last-refresh time, state, chosen isolation) to show the verification's freshness without re-hitting the network.

Refreshing re-fetches the PR head and resets the checkout — the right response to a force-pushed branch. Reclaiming drops a PR's bundle and reports the disk freed. A single PR cannot be verified twice at once. The exact on-disk location is intentionally left to the implementation, since how jackin' stores console state is being reworked; the guarantee to preserve is that the bundle is isolated, reclaimable, and never mixed with the operator's real config, state, repos, or running instances.

Host-side guarantees

Per jackin's rule that it must never mutate the host machine silently, verification is operator-invoked, and its writes stay confined:

  • it writes only inside its own throwaway verification bundles and the role checkout cache — never the operator's working repos, never the live instance, never the real configuration or state;
  • the real configuration is copied, never written, so even a schema-bumping PR cannot migrate the operator's live workspace files;
  • detecting the instance's pull request, reading the PR's diff, and reading the host's GitHub auth are all read-only — verification never rewrites host git config or auth;
  • the saved workspace is never modified.

Risks, edge cases, failure modes

  • Instance not on a PR branch — no pull-request section in the detail block and no review action; the common "just working, no PR yet" case stays uncluttered.
  • PR lookup slow or failing — the detail block shows the branch with a loading or "no open PR" note; it never blocks instance selection. (jackin-capsule already caches the lookup per branch.)
  • GitHub CLI missing or unauthenticated — detection degrades to "branch only"; review, when entered, shows a one-line install/login hint instead of failing opaquely.
  • Which binary a command runs — when verifying jackin' itself, commands must run the freshly built binary, or a console smoke silently tests the wrong one.
  • Build cache — shared is fast but can in rare cases bleed state between PRs; the Fresh setting forces a cold build, and which was used is logged.
  • Fork, force-push, closed/merged — checkout uses the PR head ref directly; refresh resets hard; a closed/merged PR is flagged but still reviewable for post-merge inspection.
  • No recipe in the repo — review mode opens with a one-line "no recipe in this repo" notice and no runnable commands.
  • Server lifecycle — a server command is a child that must be torn down on quit, with its whole process group signalled so workers don't orphan, and a readiness timeout so a server that never binds fails loudly instead of hanging.
  • Shell / interactive child — the checkout shell hands the terminal over while it runs and must restore the review surface cleanly on exit; it inherits the full command environment (redirected configuration and state, built binary on PATH, companion role, any defined variables) so a hand-run command matches a recipe command exactly and defaults to the isolated bundle, not the operator's real setup.
  • Doneness — review shows a pass/fail summary and badges a failed command . A server command is interactive, not pass/fail, so it does not count toward the verdict.

Deferred

Explicitly out of the jackin-dev pr sync implementation, recorded so the design stays coherent when later slices land:

  • Console-native review mode. The instance-anchored review surface, sidebar mode switch, command badges, shell handoff from the TUI, and rich command output view are deferred.
  • Repo recipe execution. Reading jackin.tasks.toml, resolving build sets, running named commands, and diff-driven command relevance are deferred.
  • Documentation preview automation. Starting the docs server, waiting for readiness, deriving changed page URLs, opening local docs pages, and cleaning up server processes are deferred.
  • Companion role pinning. Pinning a role and optional branch while verifying a CLI PR is deferred.
  • Browsing a repo's PRs with no instance. jackin-dev pr sync starts from an explicit PR number. Picking any open PR on a workspace's repo is a later slice.
  • Scripted recipe execution beyond sync. jackin-dev pr sync is the first non-interactive entry, but it only prepares the checkout bundle. Running arbitrary verification recipes non-interactively for CI or scripts remains deferred until the command runner core exists.
  • Branches as a standalone target. PRs only for now.
  • Role-repo PRs as a standalone target. Today roles enter only as the companion pin; verifying a role repo's own PR end-to-end is the same machinery and can be added later.
  • Merge preview — verify the post-merge-into-base state rather than the PR head.
  • Coordinated multi-repo verification — one verify spanning several of the workspace's repos (the PR template's "Related pull requests").
  • In-container agent review — launch a container against the PR checkout and let an agent review in the multi-project context. The host-side model covers the daily "does this PR build and run" need; agent-assisted review is additive.

Relationship to existing roadmap items

  • Console agent session control — review is anchored to an instance and shares the console's instance-selection layout; it reads the instance's pull-request context but runs its commands host-side, not inside the instance.
  • Custom operator tools and task source abstraction — related task surfaces, but the verification recipe stays a distinct repo-committed thing; a shared resolver is worth revisiting only once task source abstraction defines a concrete shape.
  • Operator handler system — the cross-platform "open this URL" capability the documentation preview needs; resolve it there once rather than in more than one place.

If a future implementation needs to persist verification settings in the workspace configuration, that touches a versioned schema and pulls in jackin's schema-migration requirements. The design keeps verify state out of the workspace configuration (repo-committed recipe plus throwaway per-PR bundles) so the first slices need no schema bump.

Prior art

No existing tool covers the host-side-verify + companion-role-branch + isolated-jackin'-state combination, reached from the instance the operator is working in; the closest pieces would have to be glued by hand:

  • gh pr checkout — checkout by number; no build/run recipe, no isolation.
  • gh-worktree — checks a PR out into a worktree; worktree only.
  • HighReview — local PR review via worktree with IDE/AI analysis; diff-review focus, no configurable build/run tasks.
  • gh-review — terminal diff/comment/approve UI; review surface, not build/verify.
  • Nx / Turborepo / Bazel / Buck2 — affected-target selective rebuild; prior art for diff-driven relevance, not PR-scoped review.
  • just / mise — repo-committed task runners; tasks but no PR checkout or isolation.

The near-term novel piece is a shipped developer binary that prepares a host-side PR checkout with isolated jackin' configuration and state so jackin' can verify itself safely. The later console-native flow adds instance anchoring, recipe execution, documentation preview, and companion-role-branch pinning.

Implementation now

Add crates/jackin-dev at version 0.1.0; ship its jackin-dev binary through the Homebrew tap with a source-filtered workflow named jackin-dev; implement jackin-dev pr sync, clean, env, path, and status; replace PR-body checkout boilerplate with the sync flow; store checkout, copied config, empty state home, and env file under one isolated, reclaimable PR bundle; auto-enable capsule and construct prep from the pull request diff.

Future phases

  • Phase 1 — PR details in the instance view. Surface the pull request jackin-capsule already detects for an instance's branch in the right detail block: number, title, base/head, state, checks.
  • Phase 2 — Review mode + run-all baseline. The sidebar mode switch on a PR-bearing instance; the command list; check the PR head out host-side; run the whole build set against the freshly built binary, with a pass/fail summary.
  • Phase 3 — In-surface verbs + relevance (the diff-driven spine). Build / run-named / refresh / shell-into-checkout from review mode; refresh re-fetches the PR head and resets the checkout so a rebuild picks up new commits; map the PR's changed files to the relevant commands and list them first; suggestion-not-skip; de-selection logging.
  • Phase 4 — Documentation preview. The server command: start it, wait for ready, derive and open the PR's changed pages, tear down cleanly on quit.
  • Phase 5 — Isolation profiles. The review-mode setting "against my real config" vs "clean room".
  • Phase 6 — Companion role pinning. The review-mode setting of a role and optional branch, threaded into inner jackin' launches.
  • Later — the items under "Deferred".

On this page