# CI/CD Speed Roadmap (https://jackin.tailrocks.com/reference/roadmap/ci-speed-roadmap/)



**Status**: Implemented baseline -- dual-runner parity proof accepted; implementation roadmap prioritized for wall-clock impact, with lint-lane de-serialization, package-matrix collapse, preview/CI overlap, and a warm persistent-runner lane leading the order.

Phase 9 steps 1, 2, and 4 are implemented in this branch: `check-all-features` is removed, `clippy` owns the all-features compile gate while `check-default` owns the default-feature compile gate, and the measured GitHub-hosted CI required gate moved from about 12m33s on run `27937532691` to about 10m16s on run `27961783994`. The Docker E2E lane now lives inside the reusable nextest workflow, runs beside the package shards, and is path-routed so unrelated Rust package changes do not force the real Docker lane.

Phase 1 steps 1 through 8 are also implemented: `jdx/mise-action` now pins the mise binary version with a stable cache prefix, docs CI uses `bun ci`, CI cargo tools and composite-action tools install through mise's GitHub release backend instead of the source-compiling cargo backend, scheduled hygiene records GitHub Actions cache usage, shared <RepoFile path="mise.toml" /> changes are routed by the specific tool entries they affect instead of always firing Rust or preview builds, and non-building policy/audit jobs no longer restore Rust target caches.

Phase 2 now uses the middle-ground package shape from this roadmap: named heavy-crate nextest jobs plus one small-crate bucket instead of the old 19-way matrix; an archive fan-out experiment built successfully but failed an existing checkout-dependent test, so the current package matrix remains the safer and better-attributed shape for now. A same-run prepared-workspace artifact was measured and removed: run `27975505075` spent up to 4m09s downloading the prepared nextest workspace in one fan-out job, while the cache-only follow-up run `27976780904` completed the GitHub-hosted `CI` required gate in about 7m51s with no dependency download or compile markers. Run `27980183186` stayed in the same band at about 8m07s to `ci-required`: log scanning found no crates.io index update, crate download, or third-party dependency compile markers, but `cargo nextest prepare` still spent 2m40s compiling jackin' workspace crates, so the next win is reducing workspace compile/setup cost rather than restoring another large artifact.

Hosted-runner GHA `sccache` was measured and rejected after later runs stayed at 0% hits with write errors; GitHub-hosted jobs now rely on `rust-cache`, the shared Cargo registry cache, and Buildx caches, while Velnor keeps local-disk `sccache` as an opt-in warm-runner accelerator. Compile-heavy Rust jobs now use semantic workspace-aware v2 `rust-cache` keys, shared by dependency/build shape rather than job name, and Cargo dependency jobs restore a separate shared Cargo registry/index/git DB cache with the same dependency-key shape; exact registry hits are verified with `cargo fetch --locked --offline`, while only true first-time misses may fetch from the network. The shared registry cache now also keys and fetches fuzz lockfiles, so fuzz lanes can run Cargo in offline mode after cache population.

Measurement also removed cache work where it made the run slower instead of warmer: workflow-only `actionlint` no longer restores Cargo state, `cargo fmt` no longer restores registry or target caches because it performs no dependency resolution or compile, policy/audit lanes keep registry/advisory caches without restoring target archives that they cannot use, and nextest no longer serializes package and Docker lanes behind a cold target-cache seeding job. `cargo audit` now disables yanked-crate index checks in PR CI and uses `--no-fetch --stale` on advisory-cache hits, so hot audit lanes do not update the crates.io index or refetch RustSec data.

Phase 4 steps 1, 2, 3, 4, and 5 are implemented: preview archive builds now start on `push` to `main`, publish waits for successful `CI` on the exact source SHA, preview owns the release-profile `jackin` archive build that packages `jackin-role`, source-path filtering is preserved, and the final SHA ancestry check still runs before mutating the rolling preview.

Phase 5 steps 1, 2, 5, and 6 are implemented: manual Buildx GHA cache refs carry `ghtoken`/`repository`, the old cache-mount experiment is superseded, construct builds now restore or build the pinned `shellfirm` binary outside Docker before copying it into the context, x64 construct jobs install `shellfirm` from the upstream GitHub release via mise instead of compiling it from crates.io, and the Dockerfile's from-source Rust compile is gone.

Phase 6 step 3 is implemented: Codebook installs through the prebuilt mise GitHub backend, the old Rust/Cargo cache work is removed from docs spell-check jobs, and the Docs timing summary captures the warm result.

Phase 7 steps 1-5 are implemented: preview and release archive jobs now share the same composite build/package/sign/upload action, release archive builds run beside the release test workflow, final archives use deterministic tar/gzip metadata settings, all release targets build on Linux through `cargo-zigbuild`, archive builds isolate Zig global/local caches per job to avoid persistent-runner compiler-rt write races, preview and release archive jobs use the same target-scoped cache-key design, and scheduled hygiene keeps a native macOS smoke job.

Phase 0 steps 1, 2, 4, and 5 are implemented: `CI`, `Docs`, `Construct Image`, `Publish Homebrew Preview`, and `Release` now write timing/cache summaries, short-retention Cargo timing artifacts exist for `clippy`, `check-default`, nextest prepare, preview builds, release builds, and `jackin-dev` builds, and the shared summary reports time to first red signal plus each workflow's target completion metric. The shared summary also now totals setup/cache/artifact/Docker/Cargo step time, counts and samples cache misses, dependency downloads, third-party dependency compiles, source-tool compiles, `sccache` issues, prepared-workspace artifact restores, and Velnor `job-log` artifacts so every run exposes the markers that decide whether another cache/routing iteration is required; its dependency marker scanner now tolerates ANSI-colored Cargo output. The timeout guardrail is implemented for CI, Docs, construct, preview, release, reusable nextest jobs, Renovate, scheduled hygiene jobs, and `jackin-dev`.

Phase 8 lane-awareness has reached usable parity: `construct-e2e-image`, `Construct Image` build and rehearsal jobs, preview archive builds, release test/archive builds, `jackin-dev` archive builds, package nextest shards, and Docker E2E all run on the same runner-lane matrix when `lanes: both` is selected; construct, preview, release, and `jackin-dev` artifact handoffs are lane-scoped so publish jobs consume GitHub-hosted artifacts while Velnor proves parity; native construct `arm64` remains GitHub-hosted until Velnor has an arm path; local-disk `sccache` is enabled only on the opt-in Velnor lane. The optional Velnor persistent `target/` store is now scoped by trust scope, repository, workflow, and job class upstream, and Velnor job containers now receive daemon-level CPU and memory caps, so the warm lane can be optimized without becoming the default trust boundary.

Renovate no longer runs on every push to `main`; it stays scheduled and available through `workflow_dispatch`.

**Latest measurement**: Runs `27984408161` and `27985175824` rejected hosted-runner GHA `sccache`: both stayed at 0% hits with cache write errors in `check-default` and nextest prepare. Run `27986139438` then proved that simply dropping `RUSTC_WRAPPER` from GitHub-hosted jobs changes the default `rust-cache` environment hash and cold-starts the target cache (`check-default` rose to 1m42s and nextest prepare to 4m31s). Runs `27986624424` and `27987160427` proved that pinning `rust-cache` `env-vars` while leaving GitHub compile-heavy jobs unwrapped is still not fastest: caches hit and dependency downloads disappeared, but `check-default` stayed at 1m55s/1m47s and nextest prepare stayed at 2m30s/2m07s. Run `27987545345` restored the wrapper while disabling the hosted GHA compiler-cache backend and moved the hot GitHub path back into the fast band: `check-default` 57s, `clippy` 1m01s, construct E2E image 1m55s, and nextest prepare 2m19s. Run `27987976766` kept that shape and stayed in the fast band after the archive-cache alignment work: `ci-required` finished in about 7m42s, `check-default` in 42s, `clippy` in 1m13s, nextest prepare in 2m15s, and Docker E2E in 3m14s.

Run `27990579448` attempt 2 proved the same warm SHA path after merging `main`: no crates.io index update, crate download, third-party compile, cache-miss, or prepared-workspace artifact markers appeared; `check-default` took 42s, `clippy` 50s, nextest prepare 2m14s, and `ci-required` finished in about 7m24s. Its step timings exposed apparent nextest fan-out setup waste, but run `27991767807` rejected the attempted removal of the separate Cargo registry restore: package lanes and Docker E2E failed under `--offline` after restoring only the shared `rust-cache` archive. Run `27992734300` verifies the restored baseline: `ci-required` completed successfully, no dependency download, third-party compile, prepared-workspace, or `sccache` issue markers appeared, `check-default` took 47s, `clippy` 1m04s, construct E2E image 1m31s, nextest prepare 2m18s, Docker E2E 3m46s, and the longest real steps were Docker E2E test execution, nextest binary build, package tests, and construct image build. The explicit Cargo registry cache is therefore required for fan-out correctness until another measured design proves otherwise. This also confirms that the old prepared-workspace artifact handoff, not the explicit registry restore, was the real large regression: run `27969864046` spent 6m04s in nextest prepare, then package/Docker fan-out jobs spent up to 1m50s downloading plus 39s restoring the prepared workspace before tests could run.

Run `27993146785` kept the same green baseline with no dependency download, third-party compile, or prepared-workspace markers. Run `27993742007` proved that policy/audit lanes remain green without target-cache restores, but also exposed ANSI-colored crates.io index update markers from `cargo audit` and `cargo fuzz` that the old scanner missed; the follow-up fixes audit to skip yanked index checks, makes the fuzz lane prove offline registry availability before running, and strengthens the marker scanner. Rerun `27994748751` then proved the hot path after fuzz-registry caching: the full log scan returned `NO_MARKERS` for crates.io index updates, crate downloads, plain third-party dependency compiles, prepared-workspace restores, source-tool compiles, and `sccache` issues; `check-default` took 47s, `clippy` 56s, `audit` 23s, `fuzz` 1m39s, nextest prepare 2m20s, package lanes 59s-1m46s, Docker E2E 2m51s, and `ci-required` finished about 7m15s after the run started.

Run `27996372045` kept the hot path clean after extending timing summaries to the remaining workflows: full log scan again returned `NO_MARKERS`, `check-default` took 53s, `clippy` 57s, construct E2E image 1m41s, nextest prepare 2m27s, package lanes 1m08s-1m53s, Docker E2E 2m57s, and `ci-required` finished about 7m26s after the run started. The same run evaluated the baked `jackin-ci` image idea against current hot-run step costs: cache restore consumed 414s aggregate across 54 steps, Cargo/test work 389s across 15 steps, Docker work 172s across 3 steps, while tool setup was only 102s aggregate across 40 steps with a 5s per-job high-water mark in the nextest fan-out. Run `27996815419` then compared the corrected branch against latest `main` run `27972283053`: `main` took about 14m31s and still logged crates.io index updates plus crate downloads in check, validator, nextest, and Docker lanes, while the branch took about 7m42s and the full marker scan returned `NO_MARKERS`. The remaining avoidable hot-path cost was `cargo nextest prepare`: it spent 95s building all nextest binaries even though the exact shared `rust-cache` key was already warm and the fan-out jobs restored the same cache before running their real tests. The nextest prepare job therefore now skips that seeding build on exact cache hits and only builds when the semantic cache is cold. Run `27997272508` proved the hot-cache skip: nextest prepare fell to 53s, the `Build nextest binaries` step was skipped, the full log scan again returned `NO_MARKERS`, Docker E2E took 2m51s, and `ci-required` finished about 5m39s after the run started.

A `lanes=both` parity run, `27997911210`, then exposed a cold-cache correctness bug instead of a Velnor-only issue: `cargo clippy (GitHub)` missed the shared Cargo registry cache, then the cache-population step inherited `CARGO_NET_OFFLINE=true` and could not fetch `anyhow`; the shared registry cache action now forces its population step online while leaving downstream build/test commands offline. Run `27998091712` proved that the registry fix works and `prepare` skips its build on a hot exact key (38s), but it also exposed cache-key over-sharing: Docker E2E restored the non-Docker all-features cache, then still compiled `jackin-capsule` for 7s and the `docker-e2e` test graph for 32s before running the real Docker tests. Run `27998435131` rejected the attempted Docker-specific key: the cold seed took `prepare` to 4m44s and the Docker job still restored a full exact key before compiling the same local workspace crates (`jackin-capsule` 6.94s, `docker-e2e` test graph 31.64s). This matches `Swatinem/rust-cache` behavior: it is useful for dependency artifacts, but workspace crate outputs are not a dependable cross-job cache target. CI therefore keeps the shared `ci-all-features-dev-workspace-v2` key, because it gives the fastest proven hot path and avoids a second cold cache family that does not remove the Docker local relink/compile. A baked image might still help cold-starts, but it no longer addresses the measured hot-run wall-clock bottleneck; the only remaining candidate for eliminating Docker local compile without a large target restore is a measured `cargo nextest archive --archive-file` handoff plus a small `jackin-capsule` binary artifact, and it must beat the current 7s+32s local compile cost after upload/download time.

The corrected GitHub-hosted baseline keeps `RUSTC_WRAPPER=sccache` for Cargo fingerprint and target-cache compatibility, disables the failing GHA compiler-cache backend with `SCCACHE_GHA_ENABLED=off`, and relies on `rust-cache`, the shared Cargo registry cache, and Buildx caches only where those caches warm real build work; preview, release, and `jackin-dev` archive jobs now use the same early-wrapper cache shape. Velnor keeps local-disk `sccache` as an opt-in accelerator.

Run `27999009032` rejected the remaining serialized nextest `prepare` gate. The run was green and had no dependency-download markers, but the shared target cache missed in `cargo nextest prepare`, so `prepare` compiled third-party and workspace crates for 5m05s, saved the cache, and only then let package lanes (`52s`-`2m03s`) and Docker E2E (`2m54s`) start; `ci-required` completed about 10m41s after the run began. The objective is fastest wall clock, not zero compilation at any cost, so CI now removes nextest `prepare`, lets package and Docker shards restore the shared `ci-all-features-dev-workspace-v2` cache directly, keeps Cargo offline through the explicit registry cache, and lets only the `jackin` shard save the shared target cache because it pulls the widest package graph.
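A rough critical-path calculation shows why the serialized gate dominated that run. The following is an illustrative sketch only, not part of the jackin' timing summary: the helper names are hypothetical, and it models just the `prepare` step and the slowest fan-out lanes quoted above, ignoring routing, construct-image availability, and queue time.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a duration such as '5m05s' or '52s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds)


def serialized_gate(prepare: str, lanes: list[str]) -> int:
    """Fan-out lanes start only after `prepare` finishes."""
    return to_seconds(prepare) + max(to_seconds(lane) for lane in lanes)


def direct_fan_out(lanes: list[str]) -> int:
    """Lanes start immediately; the slowest lane sets the segment length."""
    return max(to_seconds(lane) for lane in lanes)


# Slowest package lane and Docker E2E from run 27999009032.
LANES = ["2m03s", "2m54s"]

print(serialized_gate("5m05s", LANES))  # 479
print(direct_fan_out(LANES))            # 174
```

With those figures the serialized shape spends 479s on this segment, of which 305s is the `prepare` gate. The direct figure assumes warm lanes; on a cold cache each lane compiles for itself and takes longer than the durations used here.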

Run `27999729168` accepted that correction. `CI` was green, `ci-required` completed about 5m14s after run start, and the full log scan returned `NO_MARKERS` for crates.io index updates, crate downloads, third-party dependency compiles, source-tool compiles, prepared-workspace restores, and `sccache` issues. Package lanes ran directly after construct-image availability and finished in `52s`-`1m58s`; Docker E2E finished in `2m51s`. The Docker lane restored the exact shared target cache and did not download dependencies, but still rebuilt local jackin' workspace crates for `jackin-capsule` (`6.65s`) and the `docker-e2e` test graph (`30.79s`), which is now the remaining measured Docker-side compile cost. The same SHA also kept related workflows green: `Docs` reached `docs-required` in about 1m16s, `Construct Image` reached `construct-required` in about 2m18s, `jackin-dev` finished its archive builds in about 1m35s, and `Renovate Validate` finished in about 17s. The current hosted GitHub default therefore remains cache-only direct fan-out; further Docker compile removal needs a timed archive or binary-artifact experiment that beats roughly 37s of local rebuild plus any upload/download time.
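The acceptance bar for any archive or binary-artifact experiment can be written down as a break-even check. This is a minimal sketch under stated assumptions: the function name and the sample transfer times are hypothetical, and only the two local rebuild figures come from the run above.

```python
# Measured local rebuild cost on the Docker lane in run 27999729168:
# jackin-capsule (6.65s) plus the docker-e2e test graph (30.79s).
LOCAL_REBUILD_S = 6.65 + 30.79


def handoff_beats_rebuild(
    upload_s: float,
    download_s: float,
    residual_build_s: float = 0.0,
    local_rebuild_s: float = LOCAL_REBUILD_S,
) -> bool:
    """True when the artifact handoff costs less than rebuilding locally.

    `residual_build_s` covers any compile work the handoff does not remove.
    """
    return upload_s + download_s + residual_build_s < local_rebuild_s


# Hypothetical transfer timings, for illustration only.
print(handoff_beats_rebuild(upload_s=10, download_s=15))  # True
print(handoff_beats_rebuild(upload_s=20, download_s=20))  # False
```

The check is deliberately strict: upload time counts even though it happens in another job, because the consuming job cannot start its download before the upload completes.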

Run `28000359236` rejected treating Velnor as a speed win before proving the runner itself was on the current tool/image baseline. The GitHub lane stayed fast, but the optional Velnor lane became the tail: native `mise install` jobs for `cargo-audit` and dependency policy stalled on untrusted `/__w/mise.toml`, the Sentry runner still used `velnor/job-ubuntu:24.04` while Velnor source had moved to a 26.04 image with Rust 1.96.0, and the Velnor MSRV job correctly exposed that `sysinfo 0.39.x` no longer supports jackin's Rust 1.94 MSRV. The response was to fix Velnor first, not make it default: Velnor commit `080679d` sets `MISE_TRUSTED_CONFIG_PATHS=/__w` in the native mise adapter, moves the default job image to `velnor/job-ubuntu:26.04`, and was deployed to Sentry as `velnor-runner 0.1.29+trustscope.20260623.080679d` with the 26.04 job image built locally. jackin' now pins diagnostics `sysinfo` to the 0.38 line, and `cargo +1.94 check --workspace --all-targets --locked` passes locally again. The next parity proof must rerun `lanes=both` after these fixes and compare step timing against the GitHub-required lane before counting Velnor as a performance improvement.

Runs `28001389977`, `28001814662`, and `28002123889` continued the dual-runner proof and found runner-parity bugs instead of reasons to make Velnor the default. Velnor commit `0eccb8c` added cargo-backend tool bin discovery to the native mise adapter; commit `c9155c8` then removed poisoned empty mise version dirs before install and exported direct install roots so GitHub-release tools like `cargo-deny`, `cargo-shear`, and `cargo-audit` are visible to later steps. Both were deployed to Sentry, ending at `velnor-runner 0.1.29+trustscope.20260623.210000.c9155c8` with four `velnor-jackin` slots ready on `velnor/job-ubuntu:26.04`. Run `28002123889` proved the important Velnor cache/tool fixes in real CI: Velnor `cargo dependency policy`, `cargo audit`, `cargo msrv check`, `cargo check default`, `cargo clippy`, `cargo bench build`, `actionlint`, `schema-check`, construct image build, and several nextest package shards all passed quickly without the previous cargo-subcommand failures. The same run also exposed a workflow portability bug in Docker E2E: the job built `jackin-capsule` under Velnor's absolute `CARGO_TARGET_DIR`, then looked for the binary at the hard-coded relative path `target/debug/jackin-capsule`. The Docker E2E handoff now resolves `${CARGO_TARGET_DIR:-target}/debug/jackin-capsule`, which preserves GitHub's default relative target path and also works with Velnor's per-job warm target store. The next proof run must verify Docker E2E on both lanes before the branch can claim full dual-runner parity.
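The `${CARGO_TARGET_DIR:-target}` fallback is shell syntax; the same resolution rule can be sketched in Python to make the edge cases explicit. The function name and the sample Velnor path are hypothetical and not taken from the workflow.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def capsule_binary(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the debug jackin-capsule binary the way the shell expansion does."""
    env = os.environ if env is None else env
    # `${VAR:-default}` falls back when VAR is unset *or* empty,
    # which is exactly what `or` does for a missing or empty string.
    target_dir = env.get("CARGO_TARGET_DIR") or "target"
    return Path(target_dir) / "debug" / "jackin-capsule"


# GitHub-hosted shape: no override, so the relative default applies.
print(capsule_binary({}))
# Velnor shape: an absolute per-job target store (path is illustrative).
print(capsule_binary({"CARGO_TARGET_DIR": "/var/cache/velnor/target"}))
```

Treating an empty `CARGO_TARGET_DIR` as unset matters: a plain `${CARGO_TARGET_DIR}` expansion would produce `/debug/jackin-capsule` in that case.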

Run `28002415026` kept the GitHub-hosted lane green, including package shards and Docker E2E, but found the remaining Velnor issues were runner/environment parity rather than cache-design failures. Velnor ran the jackin' tests as root inside the job container, so chmod-000 auth fixture tests could still read files and could not produce `EACCES`; the minimized Ubuntu job image also let `man` exit 0 while printing the system-minimized notice instead of topic text, which means the help test should assert the documented contract of successful non-empty output rather than exact `auth` prose. Docker E2E timed out after the nested `docker run` reported `Docker unavailable`: Velnor mounted the host Docker socket, but test temp directories created under container `/tmp` were not visible at the same absolute path to the host Docker daemon. Velnor commit `28a7ef7` fixes that by exposing same-absolute host-visible temp/workspace mounts and `VELNOR_DOCKER_HOST_TEMP`; it was deployed to Sentry and published to the apt repository as `velnor-runner 0.1.29+trustscope.20260623.220000.28a7ef7`, with `velnor/job-ubuntu:26.04` slots ready. The jackin' side now skips chmod-EACCES assertions when the current environment can still read the fixture, keeps the help test on the file-level zero-exit/non-empty-output contract, and sets `TMPDIR=$VELNOR_DOCKER_HOST_TEMP` for Docker E2E when Velnor exposes it. The next proof run must rerun `lanes=both`, verify Velnor Docker E2E passes, and re-scan all jobs for dependency download or compile markers before claiming Velnor parity or speed wins.
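The chmod-EACCES skip can be illustrated with a small probe. This is a sketch of the idea rather than the jackin' test code (which is Rust): the function names and the fixture file name are hypothetical.

```python
import errno
import os
import tempfile


def can_still_read(path: str) -> bool:
    """Probe whether the current environment ignores the file mode (e.g. root)."""
    try:
        with open(path, "rb"):
            return True
    except PermissionError:
        return False


def check_unreadable_fixture() -> str:
    """Return 'skipped' when chmod 000 cannot make the fixture unreadable."""
    with tempfile.TemporaryDirectory() as tmp:
        fixture = os.path.join(tmp, "auth-fixture.json")
        with open(fixture, "w") as handle:
            handle.write("{}")
        os.chmod(fixture, 0o000)
        try:
            if can_still_read(fixture):
                # Root inside a job container lands here: no EACCES is possible.
                return "skipped"
            try:
                open(fixture, "rb").close()
            except PermissionError as err:
                assert err.errno == errno.EACCES
                return "asserted"
            raise AssertionError("expected EACCES when reading the fixture")
        finally:
            os.chmod(fixture, 0o600)  # restore the mode before cleanup


print(check_unreadable_fixture())
```

Probing readability directly, instead of checking for UID 0, also covers environments where capabilities or filesystem quirks bypass the mode bits.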

Run `28006419716` proved those test-environment fixes: Velnor `cargo nextest jackin`, `jackin-runtime`, `jackin-tui`, `jackin-capsule`, and `small-crates` all passed, so the chmod/root and minimized-manpage mismatches are resolved. Velnor compile/check jobs were also faster than GitHub on the same run (`clippy` 30s vs 51s, `check-default` 27s vs 48s, construct E2E image 47s vs 1m53s), but Docker E2E still failed before executing tests because `actions/download-artifact` reported no match for `construct-trixie-image-Velnor` even though the artifact was uploaded and later visible through the run artifacts API. The nextest reusable workflow now gives the token `actions: read` and downloads construct artifacts with explicit `github-token`, `repository`, and `run-id`, matching the documented cross-run/repository lookup path and avoiding runner-runtime artifact visibility differences. The next proof run must verify that Velnor can download its lane-scoped construct artifact and then reach the actual nested-Docker E2E test.

Run `28006975146` proved full dual-runner parity after the explicit artifact lookup fix: `ci-required` and `timing-summary` both passed, Velnor Docker E2E downloaded the lane-scoped construct artifact and completed in `2m20s`, and GitHub Docker E2E completed in `3m33s`. Velnor also beat the GitHub-hosted lane on the compile/check/image jobs in this run (`check-default` 27s vs 42s, `clippy` 30s vs 49s, construct E2E image 47s vs 2m26s) while GitHub remained the default required lane. A combined scan of GitHub logs plus Velnor job-log artifacts found no crates.io index update markers, no crate download markers, no external-crate `Compiling ... v...` markers, and no prepared-workspace artifact restores. It did find Velnor `rust-cache` miss lines and showed `sccache` using the per-container home cache instead of the host-mounted `/var/cache/sccache`; that is a runner issue, not a workflow design win to ignore. Velnor commit `d603e1e` now defaults job containers to `SCCACHE_DIR=/var/cache/sccache`, keeps workflow env able to override it, moves Velnor-owned `cargo-nextest` and `cargo-zigbuild` installs to the GitHub release backend, and keeps `cargo-deb` on the cargo backend because the upstream GitHub release only publishes an amd64 `.deb` asset. It was deployed to Sentry and published to the apt repository as `velnor-runner 0.1.29+trustscope.20260623.230000.d603e1e`, with the public amd64 index and the installed jackin daemon both verified. The next proof run must rerun `lanes=both` on `d603e1e` and verify that Velnor `sccache` stats now report the host-mounted cache location before claiming local compiler-cache speedup.

Run `28022714137` exercised that `lanes=both` proof against Velnor `d603e1e` and found a cold-cache construct-image regression before Docker E2E could run. Both lanes missed the `shellfirm` prebuilt cache. GitHub then fell back to `cargo install shellfirm --version 0.3.10` and failed because the current CI registry/offline shape does not guarantee that source-install path; Velnor found `shellfirm` through mise but staged the mise shim into the Docker context, so the Dockerfile copied a shim whose target did not exist in the image. The fix keeps the GitHub release backend as the default install path for x64 construct E2E by installing `shellfirm` through mise in that job, and changes the construct helper to stage `mise which shellfirm`'s real binary while ignoring mise shim paths. The next proof run must rerun `lanes=both`, verify construct E2E image succeeds cold and warm on both lanes, verify Velnor `sccache` stats use `/var/cache/sccache`, and repeat the dependency/cache marker scan before claiming the local compiler-cache speedup.
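The shim-versus-real-binary distinction can be sketched as a path filter. This is not the construct helper itself: the function names and sample paths are hypothetical, and the check assumes the default mise layout in which shims live in a `shims` directory directly under a `mise` data directory.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def is_mise_shim(path: str) -> bool:
    """True when the path sits under a `mise/shims` directory (assumed layout)."""
    parts = PurePosixPath(path).parts
    return any(a == "mise" and b == "shims" for a, b in zip(parts, parts[1:]))


def pick_real_binary(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that is not a mise shim, if any."""
    for candidate in candidates:
        if not is_mise_shim(candidate):
            return candidate
    return None


# Illustrative paths only; real locations depend on the runner.
candidates = [
    "/home/runner/.local/share/mise/shims/shellfirm",
    "/home/runner/.local/share/mise/installs/shellfirm/0.3.10/bin/shellfirm",
]
print(pick_real_binary(candidates))
```

Copying a shim into a Docker context fails for the reason described above: the shim only dispatches to a mise installation that does not exist inside the image.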

Run `28023116265` proved the real-binary staging fix on Velnor (`construct E2E image (Velnor)` passed in `1m43s`) but exposed two follow-up issues before Docker E2E could run. The workflow edit had installed `shellfirm` in `cargo check default` instead of `construct E2E image`, so GitHub construct still missed the prebuilt cache and fell back to the rejected `cargo install shellfirm --version 0.3.10` source path. `cargo clippy` also failed on the new construct helper because one `mise which` probe needed the same xtask CLI justification used by the existing Docker and git probes, and the mise-shim detector used a redundant closure/collection. The follow-up fix moves `shellfirm` installation to the construct job only, keeps other Rust jobs on the smaller `rust sccache` toolset, and makes the helper clippy-clean. The next proof run must rerun `lanes=both`, verify GitHub construct uses the mise-provided `shellfirm` binary instead of any cargo-install fallback, verify Velnor remains green, and then continue the full marker scan.

Runs `28023491185` (`lanes=both`) and `28023487940` (default PR lane) proved that follow-up fix. The dual-lane run was fully green, including `ci-required`, `timing-summary`, both construct image jobs, and both Docker E2E smoke jobs. GitHub construct installed `shellfirm` through mise, staged the real `0.3.10` binary, copied it into the image, and did not execute any `cargo install shellfirm` fallback; the first GitHub construct run missed the prebuilt `shellfirm` cache and saved it, while Velnor restored the same key. Velnor `sccache` stats now report `Local disk: "/var/cache/sccache"`, proving the runner-side cache mount fix is active. The combined scan of GitHub logs and Velnor job-log artifacts for `28023491185` found zero crates.io index updates, zero crate download markers, zero external-crate `Compiling ... v...` or `Checking ... v...` markers, zero prepared-workspace downloads, and zero shellfirm source installs. The remaining marker noise was rustup/toolchain misses on Velnor and GitHub `rust-cache` target misses on several compile/test jobs; those target misses did not cause dependency downloads or third-party crate recompile markers, but they still belong in the next performance pass because the goal is fastest wall clock, not merely clean dependency markers. The immediately adjacent default GitHub PR run `28023487940` was also green and its log scan found zero dependency/download/compile/cache-miss markers, showing the sequential hosted path is warm after the proof run.

Runs `28010020969` and `28010956214` reran `lanes=both` after merging latest `main` into this branch. Both were fully green, including `ci-required` and `timing-summary`. The first merged-head run proved parity on the new head: GitHub Docker E2E finished in `5m02s`, Velnor Docker E2E in `4m40s`, and the combined GitHub/Velnor log scan found zero crates.io index updates, zero crate downloads, zero external-crate `Compiling ... v...` lines, and zero prepared-workspace artifact restores. The sequential branch run `28010956214` confirmed the same marker result on the next run: GitHub logs and Velnor job-log artifacts again had zero crates.io index updates, crate downloads, external-crate compile markers, failed restores, or prepared-workspace downloads. The remaining GitHub-hosted slowness is therefore not dependency download churn; it is compile-heavy job cost on ephemeral hosted runners with `SCCACHE_GHA_ENABLED=off`. Velnor is faster because the deployed runner now gives job containers a persistent local `SCCACHE_DIR=/var/cache/sccache`; hosted GitHub runners intentionally do not have that local disk. The cache API also showed the new workspace caches scoped to `refs/pull/632/merge`, with no matching `refs/heads/main` or feature-branch cache for the new keys at the start of the proof. This matches GitHub's documented cache scoping: PR merge-ref caches are reusable by that PR, while branch runs can restore current-branch and default-branch caches. The next optimization must therefore be a measured hosted compiler-cache backend or artifact handoff experiment with `sccache --show-stats` and full marker scans; do not reintroduce the rejected GHA `sccache` backend or another serialized target-cache seed job without proving it beats the current green baseline.

Run `28013253814` proved a real GitHub-hosted fan-out race after the merged-head proof. The run was green, but the log scan found four crates.io index updates, four `Downloading crates` headers, and 1286 `Downloaded ...` lines, all on the GitHub lane; Velnor had zero cargo download or compile markers. The root cause was not a missing lockfile or changed dependency graph. Parallel GitHub fan-out jobs started before a shared Cargo registry cache existed for that key, so one job fetched the registry and crates while other jobs were already restoring the same absent key. Velnor hid the bug with persistent runner state. The fix is a per-lane `cargo-registry-warmup` job that runs immediately after routing, restores/populates the shared Cargo registry/index/git DB cache once, and gates every Rust/Cargo job before fan-out. GitHub remains the default required lane; Velnor remains the opt-in parity and speed lane.

Run `28013935155` accepted that fix on the next `lanes=both` proof. `ci-required` and `timing-summary` passed. A combined scan of GitHub job logs plus Velnor `job-log` artifacts found zero crates.io index update markers, zero crate download markers, zero external-crate `Compiling ... v...`, `Checking ... v...`, or `Building ... v...` markers, zero failed restores, and zero prepared-workspace downloads. The remaining cache-miss text was tool/cache housekeeping, mainly GitHub mise cache misses, a first-key `cargo-audit` cache miss, and Velnor rustup cache misses; none caused Cargo dependency fetch or external-crate compilation. The warmup cost was small (`17s` on GitHub, `15s` on Velnor) and paid back immediately: GitHub `clippy` dropped from `2m31s` to `49s`, `check-default` from `2m25s` to `43s`, `bench-build` from `3m02s` to `53s`, construct E2E image from `3m03s` to `1m34s`, and nextest package lanes moved from `36s`-`2m58s` to `41s`-`1m38s`. Velnor stayed faster on most Rust/image jobs (`clippy` `27s` vs GitHub `49s`, `check-default` `25s` vs `43s`, construct E2E image `1m14s` vs `1m34s`, nextest package lanes `25s`-`56s` vs `41s`-`1m38s`), while GitHub was slightly faster on Docker E2E in this run (`3m36s` vs Velnor `3m48s`), so Docker remains a per-run comparison point instead of an assumed Velnor win. Velnor `sccache` stats for `check-default` now show the host-mounted local disk cache at `/var/cache/sccache`, `86` compile requests, `17` executed compiles, `12` hits, `5` misses, and a `70.59%` hit rate, improving from the prior `1.08%` hit rate. The next iteration must keep this report shape for every dual-lane run, including failures, and keep watching whether cache restore time or GitHub's finite cache budget becomes more expensive than the work it avoids.
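The quoted `70.59%` figure is consistent with a hit rate computed over executed compiles (hits plus misses) rather than over all compile requests. The sketch below reproduces that arithmetic; the function name is hypothetical, and the formula is inferred from the reported numbers rather than taken from `sccache` itself.

```python
def hit_rate_percent(hits: int, misses: int) -> float:
    """Cache hit rate over executed compiles, as a percentage."""
    executed = hits + misses
    if executed == 0:
        return 0.0
    return 100.0 * hits / executed


# Velnor `check-default` stats from run 28013935155.
print(round(hit_rate_percent(hits=12, misses=5), 2))  # 70.59
```

Dividing by the `86` compile requests instead would give roughly 14%, so the remaining requests are best read as calls that `sccache` did not execute as cacheable compiles.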

Run `28016655015` proved the stronger Phase 0 reporting contract after adding lane totals, third-party compile/check/build scanning, and GitHub cache-budget reporting to the shared summary script. The manual `lanes=both` run was green from `2026-06-23T09:33:23Z` to `2026-06-23T09:41:38Z` (`8m15s` wall clock), with `ci-required` and `timing-summary` both passing. Aggregate job runtime was `1927s` across 18 GitHub-lane jobs, `868s` across 18 Velnor jobs, and `32s` across 4 shared jobs, so Velnor was about `2.2x` faster by summed lane job time while GitHub stayed the default required lane. Long poles were Docker E2E on both lanes (`4m40s` GitHub, `4m12s` Velnor). Key Rust/image comparisons stayed in favor of Velnor: `check-default` `2m04s` vs `25s`, `clippy` `2m21s` vs `27s`, `bench-build` `2m52s` vs `27s`, `msrv-check` `2m14s` vs `29s`, construct E2E image `2m26s` vs `2m09s`, and nextest package lanes `47s`-`3m07s` vs `24s`-`59s`. A combined scan of 22 GitHub API logs and 18 Velnor `job-log` artifacts found zero crates.io index updates, zero `Downloading crates` headers, zero `Downloaded ... v...` crate lines, zero external-crate `Compiling ... v...`, `Checking ... v...`, or `Building ... v...` lines, zero source-tool compiles, zero failed restores, and zero prepared-workspace artifact downloads. The remaining cache-miss text is now the next optimization target rather than a dependency-fetch bug: GitHub still reported mise cache misses, one cold Cargo registry warmup key, and `No cache found` lines in target-cache restores; Velnor still reported rustup cache misses and `Rust cache miss for shared key ...` lines even though persistent disk prevented dependency downloads or external-crate rebuilds. Velnor `sccache` for `check-default` still used `/var/cache/sccache` with `86` compile requests, `12` Rust hits, `5` Rust misses, no cache errors, and a `70.59%` hit rate. 
The GitHub Actions cache usage API reported `107` active caches using `14.54GB` against the `10GB` budget reference, so the next cache iteration must reduce duplicate tool/target cache families or prove the repository has a higher effective quota; otherwise eviction pressure can recreate the parallel fan-out download race that the registry warmup fixed.
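
The two headline numbers from this run, the lane speed ratio and the cache-budget overage, are simple arithmetic. A minimal sketch follows; the function names are hypothetical and are not taken from the shared summary script.

```python
def speed_ratio(baseline_seconds: int, candidate_seconds: int) -> float:
    """How many times faster the candidate lane is by summed job time."""
    return baseline_seconds / candidate_seconds


def budget_overage_gb(used_gb: float, budget_gb: float) -> float:
    """Cache usage above the budget reference, or 0.0 when under budget."""
    return max(0.0, used_gb - budget_gb)


# Lane totals and cache usage reported for run 28016655015.
print(f"{speed_ratio(1927, 868):.1f}x")            # 2.2x
print(f"{budget_overage_gb(14.54, 10.0):.2f}GB")   # 4.54GB
```

The ratio compares summed job time per lane, not wall clock, so it describes total runner work rather than how long a contributor waits.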

Run `28034426570` then exposed a more specific GitHub-hosted target-cache bug after the timing summary learned to see ANSI-colored Cargo output. The run was green and had no crate download markers, but the scanner counted `1902` third-party compile/check/build markers. Direct log inspection showed package and Docker E2E jobs exact-hit `v0-rust-ci-all-features-dev-workspace-v2-Linux-x64-7d019c70-78f08645`, a roughly `269MB` cache that lacked many dependency build outputs; because GitHub Actions caches are immutable, later jobs could restore that partial cache forever but could not repair it. The key split came from nextest-only `CARGO_NET_RETRY`, while `clippy` used `CARGO_NET_OFFLINE=true` and owned a different, broader all-features cache. The fix aligns nextest package and Docker jobs with the clippy Cargo environment hash, removes the ignored per-package/per-Docker `key` suffixes under `shared-key`, and makes CI nextest shards restore-only so the fastest narrow shard cannot seed the shared exact key with an incomplete graph. `schema-check` also gets its own small dev-binary cache, because `cargo run -p jackin-xtask` needs codegen artifacts that the default `cargo check` cache cannot provide, and fuzz now includes the nested `crates/jackin-term/fuzz` target directory in its cache workspace so exact fuzz hits restore the directory Cargo actually uses. The next proof run must verify that `schema-check`, fuzz, nextest package shards, and Docker E2E stop compiling external crates on GitHub; if the first run with aligned keys is cold, the immediate sequential run is the deciding proof.
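
Why colored output hid these markers can be shown with a small sketch. It assumes Cargo wraps the status word in ANSI CSI color sequences, so a plain-text pattern fails until the sequences are stripped. The regular expressions, the function name, and the sample lines (including crate versions) are illustrative, not copied from the timing-summary script or a real log.

```python
import re

# ANSI CSI sequences such as "\x1b[1m" or "\x1b[32m" used for colored output.
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")
# A status word followed by "<crate> v<version>".
MARKER = re.compile(r"\b(Compiling|Checking|Building) \S+ v\d")


def count_markers(lines, strip_ansi=True):
    """Count compile/check/build markers, optionally stripping color first."""
    count = 0
    for line in lines:
        if strip_ansi:
            line = ANSI_CSI.sub("", line)
        if MARKER.search(line):
            count += 1
    return count


sample = [
    "\x1b[1m\x1b[32m   Compiling\x1b[0m serde v1.0.0",  # colored status word
    "   Compiling tokio v1.0.0",                        # plain status word
    "    Finished dev profile",                         # not a marker
]
print(count_markers(sample, strip_ansi=False))  # 1
print(count_markers(sample, strip_ansi=True))   # 2
```

In the colored line the reset sequence sits between the status word and the crate name, which is why the unstripped pattern misses it.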

Run `28024637026` corrected the nextest target-cache assumption. The run was green and the Cargo registry remained offline, but GitHub-hosted package shards other than `jackin` still recompiled external crates: `jackin-capsule` rebuilt crates such as `tracing`, `tokio`, `futures-util`, and `ratatui`; `jackin-runtime` rebuilt crates such as `serde`, `tokio`, `hyper`, `oci-client`, and `openidconnect`; `jackin-tui` rebuilt crates such as `libc`, `syn`, `rustix`, `serde`, and `ratatui`; `small-crates` rebuilt crates such as `serde_json`, `reqwest`, `bollard`, `criterion`, and `sigstore`; and Docker E2E rebuilt third-party crates before building `jackin-capsule`. The comparison job, `cargo nextest jackin (GitHub)`, was the desired shape: it restored a target cache that already had its external dependency artifacts and only rebuilt jackin' workspace crates. The root cause is over-sharing plus single-writer cache ownership: every package shard restored the same `ci-all-features-dev-workspace-v2` target archive, but only the `jackin` shard saved it, and the `jackin` test graph is not a superset of the other package and Docker E2E graphs. The fix keeps one shared rust-cache namespace for fallback behavior, but adds shard-specific target-cache suffixes (`package-<group>` and `docker-e2e`) and lets only the GitHub-hosted lane save those archives. The verification rule is now explicit: on a sequential run with the same source, lockfiles, toolchain, features, and env fingerprint, GitHub and Velnor logs must show zero crates.io index updates, zero crate downloads, and zero external-crate `Compiling ... v...`, `Checking ... v...`, or `Building ... v...` markers; jackin' workspace crate rebuilds are acceptable when source fingerprints changed or Cargo needs local test binaries. Any exception must name the changed fingerprint, cache eviction, cache budget pressure, or another measured cause before the run can be considered optimized.
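
The verification rule separates external-crate markers from acceptable workspace rebuilds. One way to sketch that split relies on a heuristic: Cargo status lines for path-based workspace crates end with the crate's local path in parentheses, while registry crates do not. The function names, sample lines, versions, and paths below are hypothetical.

```python
import re

STATUS = re.compile(
    r"(Compiling|Checking|Building) (\S+) v(\S+)(?: \((.*)\))?\s*$"
)


def classify(line):
    """Return (crate, 'workspace' | 'external'), or None for other lines."""
    match = STATUS.search(line)
    if match is None:
        return None
    source = match.group(4)
    # Heuristic: a parenthesized absolute path marks a workspace-local crate.
    is_local = source is not None and source.startswith("/")
    return (match.group(2), "workspace" if is_local else "external")


def external_rebuilds(lines):
    """Crates that would violate the 'zero external-crate markers' rule."""
    results = (classify(line) for line in lines)
    return [r[0] for r in results if r is not None and r[1] == "external"]


log = [
    "   Compiling serde v1.0.0",
    "   Compiling jackin-capsule v0.0.0 (/work/jackin/crates/jackin-capsule)",
    "    Finished dev profile",
]
print(external_rebuilds(log))  # ['serde']
```

Git dependencies print a URL in the parentheses, so this heuristic counts them as external, which matches the intent of the rule.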

Run `28030526408` closed the archive-attestation parity gap. Earlier archive jobs skipped `actions/attest-build-provenance` on Velnor because the Velnor Node action container path dropped hyphenated `INPUT_*` names such as `INPUT_PUSH-TO-REGISTRY`, so `@actions/core` parsed missing boolean inputs and failed before provenance generation. Velnor commit `08672f1` fixes the runner by bypassing the Node image entrypoint/shell and invoking Node as the container entrypoint, preserving hyphenated action input environment variables. jackin' preview, release, and `jackin-dev` archive jobs no longer gate `attest` to the GitHub lane; both lanes run the same signing, SBOM, and provenance attestation contract. The `jackin-dev` `lanes=both` proof passed in `2m21s`: GitHub build legs took `41s`-`1m21s`, Velnor build legs took `44s`-`56s`, and all four Velnor `job-log` artifacts recorded `Attestation created` plus `Attestation uploaded to repository`. The combined normal GitHub log and Velnor job-log scan found zero crates.io index updates, zero crate download markers, zero external-crate `Compiling ... v...`, `Checking ... v...`, or `Building ... v...` markers, zero prepared-workspace downloads, zero `cargo install` source-tool compiles, and zero old YAML boolean-input failures. The only cache-miss markers were Velnor rustup toolchain cache misses, which did not cause Cargo dependency downloads or external-crate rebuilds.

Runs `28036125567`, `28038399963`, and `28039373389` rechecked the GitHub-hosted cache path after deleting stale PR merge-ref caches. `28036125567` attempt 2 was the expected cold PR-cache population run: it was green, but reported 26 cache misses, 644 dependency-download markers, and 4152 third-party compile/check/build markers while repopulating the branch cache. Its warm attempt 3 proved the generic cache shape still had one gap: 0 cache misses, 0 dependency downloads, and 0 source-tool compiles, but 1912 third-party compile markers remained in `Docker E2E smoke`. Adding a Docker-owned cache removed that Docker compile gap on the next warm rerun (`Docker E2E smoke` fell from 5m10s cold to 3m02s warm), but `28038399963` still showed 1453 third-party markers in package shards such as `small-crates`. The root cause was the same over-sharing bug already captured by run `28024637026`: fan-out needs exact package-shard target caches, not one generic all-features target cache. After adding `ci-all-features-dev-workspace-v2-package-<group>` for package shards and `ci-docker-e2e-dev-workspace-v1` for Docker E2E, warm rerun `28039373389` was green with 134 hit/restore markers, 0 misses, 0 failures, 0 dependency downloads, 0 third-party compile/check/build markers, 0 source-tool compiles, and 0 prepared-workspace restores; `ci-required` completed in about 4m13s and the workflow wall clock was about 4m42s. Final warm run `28040274004` kept the exact-cache shape green after the follow-up docs and cache-prune commits: the timing summary reported 134 hit/restore markers, 0 misses, 0 failures, 0 dependency downloads, 0 third-party compile/check/build markers, 0 source-tool compiles, and 0 prepared-workspace restores; `ci-required` completed in about 3m53s and the workflow wall clock was about 4m21s. 
The cleanup pass then pruned obsolete feature-branch, superseded PR, and non-target setup/tool caches while leaving Rust target and registry caches intact; active cache usage dropped to about 9.97GB across 30 caches, under the 10GB budget reference. GitHub remains required/default, while Velnor remains optional parity and speed proof.
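
The exact-cache shape amounts to one target-cache name per fan-out lane. The sketch below only models the naming; the helper is hypothetical, the key strings follow the names quoted above, and the group list is the set of package shards mentioned for run `28024637026`. The real workflow passes these names through `Swatinem/rust-cache` inputs, which the sketch does not model.

```python
from typing import Optional

PACKAGE_GROUPS = ["jackin", "jackin-capsule", "jackin-runtime",
                  "jackin-tui", "small-crates"]


def target_cache_name(lane: str, group: Optional[str] = None) -> str:
    """Return the target-cache name for a fan-out lane (naming sketch only)."""
    if lane == "docker-e2e":
        return "ci-docker-e2e-dev-workspace-v1"
    if lane == "package":
        if group is None:
            raise ValueError("package lanes need a group")
        return f"ci-all-features-dev-workspace-v2-package-{group}"
    raise ValueError(f"unknown lane: {lane}")


names = {target_cache_name("package", g) for g in PACKAGE_GROUPS}
names.add(target_cache_name("docker-e2e"))
# Every lane owns a distinct name, so no shard restores another shard's graph.
print(len(names))  # 6
```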

Run `28043219851` proved the current `lanes=both` CI head after pinning composite mise installs: `ci-required` and `timing-summary` passed, GitHub remained the default required lane, and Velnor stayed the optional speed lane across the Rust and Docker surfaces. The paired standalone `Construct Image` proof run `28043219591` exposed a feature-branch dispatch-only publish rehearsal problem rather than a build-cache problem: build legs completed, but both publish rehearsal jobs ran the immutable construct-version guard and failed because `projectjackin/construct:0.16-trixie` already exists. Manual `workflow_dispatch` on non-`main` now keeps construct build/rehearsal coverage while setting `construct_image=false`, so branch parity proofs cannot fail solely because the published construct tag already exists; `main` dispatches still preserve the publish/version guard.

Run `28046072118` proved the Docker E2E smoke path after the Velnor lane exposed a registry dependency hidden inside the fake-agent test script. The earlier Velnor failure in run `28045223564` was not a Cargo-cache regression: the fake Claude runtime started a child DinD daemon and ran `docker run alpine:3.20`, which hit Docker Hub's unauthenticated pull-rate limit before the test emitted its success marker. The smoke now imports a local scratch-style image as `jackin-dind-e2e-smoke:local`, runs both the direct Docker CLI probe and the Testcontainers Java probe from that local image, and sets the Testcontainers pull policy to never pull during the smoke. The `lanes=both` proof on head `970feef0` was green: `Docker E2E smoke (GitHub)` passed in `3m38s`, `Docker E2E smoke (Velnor)` passed in `3m45s`, `ci-required` passed, and `timing-summary` passed. The combined scan across `40` jobs found zero dependency downloads, zero third-party compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero cache misses or failures, zero `sccache` issues, and zero Docker Hub/rate-limit markers. Velnor remained faster than GitHub on most Rust compile/test gates, while Docker E2E was slightly slower on Velnor for this run; GitHub remains the default required lane and Velnor remains the optional parity/speed lane.

Run `28047785428` rechecked the same head after the first cache-populating run `28047114783`. The manual `lanes=both` CI run was green: `ci-required` passed, `timing-summary` passed, GitHub stayed the required/default lane, and Velnor stayed opt-in. The ANSI-stripped scan across GitHub logs plus the Velnor `job-log` artifact found zero crates.io index updates, zero crate downloads, zero external-crate compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero Docker Hub/rate-limit markers, and zero `sccache` errors. It also found two remaining warm-runner hygiene markers in `Docker E2E smoke (Velnor)`: the Rust toolchain cache reported no hosted cache entry, and `Swatinem/rust-cache` reported a miss for `ci-docker-e2e-dev-workspace-v1`. Those misses did not cause dependency downloads or external crate rebuilds because the Velnor host-persistent Cargo registry was warm, but they remain follow-up evidence for improving Velnor Docker E2E target/toolchain reuse rather than claiming the optional lane is perfect. Timings stayed in the expected band: GitHub Rust/package jobs were green, Velnor was faster on most Rust compile/test lanes, and Docker E2E remained real-test-runtime dominated (`2m52s` on GitHub, `3m04s` on Velnor).

Runs `28051139830` and `28051846962` closed that Velnor rust-cache false-miss loop. Run `28051139830` was green and had no dependency-download or external-crate compile markers, but its scan still showed Velnor `Rust cache miss` markers for persistent `CARGO_TARGET_DIR` jobs because the native Velnor `Swatinem/rust-cache` adapter could not see the container-injected `CARGO_TARGET_DIR=/__cargo_target`. Velnor commit `981b3fe` now exposes the effective container runtime env to native actions; it was deployed to Sentry as `velnor-runner 0.1.29+trustscope.20260624.193000.981b3fe` and verified with all jackin daemons active. The follow-up `lanes=both` proof run `28051846962` was green: `ci-required` and `timing-summary` passed, the combined scan found zero `Rust cache miss` lines, zero crates.io index updates, zero crate download markers, zero external-crate compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero Docker Hub/rate-limit markers, and no nonzero `sccache` errors. Velnor was faster on almost every Rust compile/test lane (`check-default` `23s` vs GitHub `1m10s`, `clippy` `25s` vs `1m09s`, construct E2E image `50s` vs `1m18s`, package shards `26s`-`56s` vs `33s`-`1m18s`), while Docker E2E was slower on Velnor in this run (`3m29s` vs GitHub `2m55s`) without dependency/cache markers. Treat Docker E2E as real Docker/test-runtime dominated and keep reporting it per run; do not claim Velnor wins that lane unless the measured run says so.
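
Reporting the faster lane per run, rather than assuming it, can be sketched as a comparison over the measured durations. The parser and function names are hypothetical; the timings are the ones quoted above for run `28051846962`.

```python
import re

_DURATION = re.compile(r"^(?:(\d+)m)?(?:(\d+)s)?$")


def to_seconds(text: str) -> int:
    """Parse durations such as '23s' or '1m10s' into seconds."""
    match = _DURATION.match(text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1) or 0) * 60 + int(match.group(2) or 0)


def faster_lane(velnor: str, github: str) -> str:
    """Name the lane with the shorter measured duration."""
    v, g = to_seconds(velnor), to_seconds(github)
    if v == g:
        return "tie"
    return "Velnor" if v < g else "GitHub"


# (Velnor, GitHub) timings from run 28051846962.
RUN = {
    "check-default": ("23s", "1m10s"),
    "clippy": ("25s", "1m09s"),
    "construct E2E image": ("50s", "1m18s"),
    "Docker E2E": ("3m29s", "2m55s"),
}
for job, (velnor, github) in RUN.items():
    print(f"{job}: {faster_lane(velnor, github)}")
```

For this run the comparison names Velnor for the three Rust/image jobs and GitHub for Docker E2E, matching the prose.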

Run `28054308931` then proved the schema-check cache split after one cache-populating run. The sequential GitHub rerun was green, restored `ci-schema-check-dev-workspace-v2` exactly, compiled only the local `jackin-xtask` crate in `1.58s`, and the full log scan found zero dependency downloads, zero external-crate compile/check/build markers, zero cache restore misses/failures, zero Docker Hub/rate-limit markers, and zero nonzero `sccache` errors. That run also showed why `sccache` reporting must distinguish correctness errors from low utility: hosted GitHub jobs still reported workspace-cache misses and `0.00%` hit-rate lines even though no external dependencies rebuilt, so the summary script now reports nonzero `sccache` errors separately from low-utility markers. Run `28055477893` attempt 2 proved that reporting shape on the current head: `ci-required` and `timing-summary` passed, `cargo fuzz` restored `ci-fuzz-workspace-v3` exactly after the first attempt populated it, and the full scan found zero dependency downloads, zero cache-miss markers, zero external-crate compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero Docker Hub/rate-limit markers, and zero nonzero `sccache` errors. The only remaining compiler-cache signal was `28` low-utility `sccache` markers, all from hosted-runner workspace compile stats rather than dependency rebuilds.
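
The distinction between correctness errors and low utility can be sketched as a small classifier. The data class, the threshold, and the hosted-job miss count are assumptions for illustration; the real summary script may use different fields and cutoffs.

```python
from dataclasses import dataclass

# Hypothetical cutoff below which a hit rate counts as low utility.
LOW_UTILITY_THRESHOLD = 10.0


@dataclass
class SccacheStats:
    job: str
    hits: int
    misses: int
    errors: int


def classify(stats: SccacheStats) -> str:
    """Separate cache errors (correctness) from low hit rates (utility)."""
    if stats.errors > 0:
        return "error"
    executed = stats.hits + stats.misses
    rate = 100.0 * stats.hits / executed if executed else 0.0
    return "low-utility" if rate < LOW_UTILITY_THRESHOLD else "ok"


# Velnor counts come from the text; the hosted miss count is a placeholder.
velnor = SccacheStats("check-default (Velnor)", hits=12, misses=5, errors=0)
hosted = SccacheStats("check-default (GitHub)", hits=0, misses=40, errors=0)
print(classify(velnor))  # ok
print(classify(hosted))  # low-utility
```

Under this split a `0.00%` hit-rate line is reported but does not fail the run, while any nonzero error count does.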

Run `28056690350` proved the current head across both lanes immediately after cleaning the old GitHub Actions cache set. The dual-lane dispatch was green, with `ci-required` and `timing-summary` passing; Velnor stayed faster across the Rust compile/test lanes (`check-default` `25s` vs GitHub `2m10s`, `clippy` `34s` vs GitHub `2m23s`, construct E2E image `1m21s` vs GitHub `2m14s`, and package shards `27s`-`59s` vs GitHub `53s`-`3m03s`), while Docker E2E remained real Docker/test-runtime dominated (`3m46s` on Velnor vs `5m15s` on GitHub). Because the GitHub cache set had just been intentionally cleared, the scan correctly found first-run GitHub cache misses and Cargo registry downloads; it still found zero external-crate compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero Docker Hub/rate-limit markers, and zero nonzero `sccache` errors. The follow-up warm GitHub dispatch `28057443103` then proved the desired sequential-run behavior on the same head: all CI jobs passed, the log scan found zero dependency downloads, zero cache misses, zero external-crate compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero Docker Hub/rate-limit markers, and zero nonzero `sccache` errors. The warm GitHub timings also showed the expected speedup from the rebuilt shared caches: `check-default` `48s`, `clippy` `1m03s`, construct E2E image `1m37s`, package shards `27s`-`1m12s`, Docker E2E `3m04s`, and only `28` low-utility hosted-runner `sccache` stat markers that did not correspond to dependency rebuilds. 
After the latest trailer-only force-push, current-head run `28071592012` attempt 3 proved the same warm hosted path on commit `9ffb47c`: all CI jobs passed, `ci-required` completed in about `3m17s`, and the log scan found zero dependency downloads, zero cache misses, zero external-crate compile/check/build markers, zero source-tool compiles, zero prepared-workspace restores, zero Docker Hub/rate-limit markers, and zero nonzero `sccache` errors. The only remaining signal was `28` low-utility hosted-runner `sccache` stat markers, again not tied to dependency rebuilds.

## Problem [#problem]

PR feedback, main-branch confidence, preview publishing, and tagged releases all need faster wall-clock results without dropping coverage. jackin' already has many good CI primitives -- path filters, PR cancellation, aggregator jobs, `Swatinem/rust-cache`, `cargo-nextest`, Docker Buildx layer caches, Bun download caching, pinned mise tools, and split workflows -- but the run triggered by commit [`4c8b94bd05f84a62a04f9f235e2d846f14d04366`](https://github.com/jackin-project/jackin/commit/4c8b94bd05f84a62a04f9f235e2d846f14d04366) shows that broad Rust/runtime changes still produce a long post-merge feedback loop: roughly 12.5 minutes to green `CI`, then roughly 8.5 more minutes before preview artifacts publish.

The target state is not "run less and hope". The target state is staged signal: cheap deterministic checks fail first, expensive checks start as early as their real prerequisites allow, full coverage still runs before merge or before publishing, and every cache has measured hit/miss behavior rather than folklore.

"Almost instant" has two distinct ceilings, and they need different work. A cold GitHub-hosted runner always pays a floor of toolchain install, cache restore, and compilation of whatever the change touched, so the realistic best case for a real Rust change on hosted runners is a few minutes, not seconds. The cases that can become near-instant on hosted runners are the ones where change-aware routing skips the Rust surface entirely: docs-only, workflow-only, or construct-only changes should finish in well under a minute. Rust changes only approach instant on a persistent runner whose `target/`, cargo registry, and Docker layers stay warm between runs, because nothing the GitHub Actions cache can do is as fast as the build directory already being on disk. The program therefore runs on two tracks: tighten change-aware routing so non-Rust changes are near-instant on hosted runners, and stand up a warm persistent lane so incremental Rust changes recompile only the edited crate.

This roadmap is an iterative optimization program, not a one-shot workflow cleanup. Each implementation PR should pick one bottleneck or change class, capture the baseline, explain why the time is being spent, research candidate speedups, apply the smallest safe change, rerun the same scenario, compare the numbers, and either keep the change with evidence or adjust/revert it. Repeat until the remaining time is mostly irreducible work: the tests, builds, signing, publishing, and verification that are genuinely required for the files that changed.

The end state should be change-aware CI/CD. Docs-only changes should not build Rust binaries. Workflow-only changes should not run Docker E2E unless they affect that workflow. Construct-image changes should rebuild and verify construct paths without forcing unrelated docs deploy work. Rust changes should run the affected Rust test/build surface, plus the cross-cutting gates that can actually be invalidated by those changes. When the dependency graph is uncertain, run the broader safe set and use the measurement from that run to improve the classifier later.

## Iteration Loop [#iteration-loop]

Every speedup PR under this roadmap should follow this loop:

1. **Measure** every workflow/lane from every run before calling the optimization done. Record job duration, long steps, queue time if visible, cache hits/misses, cache keys, `sccache` hit rates, and the exact changed-file class that caused the work.
2. **Explain** the number: compilation, dependency download, tool install, Docker layer build, image push/pull, test execution, artifact upload, signing, or publish gate.
3. **Research** available speedups for that specific cost center: cache backend, cache key, test partitioning, nextest archive reuse, `sccache`, BuildKit cache mounts, prebuilt tool install, runner choice, or path-filter precision.
4. **Apply** one focused change or one tightly related group of changes.
5. **Rerun** equivalent scenarios: at minimum one PR-style run and one main/preview-style run when the change affects both.
6. **Compare** before/after wall time, first-failure time, required-check completion time, cache hit rate, and coverage surface.
7. **Inspect logs aggressively** for dependency download and rebuild markers: `No cache found`, `Updating crates.io index`, `Downloading`, large third-party `Compiling` blocks, BuildKit cache misses, source-compiled tools, and low `sccache` hit rates. If any marker appears after the first cache-populating run for the same dependency/tool inputs, explain why the cache was dirty or patch the workflow so the next sequential run reuses the cache.
8. **Record** the result in the PR and, if the decision changes the roadmap, update this item with the measured outcome.
9. **Repeat** until further speedups would require dropping coverage, weakening publish safety, or spending disproportionate engineering effort for negligible wall-clock gain. A run is not accepted as optimized while an avoidable dependency download, source tool compile, third-party dependency compile, or cache-key fragmentation remains.
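
The compare and accept steps of the loop can be sketched as a decision function. The data class, field names, and verdict strings are hypothetical. The two durations are the `ci-required` times for runs `27937532691` and `27961783994`; the marker counts and job set are placeholders, and the sketch assumes `after` is a warm sequential run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    required_check_seconds: int
    dependency_downloads: int
    third_party_compiles: int
    source_tool_compiles: int
    coverage: frozenset  # names of the gates that ran


def decide(before: Measurement, after: Measurement) -> str:
    """Keep a change only if coverage holds, no avoidable work remains,
    and the required check finishes sooner."""
    if not before.coverage <= after.coverage:
        return "revert: coverage dropped"
    avoidable = (after.dependency_downloads
                 + after.third_party_compiles
                 + after.source_tool_compiles)
    if avoidable > 0:
        return "not accepted: avoidable work remains"
    if after.required_check_seconds < before.required_check_seconds:
        return "keep"
    return "adjust or revert"


gates = frozenset({"fmt", "clippy", "check-default", "nextest", "docker-e2e"})
before = Measurement(753, 0, 0, 0, gates)  # about 12m33s
after = Measurement(616, 0, 0, 0, gates)   # about 10m16s
print(decide(before, after))  # keep
```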

## Change-impact Routing Goal [#change-impact-routing-goal]

The pipeline should eventually derive a compact run plan from changed paths and workflow inputs:

| Change class                          | Required work                                                                                                                                     |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| Docs/prose only                       | repo links, docs build/link check when published docs changed, Codebook docs/prose checks, deploy/live-link checks only on main docs deploy paths |
| GitHub workflow/tooling only          | `actionlint`/`shellcheck`, affected workflow dry-run or targeted job, plus docs checks if docs changed                                            |
| Rust crate-local change               | fmt, schema when config/schema inputs changed, cargo check/clippy, affected package tests, dependency/audit policy when lock/tooling changed      |
| Runtime/launch/capsule handoff change | Rust gates plus nextest Docker E2E lane because `dind_e2e` covers real Docker/runtime/capsule behavior                                            |
| Construct image change                | construct image build/publish path plus Docker E2E using the rebuilt construct artifact                                                           |
| Preview/release build logic change    | preview/release archive build rehearsal, signing/SBOM/attestation checks, publish mutation only after CI gates                                    |
| Unknown or cross-cutting change       | broader safe set, then refine classifiers once the run shows which work was actually needed                                                       |
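
A minimal sketch of how such a run plan could be derived from changed paths follows. The path prefixes, class names, and work labels are illustrative assumptions: the `docs/` prefix in particular is hypothetical, and the real classifier lives in workflow path filters rather than Python.

```python
# Ordered (class, prefixes) rules; the first match wins.
RULES = [
    ("docs", ("docs/",)),                      # hypothetical docs location
    ("workflow", (".github/workflows/",)),
    ("construct", ("docker/construct/",)),
    ("rust", ("crates/", "Cargo.toml", "Cargo.lock")),
]

# Simplified work labels per class, loosely following the table above.
WORK = {
    "docs": {"repo-links", "docs-build"},
    "workflow": {"actionlint", "shellcheck"},
    "construct": {"construct-image-build", "docker-e2e"},
    "rust": {"fmt", "check", "clippy", "package-tests"},
}
BROAD_SAFE_SET = set().union(*WORK.values())


def classify_paths(paths):
    """Map changed paths to change classes; unmatched paths are 'unknown'."""
    classes = set()
    for path in paths:
        for name, prefixes in RULES:
            if path.startswith(prefixes):
                classes.add(name)
                break
        else:
            classes.add("unknown")
    return classes


def run_plan(paths):
    """Union of required work, falling back to the broad safe set."""
    classes = classify_paths(paths)
    if "unknown" in classes:
        return set(BROAD_SAFE_SET)
    plan = set()
    for name in classes:
        plan |= WORK[name]
    return plan


print(sorted(run_plan(["docs/guide.mdx"])))  # ['docs-build', 'repo-links']
```

An unmatched path such as a new top-level script selects the broad safe set, which mirrors the last table row: run more when unsure, then tighten the rules from the measured result.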

## Evidence From `4c8b94bd` [#evidence-from-4c8b94bd]

The commit merged a large instant-launch change set: workflow rewrites, Rust runtime/image/launch code, docs, and `docker/construct` inputs. That breadth intentionally fired every mainline path filter. All runs below were successful.

| Workflow                   | Run                                                                                | Trigger                   | Wall time | Long pole                                                                                                                                                         |
| -------------------------- | ---------------------------------------------------------------------------------- | ------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `CI`                       | [`27937532691`](https://github.com/jackin-project/jackin/actions/runs/27937532691) | `push` to `main`          | 12m 34s   | `cargo nextest prepare` 4m 33s, `cargo build validator` 6m 00s, then Docker E2E waited for the full package matrix and ran 3m 07s                                 |
| `Docs`                     | [`27937532567`](https://github.com/jackin-project/jackin/actions/runs/27937532567) | `push` to `main`          | 5m 06s    | cold `codebook-lsp` installs in `spell-check-docs` and `spell-check-source` at about 2.5m each, docs link/build path 2m 26s, deploy live-link verification 1m 21s |
| `Construct Image`          | [`27937532647`](https://github.com/jackin-project/jackin/actions/runs/27937532647) | `push` to `main`          | 4m 12s    | arm64 image publish 2m 59s, amd64 image publish 1m 53s, manifest publish 39s                                                                                      |
| `Publish Homebrew Preview` | [`27938150807`](https://github.com/jackin-project/jackin/actions/runs/27938150807) | `workflow_run` after `CI` | 8m 31s    | four release-profile `cargo zigbuild` jobs at 6m 47s to 7m 26s, then publish 47s                                                                                  |
| `Renovate`                 | [`27937532547`](https://github.com/jackin-project/jackin/actions/runs/27937532547) | `push` to `main`          | 3m 40s    | self-hosted Renovate 3m 30s; not part of branch protection but consumes Actions capacity after every main push                                                    |
| `Renovate Validate`        | [`27937532557`](https://github.com/jackin-project/jackin/actions/runs/27937532557) | `push` to `main`          | 11s       | no meaningful speed issue                                                                                                                                         |

Within `CI`, the fastest checks already return early: `changes` 5s, `actionlint` 13s, `fmt` 24s, `schema-check` 33s. The slow path is structural. The old `docker-e2e` path depended on the reusable `test` workflow as a whole, so it started only after every package test job completed, even though it could run in parallel with most package tests once the required construct image was available. This branch moved Docker E2E into the reusable nextest workflow and then removed the serialized nextest `prepare` gate after measurement showed cold cache misses made it the new critical path.

The inherited CI matrix split item also called out the real Docker boundary directly: <RepoFile path="crates/jackin/tests/dind_e2e.rs">crates/jackin/tests/dind\_e2e.rs</RepoFile> is now 1323 LOC and exercises real `docker run`, PTY, runtime launch, and `jackin-capsule` handoff behavior. That should stay a named `docker-e2e` failure surface instead of being buried inside a general package-test lane. It should still belong to the nextest test system, though: the current command is already `cargo nextest run -p jackin --features e2e --profile docker-e2e`, so the better architecture is a dedicated nextest Docker lane inside the reusable nextest workflow, not an unrelated top-level CI job.

## Current Speedups Already Present [#current-speedups-already-present]

* Path filters in <RepoFile path=".github/workflows/ci.yml">.github/workflows/ci.yml</RepoFile>, <RepoFile path=".github/workflows/docs.yml">.github/workflows/docs.yml</RepoFile>, <RepoFile path=".github/workflows/construct.yml">.github/workflows/construct.yml</RepoFile>, and <RepoFile path=".github/workflows/preview.yml">.github/workflows/preview.yml</RepoFile> prevent unrelated workflows from doing full work.
* PR workflow concurrency cancels stale PR runs while preserving non-cancelled release serialization.
* Rust jobs restore `~/.rustup`, mise-managed tool caches, and `Swatinem/rust-cache` target/cargo caches; the reusable nextest workflow centralizes the expensive test-binary build and shares it with package jobs.
* Construct builds use Buildx with registry cache for published main builds and GitHub Actions cache scopes for PR/rehearsal builds.
* Construct builds stage the pinned `shellfirm` binary before Docker Buildx runs, so the construct Dockerfile no longer carries a Rust toolchain or from-source `shellfirm` compile stage.
* Docs jobs cache Bun's download cache and lychee's link cache, and they separate repo-link checks from full site build/link checks.
* `CARGO_INCREMENTAL=0` is already set on the main compile-heavy CI/preview/release paths, which is compatible with compiler-output caching via `sccache`.

## Findings [#findings]

### Two co-critical lanes set the floor, not one [#two-co-critical-lanes-set-the-floor-not-one]

The analyzed `CI` run was gated by two independent lanes that each took roughly twelve minutes, so speeding up only one of them would have left the workflow's wall clock unchanged. The old lint lane ran `check-all-features` and then `clippy` and `check-default`, which were serialized because both of the latter declared `needs: check-all-features`. The old test lane ran nextest `prepare`, then the per-package matrix, then `docker-e2e`, which waited for the entire reusable `test` workflow.

```
changes
 ├─ check-all-features ──┬─ clippy              (old lint lane, ~12m)
 │                       └─ check-default
 └─ test: prepare ─ packages(19) ─ docker-e2e    (old test lane, ~12m)
```
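
The diagram implies that workflow wall clock is the maximum over parallel lanes, not their sum. A minimal sketch of that reasoning follows; the roughly twelve-minute lane totals come from the analyzed run, while the reduced values are hypothetical.

```python
def wall_clock_minutes(lanes: dict) -> float:
    """Parallel lanes finish when the slowest lane finishes."""
    return max(lanes.values())


old = {"lint": 12.0, "test": 12.0}      # both lanes at roughly 12 minutes
lint_only = {**old, "lint": 4.0}        # hypothetical: only lint gets faster
both = {"lint": 4.0, "test": 8.0}       # hypothetical: both lanes get faster

print(wall_clock_minutes(old))        # 12.0
print(wall_clock_minutes(lint_only))  # 12.0
print(wall_clock_minutes(both))       # 8.0
```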

This branch cuts both lanes: the lint lane no longer runs the redundant `check-all-features` job, Docker E2E starts after `prepare`, and the package matrix is bucketed into named heavy crates plus one small-crate bucket. The measured GitHub-hosted PR run `27961783994` completed `ci-required` in about 10m16s, down from about 12m33s on the analyzed `main` run `27937532691`. A later artifact-handoff experiment barely improved full workflow wall time and made individual fan-out jobs worse: run `27975505075` still took about 13m53s overall and spent up to 4m09s downloading the prepared nextest workspace. Removing that handoff and relying on shared cache restore produced run `27976780904`, which completed `ci-required` in about 7m51s. A cold-ish follow-up on the rewritten branch, run `27981536710`, exposed the remaining cache truth: the new Cargo registry key was cold, many parallel jobs fetched the same crates and then raced to save the same cache, and `cargo nextest prepare` recorded `sccache` 0% hits with 934 misses. The immediate sequential run `27982280463` proved the registry/target caches do help once warm -- `cargo check default` fell from 2m44s to 42s, `clippy` from 2m57s to 57s, MSRV from 2m28s to 42s, and nextest prepare from 5m04s to 2m16s -- but the hot log still showed `sccache` 0% hits in `check-default` and nextest prepare because the compiler-cache namespaces were split by job. Runs `27982918582` and `27983590600` proved that simply sharing the namespace is still not enough: the same hosted-runner commit stayed in the same band or got worse (`nextest prepare` 2m16s then 2m36s), `sccache` remained 0% hits, and the logs showed cache write errors on every cacheable Rust compile. 
Run `27984408161` proved the first run-unique write-key fix was still too broad: `ci-required` stayed green, dependency downloads were still gone, but nextest prepare worsened to 2m52s and `check-default`/nextest prepare still showed 0% hits plus write errors because multiple jobs wrote the same run key. Run `27985175824` proved per-job write keys still did not make hosted-runner GHA `sccache` useful: dependency downloads stayed gone, but `sccache` remained at 0% hits with write errors in `check-default` and nextest prepare. The useful win in that sequence was Docker layer reuse, where construct E2E image build fell from 3m59s to 1m34s on the hot shared-cache run and stayed near 2m00s after the key change. Runs `27986624424` and `27987160427` then proved that removing the wrapper entirely was slower even with cache hits; run `27987545345` restored the wrapper with `SCCACHE_GHA_ENABLED=off` and returned `check-default` to 57s and nextest prepare to 2m19s. The branch now rejects the hosted-runner GHA compiler-cache backend but keeps wrapper-compatible Cargo target fingerprints on GitHub-hosted jobs; Velnor keeps local-disk `sccache` because that backend can stack with warm `target/`. The lint lane itself is no longer the long pole: in the old run, `cargo check all features` started at 07:47:20Z and the dependent `clippy`/`check-default` pair finished at 07:51:16Z, roughly 3m56s later; in run `27961783994`, `clippy` and `check-default` both started at 14:53:44Z and the slower one finished at 14:54:35Z, roughly 51s later. The remaining measured critical path is now construct E2E image build, nextest prepare, and Docker E2E.
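
The lint-lane spans quoted above can be recomputed from the job start and finish times. The sketch assumes both timestamps fall on the same UTC day; the helper names are hypothetical.

```python
from datetime import datetime

_FMT = "%H:%M:%SZ"


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two same-day UTC clock times such as '07:47:20Z'."""
    delta = datetime.strptime(end, _FMT) - datetime.strptime(start, _FMT)
    return int(delta.total_seconds())


def fmt_duration(seconds: int) -> str:
    """Render seconds in the roadmap's '3m56s' / '51s' style."""
    minutes, rest = divmod(seconds, 60)
    return f"{minutes}m{rest:02d}s" if minutes else f"{rest}s"


old_lane = elapsed_seconds("07:47:20Z", "07:51:16Z")
new_lane = elapsed_seconds("14:53:44Z", "14:54:35Z")
print(fmt_duration(old_lane))  # 3m56s
print(fmt_duration(new_lane))  # 51s
```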

### Lint and check jobs serialize behind a cache they do not share [#lint-and-check-jobs-serialize-behind-a-cache-they-do-not-share]

In the old <RepoFile path=".github/workflows/ci.yml">.github/workflows/ci.yml</RepoFile> shape, `clippy` and `check-default` both declared `needs: check-all-features`, but every one of the three jobs set its `Swatinem/rust-cache` `shared-key` to its own `github.job`, so they never shared a `target/` directory. The dependency bought nothing on the green path: `clippy` waited for `check-all-features` to finish and then recompiled the workspace from its own cold cache, adding a full compile to the lint lane for no reuse. Because `cargo clippy --workspace --all-targets --all-features` performs the full type and borrow check before linting, `clippy` is a strict superset of `check-all-features` -- anything that compiles under clippy compiles under check, and check cannot catch a compile error clippy would miss. This branch removes the redundant `check-all-features` job entirely, leaving `clippy` as the all-features compile gate and `check-default` as the default-feature compile gate.
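
The de-serialized lint lane can be sketched as two independent jobs. This is a minimal illustration, not the repository's workflow: the action tags, the runner label, and the `check-default` command line are assumptions; only the job names, the `github.job` cache key, and the clippy invocation come from the text above.

```yaml
# Sketch only: action tags, runner label, and the check-default flags are assumptions.
jobs:
  clippy:
    runs-on: ubuntu-latest
    # No `needs:` on another compile job: clippy is itself the all-features compile gate.
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          # Per-job key, as in the old shape; a `needs:` edge therefore buys no reuse.
          shared-key: ${{ github.job }}
      - run: cargo clippy --workspace --all-targets --all-features

  check-default:
    runs-on: ubuntu-latest
    # Starts at the same time as clippy instead of queueing behind a third job.
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          shared-key: ${{ github.job }}
      - run: cargo check --workspace --all-targets
```

With no dependency edge between the jobs, the lane's wall clock is the slower of the two jobs rather than their sum.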

### Critical-path sequencing beats more matrix fan-out [#critical-path-sequencing-beats-more-matrix-fan-out]

The prior "split the monolithic check job" work is partly implemented now: fmt, schema, check, clippy, dependency policy, audit, nextest package tests, Docker E2E, fuzz, bench build, and MSRV are separate jobs. The remaining opportunity is dependency shape. `docker-e2e` used to wait for all package tests because the caller saw the reusable `test` workflow as one dependency. This branch moved Docker E2E into <RepoFile path=".github/workflows/rust-nextest.yml">.github/workflows/rust-nextest.yml</RepoFile> as a first-class nextest lane, and then removed the `prepare` dependency when run `27999009032` proved the serial cache-seeding job was slower than direct parallel shard restore on cache misses.

The remaining matrix work should preserve the original attribution goal as well as speed: lint/check, deterministic package tests, integration-heavy runtime tests, Docker E2E, capsule, MSRV, validator, dependency policy, and audit should stay distinguishable in GitHub checks even if the implementation moves to nextest archives or partitions. A local archive experiment on this branch built a `jackin-capsule` nextest archive successfully, but running from that archive failed an existing test that expects checkout-local context. A same-run prepared-target artifact experiment was also rejected after measurement: artifact download and extraction were slower than restoring the semantic cache and letting the small amount of remaining local work run. Archive/partition work should resume only after checkout-local tests are archive-safe and after an equivalent run proves lower wall clock than the cache-only matrix.

There was a second, larger opportunity in the same lane: the 19-package matrix was mostly fixed overhead. The branch first used `prepare` to build every test binary with `cargo nextest run --workspace --no-run --all-features`, but later measurement showed the same job becomes a wall-clock regression when its target cache misses. This branch adopts the middle ground from Phase 2: named jobs for the heavy crates plus one bucket job for the small crates, all restoring the same shared cache directly. Docker E2E rebuilds `jackin-capsule` from the warm restored cache before setting `JACKIN_CAPSULE_BIN`; this cost about 6s in run `27976780904`, far cheaper than the rejected multi-minute prepared-workspace handoff.
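
The capsule rebuild in the Docker E2E lane might look like the step below. It is a sketch under assumptions: the `target/debug` output path and the exact build flags are not confirmed by this document; only the package name and the `JACKIN_CAPSULE_BIN` variable are.

```yaml
# Sketch only: the binary path and cargo flags are assumptions.
steps:
  - name: Rebuild jackin-capsule from the restored target cache
    run: |
      # With a warm target/ this is mostly a fingerprint check, not a full compile.
      cargo build -p jackin-capsule --locked
      chmod +x target/debug/jackin-capsule
      # Export the path for later steps in the same job.
      echo "JACKIN_CAPSULE_BIN=$GITHUB_WORKSPACE/target/debug/jackin-capsule" >> "$GITHUB_ENV"
```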

### Main-to-preview is serialized too late [#main-to-preview-is-serialized-too-late]

Preview builds used to wait until the whole `CI` workflow completed because <RepoFile path=".github/workflows/preview.yml">.github/workflows/preview.yml</RepoFile> triggered from `workflow_run`. On the analyzed commit, that meant about 12.5 minutes of idle time before a 7.5-minute build matrix began. This branch changes that shape to "build early, publish after CI": preview artifact builds start on the `push` event in parallel with CI, while the release mutation and Homebrew tap update wait for a successful CI conclusion on the matching source SHA.

The stronger structural option is to remove the cross-workflow handoff entirely by building the preview archives inside the `CI` DAG and gating a final publish job on the full needs set plus push-to-main. That eliminates the inter-workflow event latency, shares one checkout and one warm cache, and produces a single required check. This branch removes the duplicate `CI` `build-validator` job, so preview now owns the release-profile `jackin` archive build that packages `jackin-role`; the remaining consolidation question is whether that preview build should move into the `CI` DAG.
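
If the preview build were folded into the `CI` DAG, the final publish gate could be sketched as follows. Every job id, the artifact naming pattern, and the action tag here are hypothetical; the sketch only illustrates gating on the full needs set plus push-to-main.

```yaml
# Sketch only: job ids and artifact names are hypothetical.
jobs:
  publish-preview:
    # Runs only after every listed job succeeded in this same workflow run.
    needs: [ci-required, preview-archives]
    # Pull requests still build and test, but never reach the mutation step.
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: preview-*
          merge-multiple: true
      - run: echo "publish the rolling preview here"
```

Because the gate lives in the same run as the build, there is no cross-workflow event to wait for and only one required check to configure.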

### Release and preview should share one build implementation [#release-and-preview-should-share-one-build-implementation]

Preview builds all targets from Linux with `cargo-zigbuild` plus a cached macOS SDK. The release workflow used to keep macOS runners for macOS targets and a separate install sequence, which made cache behavior harder to reason about and made preview a weaker rehearsal. This branch moves release archive builds onto the same Linux `cargo-zigbuild` shape and shared target-scoped archive cache keys; the remaining durable cleanup is one composite/reusable "build signed archive" path used by preview and release, with release adding only tag/version and publishing gates.

The concrete direction was to adopt preview's `cargo-zigbuild`-from-Linux path for every target, including macOS, and to drop the `macos-latest` runners from the release critical path; preview already proves the cross-compile works with a cached SDK, and macOS runners carry roughly a tenfold cost multiplier and slower start-up. This branch implements that path and leaves a single native-macOS build-and-test in the scheduled hygiene lane for parity, so a macOS-specific regression is still caught off the critical path. The preview and release archive jobs also now share `Swatinem/rust-cache` keys by archive target instead of splitting warm caches by workflow job name (Phase 7).
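
A macOS archive build from a Linux runner might be sketched like this. The target triple, the package name, and the SDK unpack location are assumptions for illustration; `cargo-zigbuild` reads the macOS SDK location from `SDKROOT`.

```yaml
# Sketch only: the target, package, and SDK path are assumptions.
steps:
  - name: Cross-compile a macOS archive from Linux
    env:
      # Assumed location where the cached macOS SDK archive was unpacked.
      SDKROOT: ${{ runner.temp }}/macos-sdk
    run: |
      rustup target add aarch64-apple-darwin
      cargo zigbuild --release --locked --target aarch64-apple-darwin -p jackin
```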

### Tool installation cache misses matter on cold revisions [#tool-installation-cache-misses-matter-on-cold-revisions]

The inspected logs show warm `rustup` caches, but cold <RepoFile path="mise.toml">mise.toml</RepoFile> changes caused cargo-installed tools to build from source: `cargo-audit`, `cargo-deny`, `cargo-shear`, and `codebook-lsp` each paid cold-install cost in at least one job. The `mise-action` internal cache missed too, because workflows pinned the action SHA but not the mise binary version input. Some manual caches covered only `~/.local/share/mise/installs/cargo-*`, so non-cargo tools such as `zig`, `cosign`, and `syft` relied on the action's internal cache. This was functionally correct, but noisy for a speed-critical pipeline.

Two concrete reductions follow. First, every `cargo:` entry in <RepoFile path="mise.toml">mise.toml</RepoFile> is compiled from source by mise's cargo backend; the CI tool entries now use mise's GitHub release backend where upstream publishes reliable binaries: `cargo-nextest`, `cargo-deny`, `cargo-audit`, `cargo-hack`, `cargo-zigbuild`, `cargo-shear`, `cargo-fuzz`, and `codebook-lsp`. Second, the deeper dedup is a baked CI image: roughly fifteen `CI` jobs each repeat the mise install and rustup restore, so building a `jackin-ci` image (Debian plus the pinned toolchain and tools from the same <RepoFile path="mise.toml">mise.toml</RepoFile>, reusing the construct-image machinery) and running Rust jobs in `container:` would remove per-job tool setup wholesale. Phase 1 step 9 evaluated that image and deferred it for the hot GitHub-hosted path.

### Docker layer caching is good, and cache-mount risk is lower now [#docker-layer-caching-is-good-and-cache-mount-risk-is-lower-now]

Docker Buildx registry and GHA caches are already used. Docker's docs note that `cache-to mode=max` exports more layers than `mode=min`, which matches the current construct cache choice. Docker's GitHub Actions cache docs also note that BuildKit cache mounts are not preserved in the GHA cache by default. That mattered when <RepoFile path="docker/construct/Dockerfile">docker/construct/Dockerfile</RepoFile> compiled `shellfirm` from source with cargo registry/git and `/sccache-build` cache mounts; this branch removes that stage by staging `shellfirm` before Buildx, so the remaining Docker speed work is regular layer-cache behavior, not cache-mount preservation for a Rust compile.
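
A minimal Buildx step with the GHA layer cache in `mode=max` might look like the sketch below. The `scope` name, the build context, and the action tags are assumptions; the Dockerfile path is the one named above.

```yaml
# Sketch only: scope, context, and action tags are assumptions.
steps:
  - uses: docker/setup-buildx-action@v3
  - uses: docker/build-push-action@v6
    with:
      context: .
      file: docker/construct/Dockerfile
      push: false
      cache-from: type=gha,scope=construct
      # mode=max also exports intermediate-stage layers, not only the final image layers.
      cache-to: type=gha,scope=construct,mode=max
```

Only image layers travel through this cache; the contents of `RUN --mount=type=cache` directories are not exported, which is why a from-source compile inside the Dockerfile could not stay warm this way.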

### `sccache` should be adopted, with stats proving each lane [#sccache-should-be-adopted-with-stats-proving-each-lane]

Mozilla `sccache` caches compiler outputs through `RUSTC_WRAPPER` and supports local plus remote/GHA-backed storage. Its Rust guidance requires incremental compilation to be disabled for cacheability, which already matches the project's compile-heavy jobs. This branch measured hosted-runner GHA `sccache` and rejected it after runs `27984408161` and `27985175824` showed 0% hits plus write errors in `check-default` and nextest prepare. Follow-up runs `27986624424` and `27987160427` showed the opposite trap: removing `RUSTC_WRAPPER=sccache` from GitHub-hosted compile jobs removed backend errors but made Cargo target reuse slower even when every registry/tool/target cache restored successfully. GitHub-hosted jobs therefore keep the wrapper for target-cache/fingerprint compatibility while forcing `SCCACHE_GHA_ENABLED=off`; Velnor keeps local `sccache` because persistent local disk is the backend shape that can actually win.
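
The hosted-runner compromise described above comes down to three environment variables. The sketch assumes they are set at job or workflow level; where the repository actually sets them is not stated here.

```yaml
# Sketch only: placement at workflow level is an assumption.
env:
  # Keep the wrapper so Cargo fingerprints match the restored target cache.
  RUSTC_WRAPPER: sccache
  # Disable the GitHub Actions cache backend that produced 0% hits and write errors.
  # Quoted so the literal string is passed however a YAML parser treats bare `off`.
  SCCACHE_GHA_ENABLED: "off"
  # Incremental compilation stays off for cacheability and deterministic target caches.
  CARGO_INCREMENTAL: "0"
```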

The GitHub Actions cache backend for `sccache` is not part of the hosted baseline anymore: narrow pilots showed 0% hits and write errors, and the broader fan-out risk remains the same throttling class as Docker GHA cache. Prefer a backend that does not rate-limit -- local disk on a persistent runner, or S3/Redis -- which is also why `sccache` pairs best with the warm-runner lane, where an on-disk `target/` already outperforms a remote compiler cache.

### Incremental compilation is a targeted experiment, not a universal switch [#incremental-compilation-is-a-targeted-experiment-not-a-universal-switch]

Cargo's profile docs say incremental compilation stores reusable state in `target`, only applies to workspace/path dependencies, and can be overridden with `CARGO_INCREMENTAL`; dev/test defaults enable it, release defaults disable it. In this repo the CI jobs force it off to keep caches deterministic and sccache-compatible. Re-enabling it may help same-branch PR reruns if target caches are retained, but it increases cache size and conflicts with sccache. Treat it as an experiment for a narrow nextest lane, not for final release artifacts.

### Persistent warm runners are the only instant path for Rust changes [#persistent-warm-runners-are-the-only-instant-path-for-rust-changes]

A cold GitHub-hosted runner starts with an empty `target/` and recompiles the changed crate's dependency closure every run; the GitHub Actions cache restores a snapshot but still pays the download and extract of a multi-gigabyte archive plus any post-restore recompilation. A persistent runner that keeps `target/`, `~/.cargo`, and Docker layers warm between runs compiles only the crate that actually changed -- the difference between minutes and seconds, and a gap no hosted-runner cache strategy can close. The existing `velnor` lane (selectable through the `lanes` `workflow_dispatch` input) is the seed of this, but it must remain an explicit opt-in accelerator. GitHub-hosted runners stay the default and required path for jackin' because they provide the stable trust boundary and cold-run parity; Phase 8 is therefore about making Velnor complete enough for optional `lanes: both` verification, not about making it the default.

### Dual-runner parity is a hard constraint; velnor speedups ride on top [#dual-runner-parity-is-a-hard-constraint-velnor-speedups-ride-on-top]

Every lane must be runnable on both GitHub-hosted runners and the self-hosted `velnor` lane ([tailrocks/velnor](https://github.com/tailrocks/velnor)) when a maintainer explicitly selects `lanes: both`, and a GitHub-hosted run must stay the default and required parity gate -- a green warm-lane run has to imply a green cold-lane run, the same PR/main parity rule the repo already enforces. Velnor may carry heavy optimizations that hosted runners cannot (a warm `target/`, warm `~/.cargo`, warm Docker layers, a local-disk `sccache` backend), but only as an opt-in accelerator on top of a baseline that still passes on hosted runners. The rule for every speedup is therefore: when a capability is missing, improve it in `velnor` itself so the capability exists on that runner, then verify the change still runs on both lanes before it lands. Never fork the pipeline into velnor-only behavior that hosted runners cannot reproduce, and never drop a job to hosted-only just to avoid teaching velnor the capability -- both leave the two lanes out of parity.

The scaffolding already exists: `matrix-setup` emits a `configs` array and most compile and test jobs fan out with `runs-on: ${{ fromJSON(matrix.config.runner) }}`, so `lanes: both` already runs them on both. `construct-e2e-image`, preview archive builds, and release test/archive builds now use that same runner-lane matrix. Construct image artifacts are lane-scoped so Docker E2E consumes the image built by the same lane; preview and release archive artifacts are also lane-scoped so manual `lanes: both` rehearsals can require both lanes while mutation jobs download only the GitHub-hosted artifacts. The current gaps to close before Velnor is useful as an optional parity lane are concrete: `docker-e2e` and the construct build assume a working Docker daemon, so Velnor must provide one; the construct workflow builds arm64 natively on `ubuntu-24.04-arm`, so Velnor needs an arm path or that leg stays hosted-only; and the persistent state on Velnor -- the very warm directories that make it fast -- must be protected from cross-run poisoning. Each gap is a fix to make Velnor compatible, not a reason to make it default.
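
The lane scaffolding can be sketched as below. Only `matrix-setup`, `configs`, the `lanes` input, and the `fromJSON(matrix.config.runner)` expression come from the text; the input option names, the runner labels, and the JSON field layout are assumptions.

```yaml
# Sketch only: input options, runner labels, and config fields are assumptions.
on:
  push:
  workflow_dispatch:
    inputs:
      lanes:
        type: choice
        options: [github, velnor, both]
        default: github

jobs:
  matrix-setup:
    runs-on: ubuntu-latest
    outputs:
      configs: ${{ steps.lanes.outputs.configs }}
    steps:
      - id: lanes
        env:
          # Non-dispatch events have no inputs, so they fall back to the hosted lane.
          LANES: ${{ inputs.lanes || 'github' }}
        run: |
          # `runner` is itself a JSON-encoded label list, decoded later by fromJSON.
          github='{"lane":"github","runner":"[\"ubuntu-24.04\"]"}'
          velnor='{"lane":"velnor","runner":"[\"self-hosted\",\"velnor\"]"}'
          case "$LANES" in
            velnor) configs="[$velnor]" ;;
            both)   configs="[$github,$velnor]" ;;
            *)      configs="[$github]" ;;
          esac
          echo "configs=$configs" >> "$GITHUB_OUTPUT"

  clippy:
    needs: matrix-setup
    strategy:
      matrix:
        config: ${{ fromJSON(needs.matrix-setup.outputs.configs) }}
    runs-on: ${{ fromJSON(matrix.config.runner) }}
    steps:
      - uses: actions/checkout@v4
      - run: cargo clippy --workspace --all-targets --all-features
```

Because the lane is only a matrix dimension, the job body is identical on both runners, which is what keeps a green Velnor run comparable to a green hosted run.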

### Non-critical workflows compete for the runner pool on every push [#non-critical-workflows-compete-for-the-runner-pool-on-every-push]

Every push to `main` fires `CI`, `Construct Image`, `Docs`, and `Renovate` concurrently, then `Publish Homebrew Preview` after `CI`. `Renovate` took 3m30s on the analyzed run and is not a branch-protection check, yet it consumes Actions concurrency on every push; when the account's concurrent-runner pool is saturated, critical-path jobs queue behind work that does not gate anything. Moving `Renovate` to a cron schedule instead of a push trigger frees that capacity for the jobs that actually gate merge and publish. Queue time does not show up in per-job durations but is real in wall clock.
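
Moving `Renovate` off the push trigger could be as small as the trigger block below. The cadence is an arbitrary assumption, and the manual trigger is an optional convenience rather than something this document specifies.

```yaml
# Sketch only: the cron cadence is an assumption.
name: Renovate
on:
  schedule:
    # Every six hours, off the top of the hour to avoid the busiest scheduling slot.
    - cron: "17 */6 * * *"
  # Allows an on-demand run without reintroducing a push trigger.
  workflow_dispatch: {}
```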

### Path filters over-trigger on shared tool config [#path-filters-over-trigger-on-shared-tool-config]

The old `rust` path filter in <RepoFile path=".github/workflows/ci.yml">.github/workflows/ci.yml</RepoFile> included every <RepoFile path="mise.toml" /> change, so bumping a docs-only tool version fired the entire Rust CI surface. This branch removed the stale per-tool cargo-install cache keys and now routes <RepoFile path="mise.toml" /> through small classifiers: Rust CI turns on only when Rust-relevant mise entries (`zig` or the cargo tooling aliases) change, and preview turns on only when release-build tooling entries (`zig`, `cargo-zigbuild`, `cosign`, or `syft`) change.
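
A classifier of this kind could be sketched as a diff-and-grep step. The step id, output names, and regular expressions are hypothetical, and the real classifiers may parse tool entries more carefully; the tool names are the ones listed above.

```yaml
# Sketch only: step id, outputs, and patterns are hypothetical.
steps:
  - uses: actions/checkout@v4
    with:
      fetch-depth: 0
  - id: mise
    env:
      BASE: ${{ github.event.pull_request.base.sha || github.event.before }}
    run: |
      # Keep only added/removed lines of mise.toml, dropping the +++/--- file headers.
      changed=$(git diff -U0 "$BASE" HEAD -- mise.toml | grep -E '^[+-][^+-]' || true)
      rust=false
      preview=false
      # Rust CI: zig or any cargo tooling entry changed.
      if echo "$changed" | grep -Eq 'zig|cargo-'; then rust=true; fi
      # Preview: only release-build tooling entries changed.
      if echo "$changed" | grep -Eq 'zig|cargo-zigbuild|cosign|syft'; then preview=true; fi
      echo "rust=$rust" >> "$GITHUB_OUTPUT"
      echo "preview=$preview" >> "$GITHUB_OUTPUT"
```

A docs-only tool bump then leaves both outputs `false`, so neither the Rust surface nor the preview build is triggered by that change alone.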

## Implementation Phases [#implementation-phases]

The phases below are grouped by change class and risk; the numbers are identifiers, not a priority order. For wall-clock impact, work the highest-leverage cuts first regardless of phase number. The two co-critical \~12-minute lanes (lint and test) and the cross-workflow preview idle dominate the analyzed run, so they lead the recommended order:

| Rank | Work                                                                                                                  | Phase   | Estimated wall-clock effect                                                           | Risk                                        |
| ---- | --------------------------------------------------------------------------------------------------------------------- | ------- | ------------------------------------------------------------------------------------- | ------------------------------------------- |
| 1    | Done in this branch: delete the redundant `check-all-features` job and let `clippy` own the all-features compile gate | Phase 9 | removes \~one full workspace compile (\~6m) from the lint lane                        | low -- clippy is a strict superset of check |
| 2    | Collapse or bucket the 19-package nextest matrix                                                                      | Phase 2 | removes \~14-18 redundant cache restores and shrinks the tail that gates `docker-e2e` | medium -- benchmark attribution             |
| 3    | Overlap the preview build with CI, or fold it into the CI DAG                                                         | Phase 4 | removes \~12.5m of cross-workflow idle before publish                                 | medium -- publish gating must stay exact    |
| 4    | Warm persistent-runner fast lane for Rust changes                                                                     | Phase 8 | minutes to sub-minute for incremental Rust changes                                    | high -- fork/secret isolation               |
| 5    | Prebuilt cargo tools and a baked CI image                                                                             | Phase 1 | removes cold tool compiles and per-job setup                                          | low                                         |
| 6    | zigbuild every target, evict macOS runners from release                                                               | Phase 7 | drops slow, costly macOS runners off the release path                                 | medium                                      |
| 7    | Prebuild `shellfirm` as a per-version artifact                                                                        | Phase 5 | removes a from-source compile from every construct build                              | low-medium                                  |

Measurement (Phase 0) still comes first in practice: capture a lane's baseline before and after each change so the estimates above are replaced with evidence.

### Phase 0 -- Measurement and cache truth [#phase-0----measurement-and-cache-truth]

1. Done in this branch: add workflow timing/cache summaries to `CI`, `Docs`, `Construct Image`, `Publish Homebrew Preview`, `Release`, `jackin-dev`, `Hygiene`, `Renovate`, and `Renovate Validate`. The shared summary script writes workflow wall time, target-gate time, longest jobs, longest steps, lane aggregate job time, step category totals, GitHub Actions cache budget usage, cache hit/miss/restore markers from job logs, dependency-download markers, third-party compile/check/build markers, source-tool compile markers, nonzero `sccache` issue markers, low-utility `sccache` markers, prepared-workspace artifact markers, and links to the GitHub job records into the GitHub step summary.
2. Done in this branch: add short-retention `cargo --timings` artifacts for `clippy`, `check-default`, nextest prepare, preview builds, and release builds.
3. Done in this branch: compile-heavy jobs that use `sccache` emit `sccache --show-stats` to the step summary and upload raw stats artifacts. Low hit rate is measurement data only; correctness failures still fail the job.
4. Done in this branch: track the target metric explicitly. The shared timing summary reports time to first red signal and maps each workflow to its target completion point: CI required gate, Docs required gate, Construct required gate, preview publish, GitHub release publish, and final release pipeline completion.
5. Done in this branch: for every CI/CD proof run, the shared timing summary scans both normal GitHub logs and Velnor `job-log` artifacts for crates.io index updates, crate downloads, external-crate compile/check/build markers, source-tool compiles, cache misses, failed restores, nonzero `sccache` errors, low-utility `sccache` stats, and prepared-workspace downloads before claiming the run is optimized. The report is mandatory whether the run succeeds or fails, and it compares GitHub and Velnor per runner, per job, and per important step: wall time from workflow start, job runtime, cache restore/save time, dependency download markers, dependency compile markers, source-tool compile markers, `sccache` stats, artifact upload/download time, Docker time, and the long pole. Any marker must be acted on in the same iteration when a cache, routing, or runner change can remove it; otherwise record why it is an unavoidable first-run, changed-input, cache-budget miss, or accepted hosted-runner compiler-cache limitation. The target is fastest wall clock, so a cache change that removes compilation but adds more restore/download time is a failed optimization until measurements prove otherwise. Treat GitHub's finite Actions cache budget as part of the design: prefer shared dependency keys that can restore from `main` into PR branches, avoid multiple caches containing the same registry/target data, and record cache usage/eviction pressure whenever a run shows unexpected downloads or cold restores. Keep a running ledger of techniques tried, accepted, and rejected so future iterations prove that Velnor is actually faster where claimed and that the GitHub-hosted default has been pushed to the best practical performance.
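
Step 2's timing artifacts can be sketched with Cargo's built-in `--timings` flag, which writes an HTML report under `target/cargo-timings/`. The artifact name, retention period, and the compile command are assumptions.

```yaml
# Sketch only: artifact name, retention, and the cargo command are assumptions.
steps:
  - run: cargo check --workspace --all-targets --timings
  - uses: actions/upload-artifact@v4
    # Upload even when the compile fails, so slow red runs are still measurable.
    if: always()
    with:
      name: cargo-timings-${{ github.job }}
      path: target/cargo-timings/
      retention-days: 3
```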

### Phase 1 -- No-risk cache hygiene [#phase-1----no-risk-cache-hygiene]

1. Done in this branch: every `jdx/mise-action` call in workflows and composite actions now pins `version: "2026.6.11"` so the action's internal mise cache is reproducible instead of fetching the latest mise release on every cold cache.
2. Done in this branch: rely on `jdx/mise-action`'s cache with the stable `mise-v2` prefix for mise-managed tools, while keeping only targeted manual caches where measurement showed they own different data (`~/.bun/install/cache`, Lychee's link cache, Rust toolchain, Cargo registry, Rust target caches, construct Buildx, shellfirm, RustSec advisory DB, and macOS SDK archives). This avoids broad duplicate tool caches that would compete with the 10GB GitHub Actions cache budget.
3. Done in this branch: audited cargo-installed tools and moved reliable prebuilt tools through mise's GitHub backend: `cargo-nextest`, `cargo-deny`, `cargo-audit`, `cargo-hack`, `cargo-zigbuild`, `cargo-shear`, `cargo-fuzz`, `codebook-lsp`, `sccache`, and `shellfirm`. CI now avoids `cargo install` source compiles for those tools.
4. Done in this branch: docs jobs use `bun ci` for locked installs, keep the Bun download cache, and do not cache `node_modules` because the measured warm path is already small and deterministic.
5. Done in this branch: keep GitHub cache keys intentionally broad enough to restore from `main`, matching GitHub's documented branch/default-branch search order; add a scheduled cache-size review, because GitHub cache storage can become read-only when repository cache budgets are exhausted; and delete closed-PR merge-ref caches so branch-scoped caches do not evict the default-branch warm set.
6. Done in this branch: move the CI cargo tools that ship reliable release binaries off mise's source-compiling cargo backend and onto mise's GitHub release backend: `cargo-nextest`, `cargo-deny`, `cargo-audit`, `cargo-hack`, `cargo-zigbuild`, `cargo-shear`, `cargo-fuzz`, and `codebook-lsp`.
7. Done in this branch: narrow shared <RepoFile path="mise.toml" /> routing so docs-only tool bumps do not fire the Rust surface, release-tool bumps do fire preview, and `jdx/mise-action`'s stable cache key replaces per-tool source-install caches.
8. Done in this branch: remove `rust-cache` target restores from the policy/audit lanes because `cargo shear`, `cargo deny`, and `cargo audit` inspect metadata/indexes and do not reuse workspace build artifacts. Keep the shared Cargo registry cache for those lanes so lockfile/index data still restores from the same dependency key, keep only `~/.cargo/advisory-db` in the audit-specific cache to avoid restoring the registry twice, run `cargo audit` with `--no-yanked` plus `--no-fetch --stale` on advisory-cache hits so hot PR runs avoid both crates.io index updates and RustSec refetches, and make the shared registry cache own fuzz lockfiles so the fuzz lane can run with `CARGO_NET_OFFLINE=true`. The registry cache population step explicitly disables Cargo offline mode, because cold/missing cache population is the one accepted path that must download dependencies before later jobs can prove offline operation.
9. Evaluated in this branch and deferred for the hot GitHub-hosted path: run `27996372045` showed tool setup at 102s aggregate across 40 steps, with the largest per-job setup step at 5s, while cache restore, Cargo/test work, and Docker work were much larger. A baked `jackin-ci` image could still be revisited for cold-start or self-hosted fleet ergonomics, but the current hot-path bottleneck is cache restore plus real build/test/Docker execution, not mise/rustup setup. Do not add a containerized CI path until a focused benchmark proves it beats the current GitHub-hosted baseline without weakening lane parity.
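
Steps 1 and 2 amount to two inputs on each `jdx/mise-action` call. In this sketch the action tag is a placeholder, since the workflows pin a commit SHA that is not reproduced here; the version and prefix values are the ones stated above.

```yaml
# Sketch only: the @v2 tag stands in for the pinned commit SHA.
steps:
  - uses: jdx/mise-action@v2
    with:
      # Pin the mise binary so the action's cache key does not drift with new releases.
      version: "2026.6.11"
      # Stable prefix shared by every workflow and composite action.
      cache_key_prefix: mise-v2
```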

### Phase 2 -- Make Docker E2E a nextest-owned lane [#phase-2----make-docker-e2e-a-nextest-owned-lane]

1. Done in this branch: move the top-level `docker-e2e` job from <RepoFile path=".github/workflows/ci.yml">.github/workflows/ci.yml</RepoFile> into <RepoFile path=".github/workflows/rust-nextest.yml">.github/workflows/rust-nextest.yml</RepoFile> as a dedicated `docker-e2e` job that runs beside the package shards instead of waiting for every package job.
2. Done in this branch: keep Docker E2E out of the generic package matrix. It is a named nextest-owned `Docker E2E smoke` job, uses the `docker-e2e` profile, keeps Docker daemon access, downloads the construct-image artifact when needed, rebuilds `jackin-capsule` from the warm cache, and points `JACKIN_CAPSULE_BIN` at that binary.
3. Done in this branch: pass construct-image state into the reusable nextest workflow so the Docker lane downloads and loads `construct-trixie-image` only when the construct image changed; otherwise it uses the published image path the tests already expect.
4. Rejected after measurement: splitting `prepare` into the real first dependency for both package tests and Docker E2E made cache ownership simple, but run `27999009032` showed a missed target cache turns it into a 5m05s serial gate. Deterministic package tests and real Docker E2E now start directly from the same semantic workspace cache so target-cache misses compile in parallel instead of blocking fan-out.
5. Evaluated in this implementation branch: `cargo nextest archive` plus archive-mode package filtering cannot safely replace the package matrix yet. `cargo nextest archive --timings --archive-file target/nextest-archives-local/jackin-capsule.tar.zst -p jackin-capsule --all-features --color=always --locked` succeeded locally, proving the build/archive path. Running that archive with `cargo nextest run --archive-file target/nextest-archives-local/jackin-capsule.tar.zst -E 'package(=jackin-capsule)' --no-tests=pass --color=always` failed `jackin-capsule::daemon::tests::command_stdout_trimmed_returns_trimmed_stdout`, because archive extraction does not provide the same checkout-local context as the normal package job.
6. Done in this branch: preserve the current middle-ground package matrix instead of replacing it with archive/partition fan-out. The current shape already keeps heavy-crate attribution, avoids most of the old 19-way cache restore overhead, and preserves the checkout/workspace assumptions used by the existing test suite. Revisit archive/partition only after tests that depend on repository-local context are made archive-safe.
7. Done in this branch for the adopted middle-ground matrix: keep `jackin-capsule` visible as its own package lane while grouping low-cost crates into `small-crates`. If the package matrix is later replaced by partitions, keep a dedicated capsule partition or named report entry so capsule regressions remain obvious.
8. Done in this branch: carry the old path-filter intent forward for Docker E2E. The `docker_e2e` route now triggers the real Docker lane for `docker/**`, Docker runtime assets, launch/runtime code, capsule handoff code, and <RepoFile path="crates/jackin/tests/dind_e2e.rs">crates/jackin/tests/dind_e2e.rs</RepoFile>, while unrelated Rust package changes still receive deterministic tests without forcing Docker E2E.
9. Done in this branch: collapse the 19-package matrix to the middle-ground shape from this roadmap -- named jobs for `jackin`, `jackin-capsule`, `jackin-tui`, and `jackin-runtime`, plus one `small-crates` bucket. This keeps per-crate attribution for the heavy crates while removing most redundant target-cache restores.
10. Reverted after measurement in this implementation branch: uploading and downloading the prepared workspace made the fan-out slower. Docker E2E now rebuilds `jackin-capsule` from the restored warm cache, marks it executable, and exports `JACKIN_CAPSULE_BIN`; run `27976780904` measured that rebuild at about 6s while the rejected handoff path spent up to 4m09s downloading the prepared workspace.
11. Rejected after measurement: removing the explicit <RepoFile path=".github/actions/cache-cargo-registry/action.yml">Cargo registry cache</RepoFile> restore from nextest package and Docker E2E fan-out jobs made run `27991767807` fail under `--offline` immediately after the shared `rust-cache` restore. The restore costs a few seconds per fan-out job on warm runs, but it is currently the correctness guard that keeps package tests and Docker E2E from downloading dependencies. Any future attempt to remove it needs an equivalent offline proof in every fan-out lane before landing.
12. Done in this branch: making nextest `prepare` a cold-cache seeding job instead of a mandatory hot-path workspace rebuild helped hot exact-cache runs, but it still serialized fan-out on cache misses. Run `27997272508` confirmed the hot-cache skip (`prepare` 53s, build step skipped), run `27998091712` found Docker E2E local workspace compiles after a generic cache hit, run `27998435131` rejected a Docker-specific key while the serialized `prepare` gate still existed, and run `27999009032` rejected that remaining serialized gate because a missed shared key made `prepare` spend 5m05s compiling before package and Docker lanes could start. The branch removes the `prepare` job and lets shards restore caches directly. Direct-fanout warm runs then showed that one generic package cache is not exact enough for every package or Docker graph, so package shards now get `ci-all-features-dev-workspace-v2-package-<group>` and Docker E2E gets `ci-docker-e2e-dev-workspace-v1`; each cache is GitHub-hosted-lane owned so one narrow graph cannot seed another shard's target archive. Warm runs `28039373389` and `28040274004` accepted the shape: both had zero dependency downloads, zero third-party compile/check/build markers, zero source-tool compiles, and zero prepared-workspace restores, while keeping the required GitHub lane green.
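
The adopted package shape from steps 9 and 12 can be sketched as a five-entry matrix with one cache per group. The job id, the filter expressions, and the use of `shared-key` to carry the cache name are assumptions; the group names and the `ci-all-features-dev-workspace-v2-package-<group>` pattern come from the steps above.

```yaml
# Sketch only: job id, filters, and how the cache name is wired are assumptions.
jobs:
  package-tests:
    name: nextest (${{ matrix.group }})
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - group: jackin
            filter: "package(=jackin)"
          - group: jackin-capsule
            filter: "package(=jackin-capsule)"
          - group: jackin-tui
            filter: "package(=jackin-tui)"
          - group: jackin-runtime
            filter: "package(=jackin-runtime)"
          # Everything that is not a named heavy crate lands in one bucket.
          - group: small-crates
            filter: "not (package(=jackin) | package(=jackin-capsule) | package(=jackin-tui) | package(=jackin-runtime))"
    steps:
      - uses: actions/checkout@v4
      - uses: Swatinem/rust-cache@v2
        with:
          # One cache per group, so a narrow graph never seeds another shard's archive.
          shared-key: ci-all-features-dev-workspace-v2-package-${{ matrix.group }}
      - run: cargo nextest run --workspace --all-features --locked -E '${{ matrix.filter }}'
```

Heavy crates keep their own check name for attribution, while the bucket pays the checkout and cache-restore overhead once for all small crates.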

### Phase 3 -- Host Rust compiler cache adoption [#phase-3----host-rust-compiler-cache-adoption]

1. Done in this branch: add pinned `sccache` installation through mise's GitHub release backend, avoiding `cargo install sccache` in CI because compiling the cache tool on cold runners defeats the purpose.
2. Rejected after measurement: the GitHub Actions cache backend for `sccache` produced 0% hits and write errors in repeated hosted-runner runs, so hosted jobs no longer enable it. Keep `CARGO_INCREMENTAL=0` in compile-heavy lanes for deterministic target caches and for Velnor local `sccache`.
3. Done in this branch: keep local `sccache` available only on the opt-in Velnor lane, where the backend is persistent disk instead of the hosted GHA cache service.
4. Done in this branch: compile-heavy jobs that use `RUSTC_WRAPPER=sccache` emit `sccache --show-stats` to the step summary and upload the raw stats as short-retention artifacts on both GitHub and Velnor lanes. GitHub-hosted stats are diagnostic with `SCCACHE_GHA_ENABLED=off`; Velnor stats prove the local-disk accelerator. Treat low hit rate as data, not a failure; fail only on build correctness.
5. Corrected after measurement: keep `RUSTC_WRAPPER=sccache` in GitHub-hosted compile-heavy CI and nextest jobs for target-cache compatibility, but disable the hosted-runner GHA compiler-cache backend with `SCCACHE_GHA_ENABLED=off`. The wrapper is part of the Cargo fingerprint shape; the rejected piece is the hosted GHA backend, not the wrapper itself.
6. Rejected after measurement: broad read namespaces, job-specific run-unique write keys, and `SCCACHE_BASEDIRS` still produced 0% hits and write errors on hosted GitHub runs. The durable rule is now narrower: no hosted-runner GHA `sccache` backend; keep wrapper-compatible target caches on GitHub and use local-disk `sccache` as the opt-in Velnor accelerator.
7. Rejected in this branch: hosted-runner GHA `sccache` stays out of the GitHub default unless a future upstream/backend change is proven faster than `rust-cache` plus registry caching. Pilot any future compiler-cache work on Velnor local disk, S3, or Redis first.
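
The lane rule that steps 2 through 6 converge on can be summarized as an environment mapping. The sketch below is a reading of those steps, not the workflow's actual configuration: the function name and the lane labels are hypothetical, and `SCCACHE_DIR=/var/cache/sccache` is the value Phase 8 describes Velnor injecting itself, so a real workflow may never set it.

```python
def compiler_cache_env(lane: str) -> dict:
    """Return the compiler-cache environment assumed for one runner lane."""
    # Both lanes keep the wrapper (it is part of the Cargo fingerprint shape)
    # and disable incremental compilation.
    env = {"RUSTC_WRAPPER": "sccache", "CARGO_INCREMENTAL": "0"}
    if lane == "github":
        # Hosted runners: wrapper stays, the GHA cache backend is off.
        env["SCCACHE_GHA_ENABLED"] = "off"
    elif lane == "velnor":
        # Opt-in lane: host-persistent local disk cache (assumed to be
        # injected by the runner rather than set in the workflow).
        env["SCCACHE_DIR"] = "/var/cache/sccache"
    else:
        raise ValueError(f"unknown lane: {lane!r}")
    return env
```

The useful property is that `RUSTC_WRAPPER` is identical in both lanes, so target caches stay wrapper-compatible while only the backend differs.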

### Phase 4 -- Main push and preview pipeline overlap [#phase-4----main-push-and-preview-pipeline-overlap]

1. Done in this branch: start preview archive builds on `push` to `main` in parallel with `CI`, keyed by source SHA.
2. Done in this branch: gate only the mutation steps -- rolling preview release update and Homebrew tap update -- on a positive CI conclusion for that exact SHA.
3. Done in this branch: remove the duplicate Linux `jackin-role` validator builds from `CI`. Preview already builds and packages `jackin-role` in the signed `jackin` archives for every release target, and `publish-preview` still waits for both those archive jobs and the matching successful `CI` run before mutating the rolling preview.
4. Done in this branch: keep preview source-path filtering; docs-only pushes should not publish preview binaries.
5. Done in this branch: add a final SHA ancestry check before publishing, as the preview workflow already does, so a stale or superseded build cannot update the rolling preview.
6. Evaluated in this branch and deferred: folding preview into the `CI` DAG is no longer needed for this PR's main speed win. The duplicate `build-validator` job is gone, preview builds start on `push` and publish only after successful exact-SHA CI, and the separate workflow keeps publish mutation isolated. Revisit one-DAG preview only if future timing summaries show that cross-workflow polling latency or cache separation is again a measured long pole.
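
The publish gate in steps 2 and 5 combines an exact-SHA CI result with an ancestry check. A minimal sketch follows, assuming the check means "the currently published preview commit is an ancestor of the candidate"; the preview workflow may check a different direction or ref. The function names are hypothetical. `git merge-base --is-ancestor` is a real git command that exits 0 when the first commit is an ancestor of the second and 1 when it is not.

```python
import subprocess


def is_ancestor(older: str, newer: str, run=subprocess.run) -> bool:
    """True if commit `older` is an ancestor of commit `newer`."""
    result = run(["git", "merge-base", "--is-ancestor", older, newer], check=False)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit code is a git error (for example, an unknown commit).
    raise RuntimeError(f"git merge-base failed with exit code {result.returncode}")


def may_publish(ci_conclusion: str, ci_sha: str, build_sha: str,
                published_sha: str, ancestor_check=is_ancestor) -> bool:
    """Allow the mutation steps only for a green CI run on the exact build SHA
    whose commit descends from what is already published (assumed direction)."""
    if ci_conclusion != "success" or ci_sha != build_sha:
        return False
    return ancestor_check(published_sha, build_sha)
```

Keeping the archive build outside this gate is what lets it overlap with `CI`; only the release and tap mutations wait on `may_publish`.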

### Phase 5 -- Docker and construct-image speedups [#phase-5----docker-and-construct-image-speedups]

1. Done in this branch: add the BuildKit GHA cache `ghtoken`/`repository` parameters to manual Buildx GHA cache refs in CI Docker E2E and construct PR builds, reducing GHA cache API throttling risk without switching to another action.
2. Done in this branch: the cache-mount experiment for the construct Dockerfile's old cargo registry/git and `/sccache-build` mounts is superseded. `shellfirm` is now staged before Docker Buildx, so the Dockerfile no longer has a Rust toolchain, `security-tools` stage, or cargo cache mounts to preserve.
3. Done in this branch: keep the registry cache as the primary main-branch warm source and GHA cache as the PR iteration cache. Docker's registry cache backend is the better long-lived multi-stage cache; GHA cache is convenient but eviction- and rate-limit-prone.
4. Done in this branch: keep per-platform cache refs/scopes. Multi-platform cache writes to one mutable ref are easy to race; the current `buildcache-amd64` and `buildcache-arm64` shape is the right default.
5. Done in this branch: track the `shellfirm` prebuilt-binary TODO for arm64 in <RepoFile path="TODO.md">TODO.md</RepoFile>. Avoiding the compile inside the construct image is already achieved by staging the pinned binary before Docker Buildx; the TODO remains only for replacing the CI-built arm64 binary with a direct upstream release-asset download once upstream publishes one.
6. Done in this branch: until a prebuilt `shellfirm` exists for every target architecture, construct CI restores or builds the pinned `SHELLFIRM_VERSION` once per runner architecture through the GitHub Actions cache, stages it at `docker/construct/prebuilt/shellfirm`, and lets <RepoFile path="docker/construct/Dockerfile">docker/construct/Dockerfile</RepoFile> `COPY` it in. This removes the Dockerfile's `cargo install shellfirm` stage and sidesteps the BuildKit cache-mount-persistence question in step 2 for this stage. The upstream arm64 prebuilt TODO in step 5 remains open for eventually replacing the CI-built binary with a direct release-asset download.
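
Steps 1, 3, and 4 together imply a per-architecture set of Buildx cache flags. The sketch below shows one plausible shape and is not the workflow's actual flags. The image name, the event labels, and the choice to reuse `buildcache-<arch>` as both registry tag and GHA scope are assumptions. `type=registry`, `type=gha`, `scope`, `mode`, `ghtoken`, and `repository` are documented BuildKit cache parameters.

```python
def cache_flags(arch: str, event: str, image: str, repository: str, ghtoken: str) -> list:
    """Build `docker buildx build` cache arguments for one architecture."""
    if arch not in ("amd64", "arm64"):
        raise ValueError(f"unsupported arch: {arch!r}")
    ref = f"{image}:buildcache-{arch}"   # one mutable ref per platform
    scope = f"buildcache-{arch}"         # one GHA scope per platform
    registry_from = f"type=registry,ref={ref}"
    if event == "push-main":
        # Main branch: the registry cache is the long-lived warm source.
        return ["--cache-from", registry_from,
                "--cache-to", f"type=registry,ref={ref},mode=max"]
    if event == "pull-request":
        # PRs: read the main-branch registry cache, iterate on the GHA cache.
        gha = f"type=gha,scope={scope},ghtoken={ghtoken},repository={repository}"
        return ["--cache-from", registry_from,
                "--cache-from", gha,
                "--cache-to", f"{gha},mode=max"]
    raise ValueError(f"unknown event: {event!r}")
```

A real workflow would pass the token from a secret rather than a literal, and a manual Buildx invocation still needs the Actions runtime cache credentials exposed to the builder, which this sketch leaves out.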

### Phase 6 -- Docs and prose checks [#phase-6----docs-and-prose-checks]

1. Done in this branch: keep `repo-link-check` on every non-schedule event; it is cheap and catches source renames that path filters would otherwise miss.
2. Done in this branch: keep full docs build plus lychee for docs changes. The analyzed docs run spent 39s building and 87s checking built links, which is acceptable for the coverage it provides.
3. Done in this branch: make Codebook warm through the prebuilt `codebook-lsp` mise GitHub backend, remove the old Rust toolchain/Cargo registry cache restores from docs spell-check jobs, and rely on the Docs timing summary to measure the warm result. If warm Codebook remains expensive, split into a fast changed-file PR pass plus a full scheduled/main pass, with `docs-required` still requiring the full pass when docs/prose actually changed.
4. Done in this branch: avoid caching `node_modules` unless measured. Bun's docs recommend `bun ci`/`--frozen-lockfile` for reproducible CI; the current Bun download-cache path keeps installs deterministic and small.

### Phase 7 -- Release workflow speedups [#phase-7----release-workflow-speedups]

1. Done in this branch: share the preview build implementation with tagged release builds through <RepoFile path=".github/actions/build-release-archive/action.yml">.github/actions/build-release-archive/action.yml</RepoFile>, so preview and release use the same `cargo-zigbuild`, package, SHA256, signing/SBOM/attestation, timings, and artifact-upload contract while keeping their separate version names and publish gates.
2. Done in this branch: start release artifact builds in parallel with the release test workflow, then gate `gh release create`, signing publication, and Homebrew stable formula mutation on tests plus all build jobs passing. This preserves release safety while avoiding idle build machines.
3. Done in this branch: keep final release artifacts on deterministic settings. Archive build jobs keep `CARGO_INCREMENTAL=0`, pinned toolchain, pinned SDK/tool versions, and unchanged SBOM/signing/attestation, while the shared archive action now normalizes tar entry order, mtime, owner/group metadata, numeric owners, and gzip header names.
4. Done in this branch: extend `sccache` to release archive builds using the same release-profile archive action and short-retention stats artifacts as preview, while keeping deterministic release settings and `Swatinem/rust-cache` in place.
5. Done in this branch: standardize on `cargo-zigbuild`-from-Linux for every release target, including macOS, and drop the `macos-latest` runners from the release critical path. Preview already cross-compiles macOS this way with a cached SDK, so this also collapses preview and release onto one build implementation. Keep a single native-macOS build-and-test in the scheduled hygiene lane for parity, so a macOS-specific link or runtime regression is still caught -- just not on the release critical path. Unify the `Swatinem/rust-cache` `shared-key` across the preview and release build jobs so one target triple warms a single cache, instead of `build-preview-<target>` and `build-<target>` keeping separate caches that never share.
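
The normalization in step 3 (entry order, mtime, owner/group metadata, numeric owners, gzip header names) can be illustrated with the Python standard library. This is an analogue for explanation only; the shared archive action presumably uses its own tooling, and the function names here are hypothetical.

```python
import gzip
import io
import os
import tarfile


def _normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
    """Strip metadata that differs between runners or checkouts."""
    info.mtime = 0
    info.uid = info.gid = 0          # numeric owners only
    info.uname = info.gname = ""     # no user or group names
    return info


def build_archive(root: str) -> bytes:
    """Return a .tar.gz of `root` whose bytes depend only on paths,
    file modes, and file contents."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            entries.append(os.path.relpath(os.path.join(dirpath, name), root))
    buf = io.BytesIO()
    # filename="" keeps the original name out of the gzip header;
    # mtime=0 zeroes the header timestamp.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
            for rel in sorted(entries):  # stable entry order
                tar.add(os.path.join(root, rel), arcname=rel,
                        recursive=False, filter=_normalize)
    return buf.getvalue()
```

Two builds of the same tree then produce identical bytes even if file timestamps differ, which is what makes the archive SHA256 comparable across preview and release runs.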

### Phase 8 -- Warm persistent-runner fast lane (highest-leverage Rust speedup) [#phase-8----warm-persistent-runner-fast-lane-highest-leverage-rust-speedup]

1. Done in this branch: keep GitHub-hosted runners as the trust boundary for public/fork PRs and as the required parity lane.
2. Done in this branch: keep the existing `velnor` lane opt-in through `workflow_dispatch` and `lanes: both`; never make it the default for jackin' CI. The win is still structural: a warm `target/`, warm `~/.cargo`, and warm Docker layers mean an incremental Rust change recompiles only the edited crate, which no hosted-runner cache strategy can match.
3. Done upstream in Velnor: opt-in persistent `target/` stores are now scoped under `_velnor_targets/<trust-scope>/<repo>/<workflow>/<job-bucket>`, so warm build state cannot cross trust scope, repository, workflow, or job-class boundaries. Velnor job containers now receive daemon-level CPU and memory caps, and commit `2013110` adds runtime trust-scope enforcement. `trusted` daemons keep the full warm-runner capability set; non-trusted scopes reject jobs that carry user/repository secrets such as `secrets.*`, and job/action containers in those scopes do not receive the shared host Docker socket. Operators still keep distinct `VELNOR_TRUST_SCOPE` values plus runner labels/groups for trusted and untrusted lanes, but the runner now enforces the boundary instead of relying only on deployment discipline.
4. Done in this branch: keep a required GitHub-hosted parity job so a green warm-lane run still proves a green cold-lane run, satisfying the PR/main parity rule.
5. Done in this branch: pair `sccache` with the lane backend. GitHub-hosted jobs do not enable a compiler-cache backend because the measured GHA backend stayed at 0% hits with write errors; Velnor jobs enable local `sccache` so the injected `SCCACHE_DIR=/var/cache/sccache` can use the host-persistent cache and stack with warm `target/`.
6. Done in this branch: treat this as the largest single wall-clock lever in the roadmap, gated entirely on the security model in step 3; sequence it after the cheap host-CI cuts (Phase 9, Phase 2) so the hosted lanes are already fast while the persistent lane is hardened.
7. Done in this branch: hold every change in this roadmap to dual-runner parity capability. Each change passes on a GitHub-hosted run by default and on a `velnor` run when explicitly exercised with `lanes: both`, with the GitHub-hosted run remaining the required check. Velnor-only turbo (warm `target/`, local `sccache`, warm Docker layers) is allowed only as an opt-in accelerator over that shared baseline.
8. Done in this branch for construct and Docker lane-awareness: <RepoFile path=".github/workflows/construct.yml">.github/workflows/construct.yml</RepoFile> now has the same `lanes` dispatch selector, GitHub remains the default and the only manifest-publish source, Velnor can rehearse the amd64 build path, and Docker E2E passes on both lanes through `lanes: both`. The native construct `arm64` leg remains GitHub-hosted until Velnor has an arm path. Continue improving `velnor` itself when a job needs a missing capability: add an arm path if native arm parity is required, keep extending warm-state hygiene, and keep comparing Docker E2E as a measured lane because it can be slower on Velnor even when dependency and cache markers are clean.
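
The target-store layout in step 3 can be pictured as a path built from four scoping components. The sketch below only illustrates why two jobs that differ in any component never share warm state. It is not Velnor's implementation: the function name, the component validation, and the example values are assumptions.

```python
from pathlib import PurePosixPath


def target_store(trust_scope: str, repo: str, workflow: str, job_bucket: str,
                 root: str = "_velnor_targets") -> PurePosixPath:
    """Return the scoped store path for one job class (hypothetical helper)."""
    parts = [trust_scope, repo, workflow, job_bucket]
    for part in parts:
        # Assumed hygiene: a component must not be able to escape its level.
        if not part or part in (".", "..") or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(root, *parts)
```

For example, `target_store("trusted", "jackin", "ci", "nextest")` and `target_store("untrusted", "jackin", "ci", "nextest")` resolve to different directories, so a job in one trust scope cannot read build state written by the other.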

### Phase 9 -- Host lint/check critical-path de-serialization [#phase-9----host-lintcheck-critical-path-de-serialization]

1. Done in this branch: remove the `needs: check-all-features` edge from `clippy` and `check-default` in <RepoFile path=".github/workflows/ci.yml">.github/workflows/ci.yml</RepoFile> so compile-heavy jobs are not serialized behind a cache they do not share.
2. Done in this branch: delete `check-all-features` entirely. `cargo clippy --workspace --all-targets --all-features` runs the full type and borrow check before linting, so it is a strict superset of `cargo check --workspace --all-targets --all-features`; keeping `clippy` as the all-features gate and `check-default` as the default-feature gate preserves coverage while removing a redundant compile-heavy job and its cache.
3. Evaluated in this branch and deferred: the old fail-fast edge is not worth reintroducing on the green path. `fmt`, `actionlint`, schema checks, direct nextest shards, and `clippy` already provide early red signal without serializing all compile-heavy work. Add a single quick `cargo check` on one core crate only if future red-run measurements show wasted runner minutes that exceed the extra green-path latency.
4. Done in this branch: capture the lint-lane wall clock before and after per Phase 0. The old serialized lint lane took about 3m56s from `cargo check all features` start to `check-default` completion on `27937532691`; the de-serialized lane took about 51s from `clippy`/`check-default` start to the slower completion on `27961783994`. The full CI required gate moved from about 12m33s to about 10m16s, and the remaining critical path is construct E2E image build plus the direct nextest package/Docker fan-out.
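
The before/after figures in step 4 reduce to simple duration arithmetic. The helper below is a convenience sketch with hypothetical names. It parses the `NmSSs` notation used in this roadmap and reports the saving.

```python
import re


def seconds(text: str) -> int:
    """Parse durations written like '12m33s' or '51s' into seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if not match:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1) or 0) * 60 + int(match.group(2))


def fmt(total: int) -> str:
    """Format seconds back into the same notation."""
    minutes, secs = divmod(total, 60)
    return f"{minutes}m{secs:02d}s" if minutes else f"{secs}s"


def saving(before: str, after: str) -> tuple:
    """Return (formatted saving, percent of the original duration)."""
    delta = seconds(before) - seconds(after)
    return fmt(delta), round(100 * delta / seconds(before))
```

Applied to the measurements above, `saving("12m33s", "10m16s")` gives `("2m17s", 18)` for the full required gate, and `saving("3m56s", "51s")` gives `("3m05s", 78)` for the lint lane alone.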

## Guardrails [#guardrails]

* Keep one stable aggregator per workflow for branch protection, but allow separate early-signal jobs to finish before the full aggregator.
* Do not remove a check unless it is moved to an equal or stronger gate. "Faster" must not mean "main learns later that release cannot build" without an intentional publish gate.
* Keep preview and release publish steps hard-gated to `main` or tag/manual-release rules.
* Keep tool installation through mise or a first-party wrapper that is documented in <RepoFile path="mise.toml">mise.toml</RepoFile>; do not add ad hoc language setup actions to workflow files.
* Keep cache keys observable. Every new cache must have a documented owner, invalidation input, and expected fallback behavior.
* Set `timeout-minutes` on every job. This branch now applies explicit caps across CI, Docs, construct, preview, release, reusable nextest, Renovate, and scheduled hygiene jobs so a hung network call or wedged process cannot burn to the multi-hour default. A per-job timeout is a cost and safety cap, not a speed change.
* Do not gate the cheap deterministic jobs in front of the heavy ones to "fail first". `fmt`, `actionlint`, and `schema-check` already run in parallel and return in seconds; serializing the compile-heavy jobs behind them would add their latency to the green path for no green-path benefit. Fast-fail is a red-path optimization -- keep it off the happy path.
* Keep non-critical workflows off the per-push runner pool. Schedule `Renovate` rather than triggering it on every push so it cannot queue ahead of the jobs that gate merge and publish.
* Dual-runner parity capability is mandatory. Every job must stay runnable on both GitHub-hosted runners and the self-hosted `velnor` lane when a maintainer explicitly selects `lanes: both`, with the GitHub-hosted run as the default and required parity gate. Velnor may carry heavier optimizations than hosted runners can (warm caches, a local compiler cache), but only as an opt-in accelerator over a baseline that still passes on hosted runners. When Velnor cannot do something a job needs, improve Velnor itself and re-verify both lanes -- do not fork the pipeline or quietly drop the job to hosted-only.
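
The `timeout-minutes` guardrail lends itself to a mechanical check. The sketch below assumes the workflow file has already been parsed into a dictionary by whatever YAML loader is available. The function name is hypothetical, and skipping jobs that call a reusable workflow through `uses` assumes the cap lives on the called workflow's own jobs.

```python
def jobs_missing_timeout(workflow: dict) -> list:
    """Return the ids of jobs that have no explicit `timeout-minutes`."""
    missing = []
    for job_id, job in workflow.get("jobs", {}).items():
        if "uses" in job:
            # Caller of a reusable workflow: the timeout is expected on the
            # jobs inside the called workflow instead.
            continue
        if "timeout-minutes" not in job:
            missing.append(job_id)
    return sorted(missing)
```

Run against every parsed workflow in a policy job, a non-empty result would be the signal to fail, which keeps the guardrail from depending on reviewer attention.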

## Upstream Notes [#upstream-notes]

* GitHub Actions cache searches the current branch first, then restore-key prefixes, then the default branch, which is why broad restore keys can deliberately warm PR branches from `main`; cache storage can also become read-only when budgets/limits are exhausted. [GitHub dependency caching docs](https://docs.github.com/en/actions/reference/workflows-and-actions/dependency-caching)
* Docker Buildx supports `cache-to mode=max` to export more layers than `mode=min`, and Docker documents both registry and GitHub Actions cache backends. Docker also documents that BuildKit cache mounts are not preserved in GHA cache by default. [Docker cache backends](https://docs.docker.com/build/cache/backends/), [Docker GHA cache backend](https://docs.docker.com/build/cache/backends/gha/), [Docker cache mounts in Actions](https://docs.docker.com/build/ci/github-actions/cache/#cache-mounts)
* Mozilla `sccache` is a compiler wrapper cache with local and cloud/GHA-style storage backends; Rust usage is through `RUSTC_WRAPPER`, and Rust compiler caching requires incremental compilation to be disabled. [sccache](https://github.com/mozilla/sccache), [sccache action Rust notes](https://github.com/mozilla-actions/sccache-action)
* Bun documents `bun ci` as equivalent to `bun install --frozen-lockfile` for reproducible CI installs from committed `bun.lock`. [Bun install docs](https://bun.com/docs/pm/cli/install)
* mise's CI docs recommend pinned tool versions for reproducible CI environments, and `jdx/mise-action` supports install arguments and caching. [mise CI docs](https://mise.jdx.dev/continuous-integration.html), [jdx/mise-action](https://github.com/jdx/mise-action)
* mise's GitHub backend installs prebuilt release binaries, so CI tools that ship reliable GitHub releases need not be `cargo install`-compiled. [mise backends](https://mise.jdx.dev/dev-tools/backends/)
* `Swatinem/rust-cache` supports `shared-key`, `cache-workspace-crates`, and rust-environment hashing, which match the current direct nextest package/Docker fan-out design. [rust-cache README](https://github.com/Swatinem/rust-cache)
* cargo-nextest supports build archives and partitioning so a build can be reused while test execution is split across workers. [nextest archiving](https://nexte.st/docs/ci-features/archiving/), [nextest partitioning](https://nexte.st/docs/ci-features/partitioning/)
* Cargo's profile docs describe incremental compilation, `CARGO_INCREMENTAL`, and default codegen-unit differences; use those as the basis for any incremental-compilation experiment. [Cargo profiles](https://doc.rust-lang.org/cargo/reference/profiles.html)
* The self-hosted fast lane is powered by the `velnor` project; missing runner capabilities are added there so both lanes stay at parity rather than forking pipeline behavior. [tailrocks/velnor](https://github.com/tailrocks/velnor)

## Related Files [#related-files]

* <RepoFile path=".github/workflows/ci.yml">
    .github/workflows/ci.yml
  </RepoFile>
* <RepoFile path=".github/workflows/rust-nextest.yml">
    .github/workflows/rust-nextest.yml
  </RepoFile>
* <RepoFile path=".github/workflows/docs.yml">
    .github/workflows/docs.yml
  </RepoFile>
* <RepoFile path=".github/workflows/construct.yml">
    .github/workflows/construct.yml
  </RepoFile>
* <RepoFile path=".github/workflows/preview.yml">
    .github/workflows/preview.yml
  </RepoFile>
* <RepoFile path=".github/workflows/release.yml">
    .github/workflows/release.yml
  </RepoFile>
* <RepoFile path=".github/workflows/jackin-dev.yml">
    .github/workflows/jackin-dev.yml
  </RepoFile>
* <RepoFile path="docker/construct/Dockerfile">
    docker/construct/Dockerfile
  </RepoFile>
* <RepoFile path=".config/nextest.toml">
    .config/nextest.toml
  </RepoFile>
* <RepoFile path="crates/jackin/tests/dind_e2e.rs">
    crates/jackin/tests/dind_e2e.rs
  </RepoFile>
* <RepoFile path="crates/jackin-capsule/Cargo.toml">
    crates/jackin-capsule/Cargo.toml
  </RepoFile>

## Cross-references [#cross-references]

* [/reference/roadmap/rust-ci-tooling/](/reference/roadmap/rust-ci-tooling/) -- dependency hygiene, Codebook, coverage, and release-time tooling.
* [/reference/roadmap/workspace-registry-cache/](/reference/roadmap/workspace-registry-cache/) -- local pull-through Docker registry ideas for runtime workloads.
