jackin' Capsule: In-Container Control Plane

Status: Partially implemented — Phases 1–3 shipped, Phase 4 (host daemon integration and Desktop Agent Hub bridge) remains open. Phase 3 landed the jackin’ Capsule: an in-container PTY control plane built on the vt100 crate with a Zellij-style dirty-row renderer, a tmux-style prefix-key model (Ctrl+B opt-in via JACKIN_PREFIX, including prefix Ctrl+L clear-pane), a persistent PID 1 daemon that exits cleanly when the last session ends, binary tag+length attach framing on the hot path, single-client takeover, mode-state restore on focus swap, OSC passthrough (52/9/2/8) gated to the focused pane, top chrome with a brand pill and hover-lifted tab strip, and a bottom branch/PR context bar for non-default branches with hover-lifted click targets for GitHub context and container details.

The shipped-Phase-3 design and operator-visible behaviour now live in:

The remainder of this page is the original Phase 1–3 design rationale, kept for the design-history record. Some lower sections intentionally describe the problem statement and rejected intermediate designs in past-tense context. The canonical “what the Capsule does today” pages are the three links above.

Before the Capsule rewrite, the container supervisor was a bash wait loop (docker/runtime/supervisor.sh, deleted alongside this Phase) that kept the container alive while agent sessions ran via docker exec. It had two immediate limitations and one deeper architectural gap.

Last-session cleanup does not fire (resolved in Phase 1). When the last tmux session exited, the supervisor kept the container running. Because the container state was Running rather than Stopped, the host jackin cleanup path (finalize_foreground_session, container teardown, DinD/network/certs removal) did not fire automatically. The operator had to run jackin eject explicitly to clean up a container after all sessions had ended.

Session inventory requires shelling out. Querying which sessions are active requires docker exec <container> sh -c 'tmux list-sessions ...' from the host. There is no structured interface for the host CLI or the future jackin’ daemon to ask “what is running in this container right now?”

tmux is not designed for the container-per-agent model. tmux has no concept of agent state, no structured control plane, and no event stream. Every interaction requires docker exec tmux ... round-trips. jackin’ cannot ask “is this agent blocked or still working?” without reading raw terminal output from the outside. This is the root reason observability, attention prompts, and desktop app integration are hard to build on top of the current architecture. A purpose-built process inside the container — one that owns the session lifecycle from the start — can expose exactly the information jackin’ needs over a structured protocol.

Each Capsule-managed role container runs jackin-capsule as PID 1. It is the Capsule control plane: it manages PTY sessions directly (replacing tmux), tracks session state, renders the in-container multiplexer, and exposes a Unix socket API that the host CLI, the future jackin’ daemon, and the jackin’ desktop app can drive.

The container becomes a self-contained server. The operator’s terminal, the console TUI, and the desktop companion all talk to the same control plane:

  • Spawn a session (new agent or shell) — without shelling into the container
  • Kill a session — terminate a specific agent or shell
  • Query status — which sessions are running, which agent is blocked, which is done and waiting for review
  • Get session title — the current terminal title or process name for each session
  • Attach to a session — connect a PTY client to a running session
  • Subscribe to events — session started, session ended, agent state changed

This makes jackin-capsule the data source for all future observability features: attention prompts, live agent state in the console, the desktop app’s session panel, and the daemon’s reactive instance index.

The binary is named jackin-capsule and lives at /jackin/runtime/jackin-capsule inside every Capsule-managed role container. It replaces the deleted docker/runtime/supervisor.sh bash wait loop as PID 1.

Workspace member: crates/jackin-capsule/. Produces one Linux binary. The crate is independent from the main jackin crate — it shares protocol shapes through jackin-protocol, so the host and Capsule compile against the same control-channel request and reply types.

The binary is compiled in CI for the container target (linux/amd64, linux/arm64) and published to the jackin GitHub Releases alongside the main CLI binary. The version is always pinned to the same version as the jackin CLI that built the derived image — they are released together.

Before the derived image build, jackin resolves the jackin-capsule binary for the target architecture (JACKIN_CAPSULE_BIN override, cache hit, or GitHub Release download), writes it to the build context, and the derived Dockerfile installs it at /jackin/runtime/jackin-capsule. The download is cached per jackin version.

As PID 1, jackin-capsule must:

  1. Reap zombie children. Processes whose parent exits become children of PID 1. The binary must call waitpid(-1, WNOHANG) in a loop on SIGCHLD to reap them. tokio does not do this automatically.
  2. Forward signals cleanly. SIGTERM from docker stop → exit 0. SIGINT → exit 0. Both trigger the session-end path.
  3. Never crash on unexpected input. PID 1 death kills the entire container. All error paths either log and continue or exit deliberately.

The binary listens on a Unix domain socket at /jackin/run/jackin.sock inside the container. The host bind-mounts a per-instance socket directory:

docker run ... -v ~/.jackin/sockets/<container-name>:/jackin/run ...

The daemon creates jackin.sock inside that directory, so the host-side path is ~/.jackin/sockets/<container-name>/jackin.sock.

Protocol: The same socket carries two channel types selected by the first byte. The control channel uses 0x00, a 4-byte big-endian length prefix, and a JSON body for one-shot status and snapshot requests. The attach channel is persistent and uses tag-plus-length-prefix binary framing for hot-path PTY bytes. The two channels share the socket so the host needs only one mount. See “Wire protocol” under “Multiplexer architecture (Phase 3)” below for the binary frame layout.
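The control-channel framing described above can be sketched with the standard library alone. This is a minimal illustration, not the shipped jackin-protocol code: the function names and the MAX_BODY_LEN guard are assumptions, and the JSON body is treated as opaque bytes rather than parsed.

```rust
use std::io::{self, Read, Write};

/// First byte that selects the control channel, per the text above.
const CONTROL_CHANNEL: u8 = 0x00;

/// Upper bound on a control body. Hypothetical guard, not a documented limit.
const MAX_BODY_LEN: u32 = 1 << 20;

/// Write one control frame: selector byte, 4-byte big-endian length, JSON body.
pub fn write_control_frame<W: Write>(w: &mut W, json_body: &[u8]) -> io::Result<()> {
    if json_body.len() as u64 > MAX_BODY_LEN as u64 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "body too large"));
    }
    w.write_all(&[CONTROL_CHANNEL])?;
    w.write_all(&(json_body.len() as u32).to_be_bytes())?;
    w.write_all(json_body)
}

/// Read one control frame and return the JSON body bytes (left unparsed here).
pub fn read_control_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut selector = [0u8; 1];
    r.read_exact(&mut selector)?;
    if selector[0] != CONTROL_CHANNEL {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a control frame"));
    }
    let mut len_bytes = [0u8; 4];
    r.read_exact(&mut len_bytes)?;
    let len = u32::from_be_bytes(len_bytes);
    if len > MAX_BODY_LEN {
        // Refuse to allocate for an implausible length instead of trusting the peer.
        return Err(io::Error::new(io::ErrorKind::InvalidData, "body too large"));
    }
    let mut body = vec![0u8; len as usize];
    r.read_exact(&mut body)?;
    Ok(body)
}
```

Because both functions are generic over Read and Write, the same code would work against a std::os::unix::net::UnixStream or an in-memory std::io::Cursor in tests. Rejecting oversized lengths before allocating fits the PID 1 rule above that unexpected input must never crash the daemon.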

The original target API was:

| Method | Phase | Description |
| --- | --- | --- |
| status | 2 | List all sessions with name, agent type, created-at, and state |
| session.create | 3 | Spawn a new agent or shell session |
| session.kill | 3 | Terminate a session by ID |
| session.title | 3 | Read the current terminal title or process name for a session |
| session.attach | 3 | Return a PTY attachment handle so the client can connect |
| events | 3 | Upgrade connection to a streaming event channel |

Deferred event stream: session-started, session-ended, all-sessions-ended, agent-state-changed {session_id, state}. The shipped Phase 3 control channel exposes one-shot status and snapshot; streaming state belongs to Phase 4 daemon integration.

Each session tracks one of four states. These states are inferred from PTY output activity and foreground process state — no agent hooks or configuration required.

| State | Meaning |
| --- | --- |
| working | Output flowing or foreground process actively running |
| blocked | Silent for N seconds with a foreground process present — waiting for operator input |
| done | Work finished; the operator has not yet reviewed the output |
| idle | Reviewed or no work in progress |

The two-stage done / idle split is important: a done slot should not be automatically refilled by the autonomous task queue or cleaned up from the console until the operator has acknowledged the output. This distinction drives the “ready for review” indicator in the desktop app and the dispatch logic in future autonomous queue work.
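One way to read the four-state model is as a pure function over what the daemon can observe. The sketch below is an assumption-laden illustration: the Observation struct, its field names, and the blocked_after threshold (the "N seconds" in the table) are hypothetical, and the real inference may weigh more signals.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Working,
    Blocked,
    Done,
    Idle,
}

/// Hypothetical snapshot of what the daemon can observe about one session.
pub struct Observation {
    /// Time since the PTY last produced output.
    pub silent_for: Duration,
    /// Whether a foreground process is present in the session.
    pub foreground_present: bool,
    /// Whether work finished that the operator has not yet reviewed.
    pub unreviewed_work: bool,
}

/// Infer a state from PTY activity and foreground-process presence only.
/// `blocked_after` stands in for the "N seconds" silence threshold.
pub fn infer_state(obs: &Observation, blocked_after: Duration) -> SessionState {
    if obs.foreground_present {
        if obs.silent_for >= blocked_after {
            // A process is present but quiet: assume it waits for the operator.
            SessionState::Blocked
        } else {
            SessionState::Working
        }
    } else if obs.unreviewed_work {
        // Finished, but the operator has not acknowledged the output yet.
        SessionState::Done
    } else {
        SessionState::Idle
    }
}
```

Keeping unreviewed_work as an explicit input is what preserves the done / idle split: only an operator acknowledgement, not the passage of time, would flip it to false.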

Phase 2 (Rust binary, tmux still present): the binary watches the tmux server socket at /tmp/tmux-<uid>/default via inotify. Socket deleted → all sessions ended → exit 0. This replaces the 1-second bash polling loop from Phase 1.

Phase 3 (multiplexer phase): the binary owns session lifecycle directly. No tmux. The shipped daemon removes exited panes, keeps other sessions running, and exits cleanly when no live sessions remain so the host cleanup path can tear down the role container, DinD sidecar, cert volume, and network. An earlier Phase 3 design considered keeping exited tabs around until SIGTERM; that was dropped because it weakened the last-session cleanup contract.

  • No structured API — every tmux interaction from the host requires docker exec tmux ..., a subprocess round-trip with string output that must be parsed.
  • No agent-state awareness — tmux reports session names and windows; it has no concept of whether the process inside is blocked, working, or done.
  • No event stream — the host cannot subscribe to “a session ended” without polling.
  • tmux adds binary size to the image, startup overhead, and a dependency jackin’ does not control.

A purpose-built multiplexer in jackin-capsule gives jackin’ a clean control plane: structured socket, typed events, agent-state inference, and no external process to coordinate with.

This section defines the multiplexer architecture in detail. The first Phase 3 attempt is being replaced wholesale — every module is either rewritten or deleted — so this section is the spec the rewrite implements rather than a description of code that exists today.

The first attempt produced a working PID 1 binary that opens PTYs, draws a status bar, manages a binary pane tree, and frames messages over a Unix socket. The shape was right; the execution was not. Five categories of defect make the result unusable for modern TUI agents and force a ground-up rewrite rather than incremental fixes:

  1. Hand-rolled VT emulator without alt-screen support. crates/jackin-capsule/src/terminal.rs (deleted in the Phase 3 rewrite — replaced by session.rs) implements the vte::Perform trait but its csi_dispatch ignores the h and l actions and never inspects the ? intermediate, so every DEC private mode is silently dropped — including \x1b[?1049h (alternate screen), \x1b[?25h/l (cursor visibility), \x1b[?1 (application cursor keys), \x1b[?2004 (bracketed paste), and \x1b[?1000-1006 (mouse modes). Modern TUIs enter alt-screen, expect application cursor keys, and round-trip bracketed paste; without them the visible pane is at best garbled and at worst blank. Repairing the emulator would mean reimplementing most of vt100 or alacritty_terminal. A real VT state library is the right answer.
  2. Ctrl+J = 0x0A = line feed. crates/jackin-capsule/src/input.rs intercepts byte 0x0A and opens the command palette. That byte is identical to a literal newline; any LF in pasted text, in canonical-mode input, or from a TUI emitting \n opens the palette and is consumed. Operators effectively cannot send Enter to the agent on any terminal that doesn’t strictly send \r for the return key. The fix is a tmux-style prefix-key state machine, not a single hardcoded byte.
  3. Daemon exits when all sessions die. crates/jackin-capsule/src/daemon.rs calls std::process::exit(0) once sessions.values().all(|s| !s.alive) becomes true. Because the daemon is PID 1, that one call brings down every other process in the container and ends the container itself. An agent that crashes during the first second of launch — common while iterating on roles — looks to the operator like “jackin’ keeps exiting on me.” The daemon must persist until SIGTERM.
  4. Mouse passthrough is a literal no-op. The InputEvent::MousePress { .. } branch in crates/jackin-capsule/src/daemon.rs outside row 0 reads let _ = session; and returns None; the mouse event is discarded. The branch above it correctly handles row-0 (tab strip) clicks. The body for “anywhere else” was never written.
  5. Hot-path framing is full-screen redraw, base64, and JSON. Every PTY chunk runs compose_frame which rebuilds the entire rows × cols cell grid as ANSI, encodes the result with a hand-rolled base64 (also a violation of the “prefer libraries” rule in AGENTS.md), wraps the base64 in a JSON ServerMsg::Output, and length-prefixes the result. An 80×24 TUI redrawing at 60 Hz amounts to megabytes per second of structurally pointless re-encoding between two processes that share a kernel. The right shape is: send PTY bytes through almost unchanged on the hot path, and only run the full-screen compose path on tab switch, pane switch, or client reattach.

The same module also leaks several smaller defects (single-tab borders are always drawn, status-bar tabs append the state glyph after the click-region width is computed so click targets drift, the reattach path drops the outbound channel sender, the PTY writer task uses rt.block_on(input_rx.recv()) inside spawn_blocking, and the client never propagates SIGWINCH), but the five categories above are the load-bearing ones. The rewrite addresses all of them by changing the architecture, not by patching the existing modules.

Zellij is the closest architectural reference for the rewrite. It is MIT-licensed (we may study and adapt its design freely), it is written in Rust, and it solves the same problem at much larger scope. The pieces of zellij’s design jackin adopts:

  • Client-server split over a Unix domain socket. The server owns all PTYs, all VT state, all tabs/panes; the client owns user input and rendering. Multiple clients can attach to one server. Detach and reattach leave PTYs running.
  • Typed instruction bus. Inside the server, dedicated threads (PTY I/O, screen state, plugins) talk over MPSC channels carrying typed enums (PtyInstruction, ScreenInstruction). jackin’ server is single-threaded enough that the bus is overkill, but the typed-enum-over-channel shape is exactly what the existing tokio select! loop in crates/jackin-capsule/src/daemon.rs should converge on.
  • Per-pane VT state, replayed on switch. The server keeps a vt100::Screen for every pane, including non-visible ones. PTY output is fed to the screen regardless of focus; switching tabs replays the target pane from the saved screen rather than asking the application to redraw.
  • Binary protocol on the hot path. Zellij uses protobuf; jackin’ uses simpler tag-plus-length-prefix binary framing. The point is not protobuf — it is “don’t put base64-inside-JSON on the hot path.”

Zellij carries scope jackin explicitly does not need (multi-client collaboration, plugins via WASM, scrollback search, copy-mode regex, etc.). The rewrite cherry-picks the structural pieces and leaves the rest.

Concepts to borrow from herdr (license-safe restatement)

The Herdr study under “Prior art: Herdr” below lists the conceptual borrows for the agent-state model, status roll-up, notification suppression, sound escalation, blocking wait semantics, foreground-process ownership, screen heuristics, and semantic integration reports. The multiplexer rewrite adds three structural borrows on top of those — drawn from Herdr’s UI shape, not its code:

  • Top-of-screen chrome. The top chrome renders jackin' on the left followed by one tab per active jackin session. Active tab gets a distinct graphite background plus white underline/bold treatment; inactive tabs are dimmed; tab labels include the rolled-up state glyph. Operators sharing a screen recognise the brand pill at a glance without confusing it for the selected tab.
  • Empty initial state when no agent is preselected. When no initial agent argv is provided at daemon launch, the multiplexer can come up with the brand header, zero tabs, and a centred hint listing the agents from /jackin/run/agent.toml plus Shell. The operator picks one with the prefix key and the first tab spawns into that selection. When the host passes an initial agent argv, the daemon spawns the first tab with that agent automatically — matching the historical direct-into-agent UX.
  • Per-tab “most urgent” state roll-up. A tab containing any blocked pane is blocked; otherwise any done pane makes it done; otherwise working; otherwise idle. Same urgency order herdr uses. This drives the tab-strip glyph and feeds the future agent-state-changed event stream. Once a pane reaches blocked, the attention glyph stays visible until explicit operator keyboard input reaches that pane; incidental PTY output is not enough to clear it.

These are interface borrows. Herdr’s internal layout (workspaces, sidebar, pane-focus suppression rules, ghostty embedding, kitty graphics passthrough, plugin hooks) is not part of jackin’ scope and is not copied.
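The "most urgent" roll-up and the sticky attention glyph described above can be sketched as follows. The type and method names (PaneState, roll_up, AttentionLatch) are illustrative assumptions, not the names used in the Capsule source.

```rust
/// Variants are declared in ascending urgency so the derived `Ord`
/// matches the roll-up order: idle < working < done < blocked.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PaneState {
    Idle,
    Working,
    Done,
    Blocked,
}

/// A tab's state is the most urgent state among its panes.
/// A tab with no panes is treated as idle (an assumption of this sketch).
pub fn roll_up(panes: &[PaneState]) -> PaneState {
    panes.iter().copied().max().unwrap_or(PaneState::Idle)
}

/// Sticky attention flag: raised when a pane is observed blocked and
/// cleared only by explicit operator keyboard input to that pane.
#[derive(Debug, Default)]
pub struct AttentionLatch {
    raised: bool,
}

impl AttentionLatch {
    /// Record a freshly inferred pane state. Later non-blocked states
    /// (for example after incidental PTY output) do not clear the latch.
    pub fn observe(&mut self, state: PaneState) {
        if state == PaneState::Blocked {
            self.raised = true;
        }
    }

    /// Operator keyboard input reached the pane: clear the attention glyph.
    pub fn operator_input(&mut self) {
        self.raised = false;
    }

    pub fn is_raised(&self) -> bool {
        self.raised
    }
}
```

Encoding urgency in the variant order keeps the roll-up a single max() call. The latch is separate from the inferred state so that a burst of output can move a pane back to working without hiding the fact that it asked for attention.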

The model is deliberately a strict subset of tmux:

  • A session is one running jackin-capsule daemon. There is exactly one session per jackin instance and it lives for the lifetime of the container.
  • A tab is a named top-level container. Each tab has a label (the agent’s display name when first launched) and a pane tree. The tab strip in row 0 lists every tab in creation order; the operator switches with prefix bindings or by clicking.
  • A pane is one PTY plus the vt100::Screen that mirrors it. A tab starts as a single pane; splitting a pane creates a sibling. The pane tree is a binary tree of HSplit { left, right, ratio } / VSplit { top, bottom, ratio } / Leaf(pane_id). The existing tree in crates/jackin-capsule/src/layout.rs already has the right shape; the rewrite keeps it and fixes border-math edge cases (no border when a tab has a single leaf; bottom row math respects the status-bar offset).

The model has no window layer between tab and pane even though tmux does. Tmux’s window concept exists primarily because tmux predates the modern tabs-and-splits idiom; jackin’ starts from that idiom and does not need the historical layer. The model has no workspace layer above session even though herdr does. Workspaces only make sense when one server hosts many projects at once; jackin’ server hosts exactly one container.
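The pane tree shape named above (HSplit { left, right, ratio } / VSplit { top, bottom, ratio } / Leaf(pane_id)) lends itself to a short recursive layout pass. The following is a sketch under simplifying assumptions: Rect, layout, and split_len are hypothetical names, pane ids are plain u32 values, and borders and the status-bar offset are ignored apart from whatever origin the caller passes in.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rect {
    pub x: u16,
    pub y: u16,
    pub w: u16,
    pub h: u16,
}

pub enum Node {
    Leaf(u32),
    /// Side-by-side children; `ratio` is the left child's share of the width.
    HSplit { left: Box<Node>, right: Box<Node>, ratio: f32 },
    /// Stacked children; `ratio` is the top child's share of the height.
    VSplit { top: Box<Node>, bottom: Box<Node>, ratio: f32 },
}

/// Split `total` cells into two parts according to `ratio` (clamped to 0..=1).
fn split_len(total: u16, ratio: f32) -> (u16, u16) {
    let first = ((total as f32) * ratio.max(0.0).min(1.0)).round() as u16;
    let first = first.min(total);
    (first, total - first)
}

/// Walk the tree and collect one rectangle per leaf pane.
pub fn layout(node: &Node, area: Rect, out: &mut Vec<(u32, Rect)>) {
    match node {
        Node::Leaf(id) => out.push((*id, area)),
        Node::HSplit { left, right, ratio } => {
            let (lw, rw) = split_len(area.w, *ratio);
            layout(left, Rect { w: lw, ..area }, out);
            layout(right, Rect { x: area.x + lw, w: rw, ..area }, out);
        }
        Node::VSplit { top, bottom, ratio } => {
            let (th, bh) = split_len(area.h, *ratio);
            layout(top, Rect { h: th, ..area }, out);
            layout(bottom, Rect { y: area.y + th, h: bh, ..area }, out);
        }
    }
}
```

A single-leaf tab simply receives the whole area, which is the natural place to apply the "no border when a tab has a single leaf" rule. Passing an area whose y already accounts for the top chrome is one way to respect the status-bar offset.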

The hand-rolled emulator in crates/jackin-capsule/src/terminal.rs is deleted in the Phase 3 rewrite (its module role passes to session.rs), and the emulation itself is replaced by vt100, a maintained Rust crate explicitly designed for “tmux-like” use cases. It gives the rewrite all of the following for free:

  • Full DEC private mode handling — alt-screen, cursor visibility, application cursor keys, bracketed paste, mouse modes.
  • A Screen type with scrollback, attributes (fg, bg, bold, italic, underline, reverse, blink, strike, dim), and cursor state.
  • Screen::contents_formatted() and Screen::contents_diff() — emit ANSI for whole-screen origins. jackin’ studied them, but pane offsets, borders, dimming, selection, and overlays require a compositor-aware row serializer in crates/jackin-capsule/src/render.rs.
  • Screen::set_size(rows, cols) — correct resize semantics including alt-screen and reflow handling.

The crate’s MIT license is compatible with jackin-capsule’s Apache-2.0. The crate is well-maintained, brings no surprising transitive dependencies, and is the canonical choice in the Rust ecosystem for this job, which is exactly the pattern AGENTS.md’s “Prefer libraries over hand-rolled parsers” rule demands.

jackin’ temporarily pins a forked vt100 commit while upstreaming Screen::clear_scrollback() and CSI 3J support. That behavior belongs in vt100 because the crate owns the in-memory grid and scrollback state, exposes Screen::set_scrollback, and advertises itself for screen / tmux-style applications. jackin’ should not fake scrollback erasure in the compositor or use RIS (ESC c) because both approaches conflate terminal state with pane rendering.

alacritty_terminal is a heavier alternative offering scrollback search and selection state. jackin’ first rewrite does not need either; vt100 is the right scope.

Render model: dirty pane bodies and full-frame fallback

The shipped renderer is a Zellij-style dirty-output layer on top of vt100, not a raw PTY-byte passthrough. Directly forwarding active-pane bytes was rejected because pane content must be offset into an inner rectangle, clipped against borders, dimmed under dialogs, and kept consistent with scrollback and selection state. vt100::Screen::contents_diff was also rejected for the hot path because it assumes a whole-screen origin and cannot express jackin’ pane chrome rules without wrapping every operation.

Hot path — dirty pane body. When any pane PTY emits bytes, the server:

  1. Feeds the bytes into the pane’s vt100::Parser so the Screen stays current. This is cheap and unavoidable; it is what makes tab switches work later.
  2. Marks that pane’s body dirty and lets the render ticker coalesce bursts at roughly 30 fps.
  3. Compares the current visible rows to PaneBodyCache’s last snapshot and emits only changed rows, each with an explicit cursor move and style reset.
  4. Repaints that pane’s border and scrollbar so scrollback count, title, and focus color stay synchronized without repainting unrelated pane bodies.

This is the path that runs during active agent output. It is not a byte-copy, but it avoids the old broad-pane/full-frame redraw and keeps the output surface bounded by rows that actually changed.
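Step 3 of the hot path (compare against the last snapshot, emit only changed rows) can be illustrated with a small row cache. This is a simplified stand-in, not the real PaneBodyCache: the RowCache name is hypothetical, rows are assumed to be already serialized to strings and padded to the pane width, and border and scrollbar repaint are left out.

```rust
use std::fmt::Write;

/// Simplified stand-in for a per-pane row cache.
#[derive(Default)]
pub struct RowCache {
    rows: Vec<String>,
}

impl RowCache {
    /// Compare `current` visible rows with the last snapshot and return the
    /// ANSI output for changed rows plus the number of rows emitted.
    /// `origin_row` / `origin_col` are the 1-based host-terminal coordinates
    /// of the pane's inner top-left cell.
    pub fn diff(&mut self, current: &[String], origin_row: u16, origin_col: u16) -> (String, usize) {
        let mut out = String::new();
        let mut emitted = 0;
        for (i, row) in current.iter().enumerate() {
            if self.rows.get(i) != Some(row) {
                // CUP (cursor position), then SGR reset, then the row content.
                let _ = write!(
                    out,
                    "\x1b[{};{}H\x1b[0m{}",
                    origin_row as usize + i,
                    origin_col,
                    row
                );
                emitted += 1;
            }
        }
        // Remember what the client now shows for the next comparison.
        self.rows = current.to_vec();
        (out, emitted)
    }
}
```

The explicit cursor move per row is what lets the pane body sit at an arbitrary offset inside the host terminal, and the style reset keeps attributes from one row from bleeding into the next. A fresh cache reports every row as changed, which matches the idea that a cache miss degrades to a full repaint of that pane.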

Cold path — named full-frame redraw. When the operator switches tabs, switches panes, toggles zoom, opens the palette, splits a pane, attaches a fresh client, browses scrollback, paints selection, or hits a cache/overlay/style case where partial repaint is unsafe, the server runs the full compose path:

  1. Render the top chrome.
  2. For each visible pane in the active tab, serialize every visible row through PaneBodyCache::render_full.
  3. Render pane borders and scrollbars.
  4. Render dialog overlay if one is open.
  5. Position the host terminal cursor at the focused pane’s Screen::cursor_position() and honour Screen::hide_cursor().

Every full redraw carries a FullRedrawReason in crates/jackin-capsule/src/daemon.rs (first-attach, resize, tab-switch, layout-change, split-close, zoom-change, scrollback-movement, pane-clear, dialog-change, selection-repaint, pane-cache-miss, unsafe-partial, and related chrome/style reasons). With JACKIN_DEBUG=1, the renderer logs full vs partial, reason, dirty panes, emitted rows, emitted bytes, and duration in microseconds.

Input handling is a small state machine, not a hardcoded byte intercept:

  • Idle state — every byte from the client is forwarded to the focused pane’s PTY, except the configured prefix (Ctrl+B by default, matching tmux; configurable via env var or future config field).
  • Prefix-awaiting state — entered after the prefix byte arrives. The next key is consumed by the multiplexer and mapped to a command (see table below). If the next byte is the prefix again, a literal prefix byte is forwarded to the pane (Ctrl+B Ctrl+B → literal Ctrl+B). After one command key, state returns to Idle.
  • Palette state — entered by prefix + Space or prefix + :; renders the centred command palette; up/down/enter/escape consumed locally; everything else forwarded to the palette controller.
  • Dialog state — entered by palette actions that need a sub-pick (e.g. New session shows the agent picker). Same key handling as palette state.

Default prefix bindings (subject to refinement during implementation):

| Key after prefix | Action |
| --- | --- |
| c | New tab (opens agent picker) |
| n / p | Next / previous tab |
| 0–9 | Jump to tab N |
| " | Split current pane horizontally (top / bottom) |
| % | Split current pane vertically (side by side) |
| h j k l or arrow keys | Move pane focus |
| z | Toggle zoom on focused pane |
| x | Kill focused pane (collapses tab if last pane) |
| & | Kill focused tab |
| d | Detach client (PTYs keep running) |
| Space or : | Open command palette |
| Ctrl+L | Clear focused pane scrollback and send form-feed to request redraw |
| r | Force redraw |
| u | (Phase 4) Open usage and quota overlay for the focused pane |
| Prefix again | Literal prefix byte to PTY |

Ctrl+J is not a default binding. It is reserved as an opt-in alternate prefix that experienced operators may enable when they accept the 0x0A collision risk; the default never assigns it.
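The Idle and Prefix-awaiting states can be sketched as a byte-level state machine. This is a minimal illustration under stated assumptions: PrefixMachine and Outcome are hypothetical names, the Palette and Dialog states are omitted, and multi-byte keys such as arrows are not decoded. Ctrl+B is byte 0x02.

```rust
/// Ctrl+B as sent by a terminal.
pub const CTRL_B: u8 = 0x02;

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Pass this byte through to the focused pane's PTY.
    Forward(u8),
    /// The byte after the prefix: look it up in the binding table.
    Command(u8),
    /// The prefix itself was swallowed; nothing to forward yet.
    Consumed,
}

#[derive(Clone, Copy)]
enum State {
    Idle,
    PrefixAwaiting,
}

pub struct PrefixMachine {
    prefix: u8,
    state: State,
}

impl PrefixMachine {
    pub fn new(prefix: u8) -> Self {
        PrefixMachine { prefix, state: State::Idle }
    }

    /// Feed one input byte and decide what the multiplexer does with it.
    pub fn feed(&mut self, byte: u8) -> Outcome {
        match self.state {
            State::Idle if byte == self.prefix => {
                self.state = State::PrefixAwaiting;
                Outcome::Consumed
            }
            State::Idle => Outcome::Forward(byte),
            State::PrefixAwaiting => {
                // Exactly one key is interpreted after the prefix.
                self.state = State::Idle;
                if byte == self.prefix {
                    Outcome::Forward(byte) // prefix twice: literal prefix byte
                } else {
                    Outcome::Command(byte)
                }
            }
        }
    }
}
```

With the default prefix, byte 0x0A (line feed) is forwarded like any other byte in the Idle state, which is the property the old hardcoded intercept lacked.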

The protocol is split by purpose:

  • Control channel — connection-per-request, newline-delimited JSON, for status, session.create, session.kill, session.title, events. This is what the host CLI and the future daemon use. Existing Phase 2 callers keep working.
  • Attach channel — persistent connection per attached client, framed binary. Each frame is [1-byte tag][4-byte big-endian length][payload]. Tags cover hello { rows, cols, spawn, env }, resize { rows, cols }, input_bytes, output_bytes, command { json }, welcome { session_count }, session_list { json }, and shutdown. Payload of input_bytes and output_bytes is raw PTY bytes — no base64, no JSON nesting on the hot path. Payload of command is JSON only because the command vocabulary is open-ended; it is used at human keystroke rate, not 60 Hz.

The hand-rolled base64 in crates/jackin-capsule/src/protocol.rs (split during the Phase 3 rewrite into protocol/attach.rs and protocol/control.rs) is removed. If any future feature ever needs base64 (e.g. embedding a binary blob inside a JSON event), the base64 crate is added — but no current code path needs it.
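The attach-channel frame layout ([1-byte tag][4-byte big-endian length][payload]) can be sketched as an encoder plus an incremental decoder that tolerates partial reads. The numeric tag values below are placeholders invented for the example; the text above names the tags but does not assign numbers. Function names are likewise hypothetical.

```rust
/// Tag byte plus 4-byte length.
pub const HEADER_LEN: usize = 5;

// Placeholder tag values for illustration only.
pub const TAG_INPUT_BYTES: u8 = 0x03;
pub const TAG_OUTPUT_BYTES: u8 = 0x04;

#[derive(Debug, PartialEq, Eq)]
pub struct Frame {
    pub tag: u8,
    pub payload: Vec<u8>,
}

/// Encode one frame. The payload is copied as-is: no base64, no JSON.
pub fn encode(tag: u8, payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize, "payload exceeds the 4-byte length field");
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.push(tag);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Try to pop one complete frame off the front of `buf`.
/// Returns `None` (leaving `buf` untouched) until enough bytes have arrived.
pub fn try_decode(buf: &mut Vec<u8>) -> Option<Frame> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    if buf.len() < HEADER_LEN + len {
        return None;
    }
    let tag = buf[0];
    let payload = buf[HEADER_LEN..HEADER_LEN + len].to_vec();
    buf.drain(..HEADER_LEN + len);
    Some(Frame { tag, payload })
}
```

A reader loop would append each socket read to the buffer and call try_decode until it returns None. A production decoder would also cap the accepted length so that a corrupt header cannot make the buffer grow without bound.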

  • Resize. The client listens for SIGWINCH (via tokio::signal::unix or signal-hook-tokio), reads the new size with crossterm::terminal::size, and sends a resize frame. The server resizes every PTY in the affected layout via MasterPty::resize, calls Screen::set_size on each, and performs a named full-frame replay. The first attempt’s Hello { rows, cols } is preserved as the initial handshake; the missing piece is the runtime propagation.
  • Detach. prefix + d makes the client send a detach frame and exit cleanly, restoring its own terminal. The server drops the client slot but keeps every PTY running. The host can docker exec jackin-capsule to spawn a fresh client which reattaches over the same socket.
  • Reattach. A new client sending hello while one is already attached takes over; the server sends a shutdown frame to the old client (which restores its terminal and exits), then sends a named full replay of the active frame to the new client. Concurrent attach (multiple clients viewing the same session) is out of scope for the first rewrite; the takeover model is sufficient for jackin’ single-operator use case and avoids the multi-client cursor-and-input arbitration zellij has to solve.

The top rows of the host terminal are owned by the multiplexer. Layout left to right:

jackin' [Claude ●] [Codex] [Shell]

The jackin' brand pill uses the bright-green background defined in crates/jackin-capsule/src/statusbar.rs (kept from the first attempt). Each tab is rendered as Label with the rolled-up state glyph appended before the click-region width is computed (the existing bug appends it after, so click regions drift whenever a glyph is shown). The active tab uses bold white text on a lifted graphite background plus the white underline so it does not visually merge with the brand pill; inactive tabs use dimmed grey.

The palette and prefix shortcuts remain active, but the top chrome does not render a menu or keybinding hint; those columns stay available for tabs.

When the bar is wider than the host terminal, tabs past the right edge are clipped and an overflow indicator is shown. Tab navigation via prefix bindings still works on clipped tabs.
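The click-region rule (measure the tab text after the state glyph has been appended) and the clipping rule can be sketched together. Names here are hypothetical, columns are 0-based with an exclusive end, and width is approximated by counting chars, which assumes every character in the brand, labels, and glyphs occupies one terminal column. No column is reserved for the overflow indicator in this sketch.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct TabRegion {
    pub index: usize,
    /// First column of the tab (0-based).
    pub start: usize,
    /// One past the last column of the tab.
    pub end: usize,
}

pub struct TabStrip {
    pub regions: Vec<TabRegion>,
    /// True when at least one tab did not fit and was clipped.
    pub overflow: bool,
}

/// Approximate display width. Assumes one column per char.
fn cols(s: &str) -> usize {
    s.chars().count()
}

/// Lay out `brand [Label glyph] [Label] ...` and record each tab's click region.
pub fn layout_tabs(brand: &str, tabs: &[(&str, Option<char>)], term_cols: usize) -> TabStrip {
    let mut cursor = cols(brand) + 1; // brand pill plus one separating space
    let mut regions = Vec::new();
    let mut overflow = false;
    for (index, (label, glyph)) in tabs.iter().enumerate() {
        // Build the final text first so the glyph is part of the measured width.
        let text = match glyph {
            Some(g) => format!("[{} {}]", label, g),
            None => format!("[{}]", label),
        };
        let width = cols(&text);
        if cursor + width > term_cols {
            overflow = true;
            break;
        }
        regions.push(TabRegion { index, start: cursor, end: cursor + width });
        cursor += width + 1;
    }
    TabStrip { regions, overflow }
}
```

Because each region is derived from the exact string that gets painted, a tab that gains or loses a glyph shifts the regions of every tab to its right consistently, so clicks keep landing on the tab under the pointer.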

Phase 4 work documented under Phase 3. The rendering contract is the status-chrome design (Phase 3 territory); the data source is the host daemon (Phase 4 territory). The subsection lives here so the status-bar layout decisions are colocated; nothing in this overlay ships in Phase 3.

Phase 4 should add a read-only usage overlay fed by the host daemon’s Token & Cost Telemetry cache. The Capsule does not poll providers and does not persist credentials. It asks the daemon only for the focused tab/pane’s provider account quota and workspace/session spend, then renders that focused-agent result inside the multiplexer. The compact signal belongs in the same low-noise status chrome as the branch/repository detail line, not as a second standalone widget.

Default shape:

  • Status line: compact always-visible quota glyph for the focused tab/pane’s provider only, placed beside the branch/repository details when that line is present, e.g. main · Codex 12%, feature/auth · Claude 80%, docs · Codex stale, or main · Amp login. The multiplexer must not show a global account average or another agent’s quota while the operator is focused on one agent.
  • Prefix u / palette Usage: modal dialog showing the focused agent’s provider, account label, quota window, remaining percentage, reset countdown, source confidence, cache freshness, current session tokens/cost, workspace tokens/cost, and burn-rate projection when available. The dialog may mention other accounts only through explicit navigation out of the focused-agent view; the default view is focused-provider only.
  • Dialog rows must label provider-authoritative quota separately from locally estimated workspace spend. A local JSONL estimate must never be displayed as the enforced subscription limit.
  • If the daemon is disconnected, the overlay shows usage unavailable: daemon disconnected rather than falling back to in-container polling.
  • If the account needs authentication, the overlay shows needs login / needs secret with the repair action name, while the actual repair flow stays in jackin console, Desktop, CLI, or host bridge.

This keeps the multiplexer useful while the operator is focused inside an agent: before starting an expensive run in a Codex tab, they see Codex headroom only; before starting work in a Claude tab, they see Claude headroom only. The operator opens the dialog only when they need reset timing, confidence, or workspace spend detail.

When the daemon starts:

  1. If the host passes an initial agent argv (the normal case driven by jackin load), the daemon spawns one tab running that agent. This matches today’s operator expectation that jackin load the-architect . immediately drops them into Claude Code.
  2. If no initial agent argv is provided and /jackin/run/agent.toml lists at least one agent, the daemon can come up empty — row 0 brand bar, no tabs, no panes — and render a centred picker hint listing each configured agent plus Shell. The first prefix-action or click chooses the first tab.
  3. If the launch config lists no agents, the daemon spawns a single Shell tab. This is the safe fallback for derived images that lack any agent CLI.

This pattern lets jackin load <role> . keep its existing direct-into-agent UX while leaving room for jackin console to launch the multiplexer with no preselected agent.
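The three startup cases can be written as a small decision function. This is a sketch only: Startup and decide_startup are hypothetical names, and the configured agent list is passed in as plain strings rather than parsed from /jackin/run/agent.toml.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Startup {
    /// Case 1: spawn one tab running the agent the host asked for.
    SpawnAgent(Vec<String>),
    /// Case 2: come up with no tabs and offer these picker entries.
    EmptyPicker(Vec<String>),
    /// Case 3: no agents configured, fall back to a single Shell tab.
    ShellTab,
}

/// Decide what the daemon does at launch.
/// `initial_argv` is the agent argv passed by the host, if any.
/// `configured_agents` are the agent names from the launch config.
pub fn decide_startup(initial_argv: Option<Vec<String>>, configured_agents: &[String]) -> Startup {
    match initial_argv {
        // An empty argv is treated like "no argv" (an assumption of this sketch).
        Some(argv) if !argv.is_empty() => Startup::SpawnAgent(argv),
        _ if !configured_agents.is_empty() => {
            // The picker lists every configured agent plus Shell.
            let mut entries = configured_agents.to_vec();
            entries.push("Shell".to_string());
            Startup::EmptyPicker(entries)
        }
        _ => Startup::ShellTab,
    }
}
```

Checking the initial argv first mirrors the ordering of the list above: a host that preselects an agent gets the direct-into-agent behaviour regardless of what the launch config lists.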

The rewrite produces this layout under crates/jackin-capsule/src/:

| Module | Role |
| --- | --- |
| main.rs | Mode selection (PID 1 → daemon, otherwise client subcommand) |
| daemon.rs | Server event loop: tokio select! over PTY events, attach-channel frames, control-channel requests, timer ticks |
| client.rs | Attach-client loop: raw mode, SIGWINCH, prefix-key state machine, attach-channel write |
| pid1.rs | SIGCHLD reaper thread (kept verbatim from the first attempt — correct) |
| socket.rs | Unix socket bind, control-channel accept, attach-channel accept |
| protocol/control.rs | Newline-delimited JSON request/response types for status, session.create, etc. |
| protocol/attach.rs | Tag-plus-length binary framing for the attach channel |
| session.rs | One PTY + one vt100::Screen + state-inference timer; Session::spawn, Session::feed_pty, mode/OSC passthrough state |
| layout.rs | Pane tree (kept, border math fixed) |
| tab.rs | Tab struct + tab strip ordering |
| statusbar.rs | Top-chrome renderer (brand pill, tabs, active-tab underline) |
| input.rs | Prefix-key state machine + mouse SGR re-encoder |
| dialog.rs | Command palette + agent picker (kept; revisit ergonomics in Phase 3b) |

Files deleted: terminal.rs (replaced by vt100 crate); base64 helpers in protocol.rs (replaced by binary attach channel).

The rewrite lands as one cohesive PR (this one) because the defects are interlocked — fixing one without the others leaves the binary unusable for testing — but the work breaks into reviewable sub-phases. Each sub-phase passes cargo nextest run and cargo clippy --all-targets --all-features -- -D warnings on its own.

  • Phase 3a — Replace VT. Drop crates/jackin-capsule/src/terminal.rs. Add vt100 to crates/jackin-capsule/Cargo.toml. Wire each Session to its own vt100::Parser + Screen. Replace the old per-byte terminal emulator with a compositor-aware row serializer over vt100 cells. Tab switch and reattach replay from the saved screen instead of force_redraw’s PTY-resize hack.
  • Phase 3b — Prefix-key input. Replace the 0x0A palette intercept and the standalone Alt+arrow shortcut with the prefix-key state machine described above. Default prefix Ctrl+B. The command palette still exists, opened by prefix + Space. Update crates/jackin-capsule/src/input.rs and the dialog dispatch in crates/jackin-capsule/src/daemon.rs.
  • Phase 3c — Persistent server + binary attach channel. Remove the process::exit(0) on all-sessions-dead. On agent exit, mark the tab exited, keep the Screen for replay, allow prefix + r to respawn. Replace the JSON + base64 attach-channel output frames with the tag-plus-length binary frames defined above. Keep the existing control channel for status. Remove the hand-rolled base64 helper. Fix the reattach outbound-channel bug.
  • Phase 3d — Resize, mouse, status bar polish. Add client-side SIGWINCH listener and resize frame. Fix the mouse passthrough no-op (re-encode in SGR mouse form and write to the focused PTY). Fix the status bar tab-label width drift. Keep command-palette shortcuts active without rendering a top-bar hint.

Phase 3a is the load-bearing one — it unblocks every TUI agent. Phase 3b removes the worst input bug (Enter being swallowed). Phase 3c makes the daemon robust enough to live inside a long-running container. Phase 3d is polish that the rewrite cannot ship without but that doesn’t change the architecture.
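As a rough illustration of what tag-plus-length framing on the attach channel can look like, here is a minimal sketch. The tag values, the 4-byte big-endian length field, and the names `Tag`, `write_frame`, and `read_frame` are assumptions made for this example; the actual layout is whatever the wire-protocol section of this design defines.

```rust
use std::io::{self, Read, Write};

/// Hypothetical frame tags; the real values live in the wire-protocol section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tag {
    Output = 1,
    Input = 2,
    Resize = 3,
}

impl Tag {
    fn from_byte(b: u8) -> Option<Tag> {
        match b {
            1 => Some(Tag::Output),
            2 => Some(Tag::Input),
            3 => Some(Tag::Resize),
            _ => None,
        }
    }
}

/// Write one frame: 1 tag byte, a 4-byte big-endian payload length, then the raw payload.
pub fn write_frame<W: Write>(w: &mut W, tag: Tag, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;
    w.write_all(&[tag as u8])?;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting unknown tags and payloads longer than `max_len`.
pub fn read_frame<R: Read>(r: &mut R, max_len: u32) -> io::Result<(Tag, Vec<u8>)> {
    let mut header = [0u8; 5];
    r.read_exact(&mut header)?;
    let tag = Tag::from_byte(header[0])
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "unknown tag"))?;
    let len = u32::from_be_bytes([header[1], header[2], header[3], header[4]]);
    if len > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok((tag, payload))
}
```

Compared with JSON plus base64, a frame like this carries PTY bytes verbatim, so the hot path needs no escaping and no re-encoding.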

  • VT contract tests. Round-trip alt-screen entry and exit through vt100::Screen and assert that render replay after \x1b[?1049h … \x1b[?1049l reproduces the pre-alt-screen content. Smoke an entire Claude Code-style frame (alt-screen + bracketed paste + application cursor keys) and assert no information loss.
  • Prefix state machine tests. Lone prefix byte → consumed, no PTY output. prefix + prefix → exactly one prefix byte forwarded. prefix + c → NewSession command emitted, no bytes forwarded. Plain LF and plain Ctrl+L in the input stream → forwarded to PTY (regression guard for the Ctrl+J = 0x0A bug and clear-pane passthrough). prefix + Ctrl+L → ClearPane.
  • Persistence tests. Spawn a session, send it exit\n, observe Exited event, assert daemon is still running and the tab is marked exited. Send SIGTERM to the daemon, assert all PTYs are reaped and the daemon exits 0.
  • Reattach tests. Connect a client, attach to a session, write some bytes that update the Screen, disconnect, reconnect. Assert the new client receives a full replay matching the screen state.
  • Resize tests. Send a resize frame, assert every PTY is resized via MasterPty::resize, every Screen via set_size, and the active pane is replayed at the new dimensions.
  • Mouse passthrough tests. Send an SGR mouse press in row 5 — assert it is re-encoded relative to the focused pane’s origin and arrives at the pane’s PTY. Send a press in row 0 — assert it is consumed by the tab-strip hit-test.
  • Control-channel regression tests. The existing status command in crates/jackin-capsule/src/client.rs keeps working unchanged. The host-side smoke tests for the launch path continue to pass without modification.
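The prefix state machine cases in the list above can be sketched as a small byte-at-a-time machine. This is an illustrative model only: the type names (`PrefixMachine`, `Action`, `Command`) are hypothetical, unbound keys after the prefix are simply dropped as an assumption, and the sketch does not model the rule that the prefix is recognised only at escape-sequence boundaries.

```rust
/// What the input layer does with one incoming byte.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Write this byte to the focused PTY.
    Forward(u8),
    /// Run a multiplexer command; nothing reaches the PTY.
    Command(Command),
    /// Swallow the byte (prefix armed, or an unbound key after the prefix).
    Consumed,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Command {
    NewSession,
    ClearPane,
}

pub struct PrefixMachine {
    prefix: u8,
    armed: bool,
}

impl PrefixMachine {
    pub fn new(prefix: u8) -> Self {
        Self { prefix, armed: false }
    }

    pub fn feed(&mut self, byte: u8) -> Action {
        if !self.armed {
            if byte == self.prefix {
                self.armed = true;
                return Action::Consumed;
            }
            // Plain LF (0x0A) and plain Ctrl+L (0x0C) take this path untouched.
            return Action::Forward(byte);
        }
        self.armed = false;
        match byte {
            // prefix + prefix sends exactly one literal prefix byte.
            b if b == self.prefix => Action::Forward(b),
            b'c' => Action::Command(Command::NewSession),
            0x0C => Action::Command(Command::ClearPane),
            // Assumption: unbound keys after the prefix are dropped.
            _ => Action::Consumed,
        }
    }
}
```

With Ctrl+B (0x02) as the prefix, feeding 0x02 then `c` yields `Consumed` followed by `Command(NewSession)`, while a lone 0x0A is forwarded, which is the regression guard for the swallowed-Enter bug.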

Terminal compatibility: tmux setting parity


The first tmux integration carried five non-default option choices that the operator had pinned over time to keep agent TUIs working. The new multiplexer must reproduce the same observable behaviour from day one — without those equivalents, the rewrite would regress on real workflows the operator already depends on. Each tmux option below is paired with the explicit jackin-capsule behaviour that replaces it.

Extended keys (always-on, advertised to the agent).

tmux set-option -s extended-keys always
tmux set-option -as terminal-features 'xterm*:extkeys'

extended-keys always is required (not on) because agent TUIs do not emit the per-app activation escape that on waits for; without always, Shift+Enter and other extended-key combinations silently fail. terminal-features 'xterm*:extkeys' advertises extkeys support to anything reading terminfo via TERM=xterm*.

jackin’ equivalent. The attach channel transports raw bytes both directions, so the multiplexer never needs to negotiate extended keys at all — it forwards exactly what the outer terminal sends, including the kitty keyboard protocol (\x1b[>1u / \x1b[<u), the xterm modifyOtherKeys sequences (\x1b[>4;2m), and the CSI-u-encoded modifier-plus-key sequences. Pane PTYs use the stable TERM=xterm-256color baseline; the active attach client’s TERM, TERM_PROGRAM, and COLORTERM travel in the Hello handshake and gate outer-terminal enhancements separately. When a TUI queries kitty_keyboard or modifyOtherKeys via DECRQM, the response comes from the agent’s own behaviour, not from the multiplexer — the multiplexer must never strip or rewrite those sequences. The corresponding test asserts that a Shift+Enter keypress (CSI-u sequence \x1b[13;2u) round-trips unchanged from client stdin to the PTY.

Focus events.

tmux set-option -g focus-events on

Agents like Claude Code and Codex rely on focus-in / focus-out events (\x1b[I / \x1b[O) to pause animations, drop polling, and avoid stealing the operator’s attention when the operator switches to another window.

jackin’ equivalent. The client subscribes to focus-event reporting from the outer terminal on attach (\x1b[?1004h) and disables it on detach (\x1b[?1004l). Incoming focus events are routed to the focused pane in the active tab — and only to that pane; an unfocused tab’s agent should not believe it is focused. The server keeps a per-pane “outer-terminal-focused” flag and synthesises \x1b[I / \x1b[O when the operator switches tabs or splits panes, so each agent’s focus state matches what a human would expect when they look at that pane.
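One way to picture the focus synthesis described above is a tracker that returns the bytes to write to each pane whenever focus moves. This is a sketch under assumptions: `FocusTracker` and `PaneId` are hypothetical names, the outer terminal is assumed to start focused, and the check for whether a pane enabled focus reporting is omitted for brevity.

```rust
pub const FOCUS_IN: &[u8] = b"\x1b[I";
pub const FOCUS_OUT: &[u8] = b"\x1b[O";

pub type PaneId = usize;

/// Tracks which pane should believe it has focus.
pub struct FocusTracker {
    outer_focused: bool,
    focused_pane: Option<PaneId>,
}

impl FocusTracker {
    /// Assumption: the outer terminal is focused when the client attaches.
    pub fn new() -> Self {
        Self { outer_focused: true, focused_pane: None }
    }

    /// The operator switched tabs or panes. Returns (pane, bytes) pairs to write.
    pub fn focus_pane(&mut self, pane: PaneId) -> Vec<(PaneId, &'static [u8])> {
        let mut writes = Vec::new();
        if self.focused_pane == Some(pane) {
            return writes;
        }
        // Only synthesise events while the outer terminal itself has focus.
        if self.outer_focused {
            if let Some(old) = self.focused_pane {
                writes.push((old, FOCUS_OUT));
            }
            writes.push((pane, FOCUS_IN));
        }
        self.focused_pane = Some(pane);
        writes
    }

    /// The outer terminal reported focus-in or focus-out; only the focused pane hears it.
    pub fn outer_focus_changed(&mut self, focused: bool) -> Vec<(PaneId, &'static [u8])> {
        if self.outer_focused == focused {
            return Vec::new();
        }
        self.outer_focused = focused;
        match self.focused_pane {
            Some(p) => vec![(p, if focused { FOCUS_IN } else { FOCUS_OUT })],
            None => Vec::new(),
        }
    }
}
```

Returning the writes instead of performing them keeps the rule testable without a PTY: the caller decides how to deliver each byte string.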

OSC passthrough (desktop notifications, progress bars, clipboard, titles).

tmux set-option -g allow-passthrough on

Without OSC passthrough, agent desktop notifications (\x1b]9;…\x07), progress reports (OSC 9;4 — ConEmu progress; OSC 99 — kitty progress), clipboard writes (OSC 52), and window titles (OSC 0/1/2) are silently dropped at the multiplexer boundary.

jackin’ equivalent. OSC sequences emitted by an agent’s PTY are forwarded to the attached client only when the originating pane is the focused pane in the active tab — same rule herdr uses, and the right rule because two backgrounded agents both writing the window title would otherwise fight. The multiplexer parses each OSC payload enough to (a) capture OSC 0/1/2 into the per-pane title field (used by the tab strip and the session.title API), (b) forward OSC 52 unconditionally for the focused pane (clipboard intent is explicit and the user issued it), and (c) forward OSC 9/9;4/99 (notifications, progress) only for the focused pane to avoid notification spam from backgrounded agents. Non-recognised OSC sequences are passed through unchanged. The vt100 crate exposes an osc_dispatch hook that the multiplexer wires into this routing layer.
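The routing rule can be reduced to a small decision function. The sketch below is one reading of the paragraph above, not the shipped code: it assumes the payload is the OSC body between `ESC ]` and the terminator, that titles (OSC 0/1/2) are captured for every pane, and that forwarding, including for unrecognised OSC numbers, happens only for the focused pane. `OscDecision`, `osc_number`, and `route_osc` are hypothetical names.

```rust
/// What to do with one OSC sequence emitted by a pane.
#[derive(Debug, PartialEq, Eq)]
pub struct OscDecision {
    /// Re-emit the sequence unchanged to the attached client.
    pub forward: bool,
    /// Store the payload in the per-pane title field.
    pub capture_title: bool,
}

/// Extract the numeric OSC identifier, i.e. the digits before the first `;`.
pub fn osc_number(payload: &[u8]) -> Option<u32> {
    let end = payload
        .iter()
        .position(|&b| b == b';')
        .unwrap_or(payload.len());
    std::str::from_utf8(&payload[..end]).ok()?.parse().ok()
}

pub fn route_osc(payload: &[u8], pane_is_focused: bool) -> OscDecision {
    // Titles are remembered for background panes too, so the tab strip stays current.
    let capture_title = matches!(osc_number(payload), Some(0 | 1 | 2));
    OscDecision {
        // Assumption: every OSC, recognised or not, is forwarded only for the focused pane.
        forward: pane_is_focused,
        capture_title,
    }
}
```

Under this reading, a background agent that sets its title updates the tab label but never touches the outer window title, and its notifications stay silent until the pane gains focus.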

Zero escape disambiguation delay.

tmux set-option -sg escape-time 0

The default delay tmux waits after ESC to disambiguate ESC from the start of an escape sequence causes misfires in vi-mode navigation inside agent TUIs (especially in the agent’s command palette).

jackin’ equivalent. The input parser keeps a short, configurable JACKIN_ESCAPE_TIME deadline for a bare ESC while preserving complete escape sequences across chunk boundaries. This matches tmux’s disambiguation model while avoiding the original failure mode where ESC [ split across socket reads became a literal Esc plus stray text.
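To make the deadline idea concrete, here is a simplified sketch of holding a bare ESC until either the next byte arrives or the deadline passes. It is not the Capsule's parser: `EscapeTimer` and `Key` are hypothetical names, the clock is passed in explicitly so the logic can be tested, and only the first byte after ESC is considered rather than a full escape sequence.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum Key {
    /// A bare Escape keypress.
    Escape,
    /// ESC followed promptly by another byte, i.e. the start of an escape sequence.
    EscapePrefixed(u8),
    /// Any other byte.
    Byte(u8),
}

pub struct EscapeTimer {
    timeout: Duration,
    pending_since: Option<Instant>,
}

impl EscapeTimer {
    pub fn new(timeout: Duration) -> Self {
        Self { timeout, pending_since: None }
    }

    /// Feed one input byte along with the time it was read.
    pub fn feed(&mut self, byte: u8, now: Instant) -> Vec<Key> {
        let mut out = Vec::new();
        if let Some(since) = self.pending_since.take() {
            if now.duration_since(since) < self.timeout {
                // The byte arrived in time, even if it came from a later socket read.
                out.push(Key::EscapePrefixed(byte));
                return out;
            }
            // Deadline already passed: the earlier ESC was a keypress of its own.
            out.push(Key::Escape);
        }
        if byte == 0x1b {
            self.pending_since = Some(now);
        } else {
            out.push(Key::Byte(byte));
        }
        out
    }

    /// Called on a timer; emits the bare Escape once the deadline has passed.
    pub fn tick(&mut self, now: Instant) -> Option<Key> {
        match self.pending_since {
            Some(since) if now.duration_since(since) >= self.timeout => {
                self.pending_since = None;
                Some(Key::Escape)
            }
            _ => None,
        }
    }
}
```

Because the pending ESC survives between calls to `feed`, an `ESC [` pair split across two socket reads is still recognised as a sequence start instead of a literal Esc plus stray text.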

Mouse support, including motion.

tmux set-option -g mouse on

Mouse clicks, drags, and scroll events must reach agent TUIs that opt into mouse reporting via \x1b[?1000h through \x1b[?1006h.

jackin’ equivalent. The client sets the outer terminal to “any-event tracking + SGR mouse” (\x1b[?1003h\x1b[?1006h) on attach and disables alternate-scroll translation (\x1b[?1007l) so wheel gestures arrive as mouse input instead of prompt cursor keys. The multiplexer parses each mouse event: top-chrome events hit-test against the tab strip / menu button, bottom context-row events open GitHub or container details, pane-border events drive resize, and pane-content events are re-encoded relative to the focused pane’s rect origin and written to that pane’s PTY only when the pane opted into mouse reporting. Wheel events in mouse-disabled normal-screen panes drive jackin’ scrollback when vt100 or inline-scroll capture has retained history, including rows preserved before normal-screen clear/redraw and top-anchored scroll-region movement; that same retained history is the only source for pane scroll chrome, for both agent-spawned and shell-spawned panes. Empty-history wheel input stays local to jackin’ rather than becoming cursor-key input. Wheel events in mouse-disabled alternate-screen panes use cursor-key fallback so full-screen programs keep owning their interaction surface. Mouse mode and encoding are tracked with vt100::Screen::mouse_protocol_mode() and mouse_protocol_encoding(), so SGR, default xterm/X10, and UTF-8 mouse modes are distinct on the PTY side.
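The pane-content branch of that pipeline, re-encoding an SGR mouse event relative to the pane's origin, might look roughly like the sketch below. `Rect`, `parse_sgr_mouse`, and `reencode_for_pane` are illustrative names; the sketch assumes a 0-based rect origin in outer-terminal cells, handles only the SGR encoding, and leaves out the check that the pane opted into mouse reporting.

```rust
/// Pane rectangle in outer-terminal cells, with a 0-based origin.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rect {
    pub x: u16,
    pub y: u16,
    pub width: u16,
    pub height: u16,
}

/// Parse `ESC [ < button ; col ; row (M|m)`. Coordinates are 1-based on the wire.
pub fn parse_sgr_mouse(seq: &[u8]) -> Option<(u16, u16, u16, u8)> {
    let body = seq.strip_prefix(b"\x1b[<")?;
    let (&last, params) = body.split_last()?;
    if last != b'M' && last != b'm' {
        return None;
    }
    let text = std::str::from_utf8(params).ok()?;
    let mut parts = text.split(';').map(|p| p.parse::<u16>().ok());
    let button = parts.next()??;
    let col = parts.next()??;
    let row = parts.next()??;
    if parts.next().is_some() {
        return None;
    }
    Some((button, col, row, last))
}

/// Translate an outer-terminal event into pane-local coordinates.
/// Returns `None` when the event is malformed or falls outside the pane.
pub fn reencode_for_pane(seq: &[u8], pane: Rect) -> Option<Vec<u8>> {
    let (button, col, row, last) = parse_sgr_mouse(seq)?;
    let (cx, cy) = (col.checked_sub(1)?, row.checked_sub(1)?);
    let inside_x = cx >= pane.x && cx < pane.x.saturating_add(pane.width);
    let inside_y = cy >= pane.y && cy < pane.y.saturating_add(pane.height);
    if !inside_x || !inside_y {
        return None;
    }
    let local_col = cx - pane.x + 1;
    let local_row = cy - pane.y + 1;
    Some(format!("\x1b[<{button};{local_col};{local_row}{}", last as char).into_bytes())
}
```

For a pane that starts one row below the top chrome, a press reported at outer row 5 reaches the PTY as row 4, while a press on the chrome row returns `None` and is left to the tab-strip hit-test.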

The setting parity above is a regression contract: each tmux behaviour the operator pinned has a matching jackin-capsule behaviour. The tests under “Tests required” assert each contract separately so future refactors cannot drop one silently.

The operator runs jackin inside Ghostty, the outer terminal. Ghostty’s feature set is wider than xterm’s, so the multiplexer must transport Ghostty’s richer keyboard, color, and graphics protocols intact rather than fall back to a lowest-common-denominator subset.

Kitty keyboard protocol is the load-bearing one. Ghostty implements the kitty keyboard protocol (CSI-u sequences with disambiguated modifiers, full key release reporting, all-event mode). Modern agent TUIs query for it on startup with \x1b[?u. The multiplexer must:

  • Forward the agent’s \x1b[>{flags}u and \x1b[<{n}u push/pop sequences to the outer terminal unchanged, so Ghostty switches the keyboard protocol level to what the agent asked for.
  • Forward Ghostty’s responses (encoded keys, including modifier-plus-key combinations that legacy terminals cannot encode at all — Ctrl+Shift+Tab, Alt+/, etc.) to the agent’s PTY unchanged.
  • Not intercept any byte inside a CSI-u sequence. The prefix-key state machine recognises the configured prefix byte at sequence boundaries only.

True-color and color queries. Ghostty supports 24-bit color and replies to \x1b]10 / \x1b]11 (foreground/background queries) and \x1b]4;<n>;?\x07 (palette queries). Agents query these to theme themselves to the outer terminal’s palette. The multiplexer forwards both directions verbatim and uses vt100::Screen’s 24-bit color cell model so attribute fidelity is preserved through tab-switch replay.

Kitty graphics protocol. Ghostty supports the kitty graphics protocol (APC escape sequences for inline images). Agents that render diagrams or screenshots inline emit \x1b_G…\x1b\\. The multiplexer treats APC sequences exactly like OSC: forward unchanged for the focused pane only; ignore for backgrounded panes (an image emitted by a background pane should not appear in another pane’s region). The vt100 parser delivers APC payloads via the hook / put / unhook callbacks; the multiplexer captures the bytes and re-emits them to the client when that pane is the focused pane being rendered. APC sequences are explicitly not “replayed” on tab switch — only what the agent currently has on screen matters, and Ghostty handles redraw of already-emitted images itself.

Bracketed paste. Ghostty wraps pasted text in \x1b[200~ / \x1b[201~ when the agent has enabled bracketed paste (\x1b[?2004h). The multiplexer must forward these wrappers untouched to the pane’s PTY. This is one of the DEC private modes the first hand-rolled VT silently dropped; vt100::Screen::bracketed_paste_mode() tracks the agent’s intent and the multiplexer respects it on the input direction.

Synchronised output (BSU/ESU). Ghostty supports the “synchronised output” mode (\x1b[?2026h/l, also called BSU/ESU). Agents that draw in many small writes wrap each frame in synchronised-output markers so the outer terminal renders the frame atomically and the operator does not see partial paints. The multiplexer must forward these markers, in both directions, whenever they belong to the focused pane. For non-focused panes the markers are still consumed by vt100’s parser (so the Screen updates atomically internally) but no client output is emitted. That is the desired behaviour anyway, because backgrounded panes have no client surface to paint.

OSC 52 clipboard. Ghostty supports OSC 52 clipboard writes. The multiplexer forwards them per the OSC passthrough rules above (focused pane only). This gives agents that copy code blocks or commands a real clipboard hand-off — without it, the “copy to clipboard” button on Claude Code’s output silently does nothing.

OSC 8 hyperlinks. Ghostty renders \x1b]8;;<url>\x1b\\ clickable. Forwarded unchanged for the focused pane.

Theme queries (OSC 10/11/4) and dark/light mode. Both are forwarded unchanged. The tmux-specific DCS $q… path is irrelevant here because Ghostty exposes the mode via OSC. The multiplexer does not interpret the outer terminal’s theme; agents that want to adapt query it directly.

Sixel graphics. Not currently used by the in-scope agent TUIs but Ghostty supports it. Treated identically to kitty graphics: forwarded for the focused pane, ignored for backgrounded panes, not replayed on switch.

The unifying rule across every Ghostty feature is: the multiplexer’s job is to be a transparent pipe for whatever escape sequences the agent and the outer terminal negotiate, while owning the top chrome and the prefix-key boundary. Anything else it tries to interpret will eventually break a feature Ghostty supports that jackin’ does not yet know about. The vt100 crate’s job is to keep enough state to replay panes on switch and detect activity for the agent-state heuristic — it is not an oracle for what to pass through, and the multiplexer should not gate features on vt100 understanding them.

A regression test runs an end-to-end smoke against Ghostty’s reported features: emit \x1b[?u (kitty keyboard query) from a synthetic PTY, capture the client output, assert the byte sequence reaches the outer-terminal-facing stream unchanged. Repeat for OSC 52, OSC 9, OSC 8, BSU/ESU, and the kitty graphics protocol header. The test does not require an actual Ghostty instance — it only asserts the bytes survive the multiplexer round-trip.

  • The host-side docker run mount of /jackin/run/<instance-id>.sock:/jackin/run/jackin.sock. The socket location, ownership, and lifetime are unchanged.
  • The derived image build pipeline that downloads or builds jackin-capsule and installs it at /jackin/runtime/jackin-capsule. src/bin/build_jackin_capsule.rs stays as-is.
  • The Phase 2 status API on the control channel. Existing host callers do not need to change.
  • The /jackin/run/agent.toml launch-config contract between the host launcher and the daemon, plus the initial-agent argv contract from the host launcher into PID 1.
  • The PID 1 zombie-reaping logic in crates/jackin-capsule/src/pid1.rs.
  • The four-state agent-state model (working / blocked / done / idle) and its roll-up rules.

Phase 2: add socket mount to docker run; update inspect_agent_sessions in src/runtime/attach.rs to connect to socket instead of docker exec tmux list-sessions.

Phase 3: remove tmux from the derived image (docker/construct/Dockerfile); replace all docker exec tmux ... call sites in src/runtime/attach.rs and src/runtime/launch.rs with socket API calls.

Herdr (ogulcancelik/herdr) is the closest public reference for the multiplexer server concept. It is a single Rust binary, built for exactly the same problem: managing multiple AI coding agents with per-project workspace grouping, four-state status tracking, a Unix socket API, and session persistence across client detach.

Herdr cannot be embedded in or linked from jackin’: it is AGPL-3.0, which conflicts with the Apache-2.0 license that jackin’ uses. Any reimplementation in jackin-capsule must be written from scratch. Ideas, concepts, and algorithms are not protected by copyright; the code that expresses them is.

The deeper architectural difference: Herdr wraps bare host processes. jackin’ wraps Docker containers. Herdr’s status heuristics read foreground process state and PTY output from the agent’s own terminal. When the agent runs inside a container and Herdr sees docker attach rather than the agent, the heuristics degrade. jackin-capsule runs inside the container — it sees the agent’s PTY output directly, making the same heuristic approach reliable.

Two-stage done state. Herdr distinguishes Done (work finished, not yet reviewed) from Idle (reviewed or empty). The jackin’ AgentStatus model should adopt the same split for the same reasons: the autonomous task queue must not refill a done slot until the operator acknowledges it, and the desktop app needs a distinct “ready for review” surface. Whether this lands as a ReadyForReview enum variant or as Idle + a separate acknowledgement flag is a design detail for Phase 3.

Workspace-level status roll-up. Herdr’s workspace sidebar shows the most urgent child status at the workspace level — one blocked agent makes the whole workspace show Blocked. The attention-priority order jackin’ should use is blocked > done > working > idle > unknown: an unseen finished pane is ready for the operator now and should outrank merely background working panes. The console Instances panel should adopt this roll-up rule once agent-state events from Phase 3 are available.
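The roll-up rule amounts to taking the maximum over an urgency ordering. A minimal sketch follows; the `AgentStatus` variants mirror the states named in the text, while the numeric urgency values and the choice to report `Unknown` for an empty workspace are assumptions of this example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentStatus {
    Blocked,
    Done,
    Working,
    Idle,
    Unknown,
}

impl AgentStatus {
    /// Attention priority: blocked > done > working > idle > unknown.
    fn urgency(self) -> u8 {
        match self {
            AgentStatus::Blocked => 4,
            AgentStatus::Done => 3,
            AgentStatus::Working => 2,
            AgentStatus::Idle => 1,
            AgentStatus::Unknown => 0,
        }
    }
}

/// Roll child statuses up to the single status shown at the workspace level.
/// Assumption: a workspace with no children reports `Unknown`.
pub fn roll_up(children: &[AgentStatus]) -> AgentStatus {
    children
        .iter()
        .copied()
        .max_by_key(|s| s.urgency())
        .unwrap_or(AgentStatus::Unknown)
}
```

With this ordering, a workspace holding one `Done` pane and several `Working` panes shows `Done`, which matches the argument that finished, unreviewed work should outrank background activity.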

Notification suppression when already looking. Herdr suppresses sound and toast notifications when the relevant pane is focused. The jackin’ attention prompts design should include the equivalent: if the console currently has that workspace row focused, downgrade or skip the OS notification. The precise condition needs design (console selection is coarser than Herdr’s pane-level focus).

Sound escalation as opt-in. Herdr ships silent toast by default, with opt-in sound escalation after N seconds. This pattern is validated by Herdr’s live usage and should stay in the attention prompts V1 design.

Blocking wait semantics on the socket. Herdr’s socket API lets callers block until a status transition (herdr wait agent-status 1-1 --status done). This is a better interface than polling for automation scripts and for the daemon’s event subscription. The events stream in Phase 3 should support the same pattern: a subscriber blocks on the stream and receives the event when the transition occurs.
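The blocking-wait idea can be modelled in-process with a mutex and a condition variable, as in the sketch below. This only illustrates the semantics (the caller sleeps until the wanted status is observed); it is not the socket protocol, and `StatusCell` and `AgentState` are hypothetical names. It is level-triggered, so a status that is set and immediately replaced can be missed, which a real event stream would avoid by delivering every transition.

```rust
use std::sync::{Arc, Condvar, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Working,
    Blocked,
    Done,
    Idle,
}

/// Shared status slot that waiters can block on instead of polling.
#[derive(Clone)]
pub struct StatusCell {
    inner: Arc<(Mutex<AgentState>, Condvar)>,
}

impl StatusCell {
    pub fn new(initial: AgentState) -> Self {
        Self { inner: Arc::new((Mutex::new(initial), Condvar::new())) }
    }

    /// Publish a new status and wake every waiter.
    pub fn set(&self, state: AgentState) {
        let (lock, cv) = &*self.inner;
        *lock.lock().unwrap() = state;
        cv.notify_all();
    }

    /// Block the calling thread until the status equals `wanted`.
    pub fn wait_for(&self, wanted: AgentState) -> AgentState {
        let (lock, cv) = &*self.inner;
        let guard = lock.lock().unwrap();
        // `wait_while` re-checks the predicate after each wakeup, so spurious wakeups are harmless.
        let guard = cv.wait_while(guard, |s| *s != wanted).unwrap();
        *guard
    }
}
```

An automation script would call `wait_for(AgentState::Done)` on one thread while the state-inference side calls `set` on another; no polling interval has to be tuned.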

Layered state authority. Herdr detects agent state by combining foreground process state, visible terminal-screen signals, and semantic integration reports. jackin-capsule will use the same concept — with the advantage that it runs inside the container and reads the agent’s real PTY and process tree directly rather than through a docker attach wrapper. Output activity remains a weak working signal, but silence alone must not mean blocked; the dedicated agent runtime status authority owns the full arbitration model.

  • Herdr as the session substrate. Herdr manages bare-host PTY processes. jackin-capsule needs a PTY multiplexer that runs inside a container, is statically linked, and has no host-side dependencies. Herdr’s internals are the wrong shape for this.
  • Herdr’s theme system. The jackin’ TUI uses a fixed palette. Themes are not a priority.
  • Herdr’s SSH remote tunneling. Herdr’s transparent PTY forwarding over SSH has a different threat model than the explicit operator-controlled tunneling planned for jackin-remote.
  • Herdr’s wire protocol verbatim. The socket API vocabulary should be designed for the jackin’ data model. Herdr is inspiration for the wait semantics, not a drop-in spec.

Phase 1 — Cleanup gap closed (shipped — bash supervisor + host-side teardown)


Historical pre-Capsule state: containers used tmux as the session layer. That temporary arrangement used docker exec tmux new-session to create sessions, docker exec tmux attach-session to reconnect, and docker exec tmux list-sessions to query what was running. The bash supervisor wrapped tmux from the outside and had no understanding of what happened inside sessions. It worked as a minimal path, but gave jackin’ no visibility into agent state, no structured event stream, and no way to manage sessions without shelling into the container.

What shipped in Phase 1: docker/runtime/supervisor.sh (since deleted) polled tmux list-sessions every second rather than watching the socket file disappear. Using tmux list-sessions was robust to stale socket files left behind when tmux crashed or was killed without cleanup — a plain [ -S socket ] file-existence check would loop forever on a dead socket. When tmux list-sessions returned non-zero (server gone or no sessions), the supervisor exited 0. A 60-second startup grace period waited for the socket file to appear before entering the monitor loop, preventing a premature exit before the first docker exec tmux new-session created it.

Two companion fixes in the host-side Rust code were required to make the cleanup path actually fire. src/isolation/finalize.rs (finalize_foreground_session) now calls has_tmux_sessions when the container is still Running after docker exec returns: an empty session list means supervisor lag, not a detach, so the code falls through to finalize_clean_exit and sweeps isolation worktrees normally instead of returning Preserved. src/runtime/launch.rs now tears down the DinD sidecar and Docker network in every branch where the container has exited — including crashes (non-zero exit) — rather than preserving them for a jackin hardline reconnect that cannot work without a live DinD. When docker exec returns and the container is still Running with an empty session list, teardown fires immediately without a redundant re-query (the session check in finalize_foreground_session already confirmed no sessions are present). Only a genuinely live detach (container still Running AND sessions confirmed present) keeps the DinD alive for reconnect.
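The teardown rule described above collapses into a two-input decision. The sketch below restates it as a pure function; `Finalize` and `decide` are hypothetical names for illustration and do not correspond to the signatures in src/isolation/finalize.rs or src/runtime/launch.rs.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Finalize {
    /// Sweep worktrees and remove the DinD sidecar and Docker network.
    TearDown,
    /// Keep everything alive so the operator can reconnect.
    Preserve,
}

/// Decide what to do once `docker exec` has returned.
pub fn decide(container_running: bool, live_sessions: usize) -> Finalize {
    match (container_running, live_sessions) {
        // Container exited, cleanly or by crash: nothing is left to reconnect to.
        (false, _) => Finalize::TearDown,
        // Still running but no sessions: supervisor lag, not a detach.
        (true, 0) => Finalize::TearDown,
        // Running with confirmed sessions: a genuine live detach.
        (true, _) => Finalize::Preserve,
    }
}
```

Only the last arm preserves the DinD sidecar, which is the single case where a later reconnect can actually succeed.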

Why a purpose-built multiplexer made sense: researching Herdr — a Rust terminal multiplexer purpose-built for AI coding agents — confirmed that the missing piece was a session layer that understands AI agents natively. Herdr tracks four agent states (blocked / working / done / idle) with zero configuration, exposes a Unix socket API for session control, and streams status-change events. It is designed for bare host processes (not containers) and is AGPL-3.0, so it cannot be embedded in jackin’. But the core concept is exactly right: a multiplexer that knows about agents, not just terminals. jackin-capsule is that in-container multiplexer/control plane, where it can see the agent’s PTY output directly and expose it through a Unix socket that the host and desktop app both drive.

Phase 2 — Rust binary skeleton + Unix socket status command (structured session inventory)

  • Create crates/jackin-capsule/ workspace member.
  • Implement PID 1 bootstrap: zombie reaping via SIGCHLD, SIGTERM/SIGINT → clean exit.
  • Implement tmux socket watch via inotify (replaces bash polling). Exit 0 when the socket is deleted.
  • Implement Unix socket listener and status command.
  • Add CI job: cross-compile for linux/amd64 and linux/arm64, publish to GitHub Releases.
  • Add socket mount to docker run in src/runtime/launch.rs.
  • Update inspect_agent_sessions in src/runtime/attach.rs to query socket instead of docker exec tmux list-sessions.
  • Update hardline --inspect and console session inventory to use socket path.
  • Remove docker/runtime/supervisor.sh (pre-release; no migration shim needed).

Phase 3 — In-container multiplexer (replace tmux — rewrite in flight)


The first Phase 3 attempt landed PID 1 ownership, the PTY layer, a binary pane tree, a status bar, a Unix socket attach channel, and the dockerfile changes that drop tmux from the derived image. That work is preserved where it is sound (the layout tree, the SIGCHLD reaper, the host-side mount changes, the build pipeline). The pieces that are unusable for modern TUI agents — the hand-rolled VT, the 0x0A palette shortcut, the daemon’s exit-on-last-session, the mouse no-op, the JSON-plus-base64 hot path — are rewritten against the architecture defined in “Multiplexer architecture (Phase 3)” above. The rewrite lands as one cohesive change in the active Phase 3 pull request, broken into four reviewable sub-phases:

  • Phase 3a — Replace the VT. Delete crates/jackin-capsule/src/terminal.rs (deleted in the Phase 3 rewrite — replaced by session.rs). Add vt100 to crates/jackin-capsule/Cargo.toml. Hold one vt100::Parser plus vt100::Screen per Session. Switch the tab/pane switch and reattach paths to full replay from the saved screen. Honour Screen::hide_cursor and Screen::cursor_position when positioning the host cursor.
  • Phase 3b — Prefix-key input. Replace the 0x0A palette intercept with the prefix-key state machine. Default prefix Ctrl+B. Default bindings as listed in “Input model: prefix key” above. Add the “Escape forwards immediately” rule (zero-delay disambiguation).
  • Phase 3c — Persistent server + binary attach channel. Remove the std::process::exit(0) on all-sessions-dead. On agent exit, mark the tab exited, keep the screen for replay, allow respawn via prefix bindings. Replace the JSON-plus-base64 attach-channel output with the tag-plus-length binary format defined in “Wire protocol”. Drop the hand-rolled base64 from crates/jackin-capsule/src/protocol.rs (split during the Phase 3 rewrite into protocol/attach.rs and protocol/control.rs). Fix the reattach outbound-channel drop.
  • Phase 3d — Resize, mouse, terminal compatibility, status bar polish. Add the client-side SIGWINCH listener and resize frames. Fix the mouse passthrough no-op. Wire focus events, OSC/APC passthrough rules, BSU/ESU, kitty-keyboard pass-through, and OSC 52 routing per “Terminal compatibility: tmux setting parity” and “Ghostty compatibility” above. Reassert a multiplexer-owned outer terminal title from the workspace and branch / pull request context. Poll local Git branch state separately from cached GitHub PR/check lookups so branch changes update the status bar quickly without aggressive gh calls. Fix the status bar tab-label width drift. Keep command-palette shortcuts active without rendering a top-bar hint.

Phase 3 also keeps the control-channel work the first attempt completed:

  • Implement the agent runtime status authority: raw working / blocked / idle / unknown, derived done, foreground-process detection, semantic runtime reports, visible-screen signals, stale-report arbitration, and stuck diagnostics. Silence alone is not a blocker signal.
  • Implement the two-stage done / idle split and the operator-acknowledgement mechanism.
  • Expand control-channel API: session.create, session.kill, session.title, events.
  • Implement event stream: session-started, session-ended, agent-state-changed.
  • Implement per-tab status roll-up (blocked > done > working > idle > unknown) on the host side, consuming events from the control channel.
  • Remove tmux from the derived image (docker/construct/Dockerfile).
  • Replace all docker exec tmux … call sites in the host CLI with control-channel calls.
  • Update console session panel to consume agent-state-changed events rather than polling.
  • Update attention prompts to subscribe to blocked state events rather than doing PTY polling from the host.

session.attach is intentionally not on the control channel — attach is the persistent binary attach channel defined in “Wire protocol”. The first attempt accidentally collapsed the two into one socket; the rewrite separates them.

Phase 4 — Daemon integration and desktop app bridge (deferred)

  • Daemon connects to each running container’s socket and maintains a live session index across all containers.
  • Desktop app reads from the daemon’s aggregated view; each jackin-capsule socket is the per-container data source. The multiplexer IS the server; the desktop companion reads what is happening inside the container through it.
  • Advanced commands: session snapshots, resource usage per session, log streaming.
  • See jackin’ daemon and jackin’ Desktop Agent Hub.
  • Agent Orchestrator Research Program — Herdr is evaluated there as the strongest prior art for the multiplexer vision. The full comparative table (Herdr vs. jackin’ values) lives in the research overview.
  • Console agent session control — Phase 4 of that item unblocks once Phase 2 of this item ships: the binary exposes live session state, eliminating manifest-snapshot reconciliation.
  • Agent runtime status authority — the agent-state-changed event stream from Phase 3 is the delivery mechanism for blocked/working/done/idle/stuck indicators in the console and hardline.
  • Agent attention prompts: blocked events replace PTY polling from the host; done events trigger the “ready for review” notification path.
  • jackin’ daemon: jackin-capsule is the per-container endpoint the daemon subscribes to. Phase 4 of this item and the daemon’s container-watch phase are designed together.
  • jackin’ Desktop Agent Hub — Phase 3 of this item is the prerequisite: the desktop app’s live session view is driven by jackin-capsule socket events aggregated by the daemon.
  • jackin-remote — remote containers will need a jackin-capsule socket accessible over the tunnel; the socket mount and auth model need to account for the remote path.