jackin' Capsule: In-Container Control Plane
Status: Partially implemented — Phases 1–3 shipped, Phase 4 (host daemon integration and Desktop Agent Hub bridge) remains open. Phase 3 landed the jackin’ Capsule: an in-container PTY control plane built on the vt100 crate with a Zellij-style dirty-row renderer, a tmux-style prefix-key model (Ctrl+B opt-in via JACKIN_PREFIX, including prefix Ctrl+L clear-pane), a persistent PID 1 daemon that exits cleanly when the last session ends, binary tag+length attach framing on the hot path, single-client takeover, mode-state restore on focus swap, OSC passthrough (52/9/2/8) gated to the focused pane, top chrome with a brand pill and hover-lifted tab strip, and a bottom branch/PR context bar for non-default branches with hover-lifted click targets for GitHub context and container details.
The shipped-Phase-3 design and operator-visible behaviour now live in:
- jackin’ Capsule (architecture, lifecycle, distribution, input routing)
- Multiplexer design rules (passthrough contract, verification checklist)
- `commands/console.mdx` and `commands/hardline.mdx` (operator-facing prefix keys, palette, tabs)
The remainder of this page is the original Phase 1–3 design rationale, kept for the design-history record. Some lower sections intentionally describe the problem statement and rejected intermediate designs in past-tense context. The canonical “what the Capsule does today” pages are the three links above.
Problem
Before the Capsule rewrite, the container supervisor was a bash wait loop (docker/runtime/supervisor.sh, deleted alongside this Phase) that kept the container alive while agent sessions ran via docker exec. It had two immediate limitations and one deeper architectural gap.
Last-session cleanup did not fire (resolved in Phase 1). When a tmux session exited, the supervisor kept the container running. Because the container state was Running rather than Stopped, the host jackin cleanup path (finalize_foreground_session, container teardown, DinD/network/certs removal) did not fire automatically. The operator had to run jackin eject explicitly to clean up a container after all sessions had ended.
Session inventory requires shelling out. Querying which sessions are active requires docker exec <container> sh -c 'tmux list-sessions ...' from the host. There is no structured interface for the host CLI or the future jackin’ daemon to ask “what is running in this container right now?”
tmux is not designed for the container-per-agent model. tmux has no concept of agent state, no structured control plane, and no event stream. Every interaction requires docker exec tmux ... round-trips. jackin’ cannot ask “is this agent blocked or still working?” without reading raw terminal output from the outside. This is the root reason observability, attention prompts, and desktop app integration are hard to build on top of the current architecture. A purpose-built process inside the container — one that owns the session lifecycle from the start — can expose exactly the information jackin’ needs over a structured protocol.
Vision
Each Capsule-managed role container runs jackin-capsule as PID 1. It is the Capsule control plane: it manages PTY sessions directly (replacing tmux), tracks session state, renders the in-container multiplexer, and exposes a Unix socket API that the host CLI, the future jackin’ daemon, and the jackin’ desktop app can drive.
The container becomes a self-contained server. The operator’s terminal, the console TUI, and the desktop companion all talk to the same control plane:
- Spawn a session (new agent or shell) — without shelling into the container
- Kill a session — terminate a specific agent or shell
- Query status — which sessions are running, which agent is blocked, which is done and waiting for review
- Get session title — the current terminal title or process name for each session
- Attach to a session — connect a PTY client to a running session
- Subscribe to events — session started, session ended, agent state changed
This makes jackin-capsule the data source for all future observability features: attention prompts, live agent state in the console, the desktop app’s session panel, and the daemon’s reactive instance index.
Design
Binary name and location
The binary is named jackin-capsule and lives at /jackin/runtime/jackin-capsule inside every Capsule-managed role container. It replaces the deleted docker/runtime/supervisor.sh bash wait loop as PID 1.
Workspace member: crates/jackin-capsule/. Produces one Linux binary. The crate is independent from the main jackin crate — it shares protocol shapes through jackin-protocol, so the host and Capsule compile against the same control-channel request and reply types.
Distribution
The binary is compiled in CI for the container target (linux/amd64, linux/arm64) and published to the jackin GitHub Releases alongside the main CLI binary. The version is always pinned to that of the jackin CLI that built the derived image; the two are released together.
Before the derived image build, jackin resolves the jackin-capsule binary for the target architecture (JACKIN_CAPSULE_BIN override, cache hit, or GitHub Release download), writes it to the build context, and the derived Dockerfile installs it at /jackin/runtime/jackin-capsule. The download is cached per jackin version.
PID 1 responsibilities
As PID 1, jackin-capsule must:
- Reap zombie children. Processes whose parent exits become children of PID 1. The binary must call `waitpid(-1, WNOHANG)` in a loop on `SIGCHLD` to reap them. `tokio` does not do this automatically.
- Forward signals cleanly. `SIGTERM` from `docker stop` → exit 0. `SIGINT` → exit 0. Both trigger the session-end path.
- Never crash on unexpected input. PID 1 death kills the entire container. All error paths either log and continue or exit deliberately.
Unix socket interface
The binary listens on a Unix domain socket at /jackin/run/jackin.sock inside the container. The host bind-mounts a per-instance socket directory:

    docker run ... -v ~/.jackin/sockets/<container-name>:/jackin/run ...

The daemon creates jackin.sock inside that directory, so the host-side path is ~/.jackin/sockets/<container-name>/jackin.sock.
Protocol: The same socket carries two channel types selected by the first byte. The control channel uses 0x00, a 4-byte big-endian length prefix, and a JSON body for one-shot status and snapshot requests. The attach channel is persistent and uses tag-plus-length-prefix binary framing for hot-path PTY bytes. The two channels share the socket so the host needs only one mount. See “Wire protocol” under “Multiplexer architecture (Phase 3)” below for the binary frame layout.
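A minimal sketch of the control-channel framing described above, assuming one selector byte followed by a single length-prefixed JSON body per request. The function names and the `max_len` cap are illustrative assumptions, not the shapes defined in `jackin-protocol`.

```rust
use std::io::{self, Read, Write};

/// First byte that selects the control channel, per the protocol paragraph.
const CONTROL_CHANNEL: u8 = 0x00;

/// Writes one control request: selector byte, 4-byte big-endian length, JSON body.
fn write_control_request<W: Write>(w: &mut W, json_body: &str) -> io::Result<()> {
    if json_body.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "body too large"));
    }
    let len = json_body.len() as u32;
    w.write_all(&[CONTROL_CHANNEL])?;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(json_body.as_bytes())
}

/// Reads one control request and returns its JSON body as text.
/// `max_len` is a hypothetical safety cap; the page does not define one.
fn read_control_request<R: Read>(r: &mut R, max_len: u32) -> io::Result<String> {
    let mut selector = [0u8; 1];
    r.read_exact(&mut selector)?;
    if selector[0] != CONTROL_CHANNEL {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "not a control channel"));
    }
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf);
    if len > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "body exceeds cap"));
    }
    let mut body = vec![0u8; len as usize];
    r.read_exact(&mut body)?;
    String::from_utf8(body).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Because both helpers are generic over `Read` and `Write`, the same code could run against a `UnixStream` on the host side or an in-memory buffer in tests.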
Session management API
The original target API was:
| Method | Phase | Description |
|---|---|---|
| `status` | 2 | List all sessions with name, agent type, created-at, and state |
| `session.create` | 3 | Spawn a new agent or shell session |
| `session.kill` | 3 | Terminate a session by ID |
| `session.title` | 3 | Read the current terminal title or process name for a session |
| `session.attach` | 3 | Return a PTY attachment handle so the client can connect |
| `events` | 3 | Upgrade connection to a streaming event channel |
Deferred event stream: session-started, session-ended, all-sessions-ended, agent-state-changed {session_id, state}. The shipped Phase 3 control channel exposes one-shot status and snapshot; streaming state belongs to Phase 4 daemon integration.
Agent state model
Each session tracks one of four states. These states are inferred from PTY output activity and foreground process state; no agent hooks or configuration are required.
| State | Meaning |
|---|---|
| `working` | Output flowing or foreground process actively running |
| `blocked` | Silent for N seconds with a foreground process present, waiting for operator input |
| `done` | Work finished; the operator has not yet reviewed the output |
| `idle` | Reviewed or no work in progress |
The two-stage done / idle split is important: a done slot should not be automatically refilled by the autonomous task queue or cleaned up from the console until the operator has acknowledged the output. This distinction drives the “ready for review” indicator in the desktop app and the dispatch logic in future autonomous queue work.
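The state table can be read as a small decision function. The sketch below is one possible reading, not the shipped inference: the `Observation` fields are hypothetical names, and treating "no foreground process" as "work finished" is a simplifying assumption.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Working,
    Blocked,
    Done,
    Idle,
}

/// Signals the daemon could observe without agent hooks (field names are hypothetical).
struct Observation {
    /// Time since the PTY last produced output.
    silent_for: Duration,
    /// Whether a foreground process is present in the session.
    foreground_running: bool,
    /// Whether output arrived that the operator has not acknowledged yet.
    unreviewed_output: bool,
}

/// Maps one observation to a state. `blocked_after` is the "N seconds" threshold.
fn infer_state(obs: &Observation, blocked_after: Duration) -> AgentState {
    if obs.foreground_running {
        if obs.silent_for >= blocked_after {
            AgentState::Blocked
        } else {
            AgentState::Working
        }
    } else if obs.unreviewed_output {
        // Stays Done until the operator acknowledges, so the slot is not refilled early.
        AgentState::Done
    } else {
        AgentState::Idle
    }
}
```

Keeping `Done` and `Idle` as separate variants is what lets a dispatcher refuse to reuse a slot until the operator has acknowledged its output.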
Session lifecycle tracking
Phase 2 (Rust binary, tmux still present): the binary watches the tmux server socket at /tmp/tmux-<uid>/default via inotify. Socket deleted → all sessions ended → exit 0. This replaces the 1-second bash polling loop from Phase 1.
Phase 3 (multiplexer phase): the binary owns session lifecycle directly. No tmux. The shipped daemon removes exited panes, keeps other sessions running, and exits cleanly when no live sessions remain so the host cleanup path can tear down the role container, DinD sidecar, cert volume, and network. An earlier Phase 3 design considered keeping exited tabs around until SIGTERM; that was dropped because it weakened the last-session cleanup contract.
Why replace tmux
- No structured API: every tmux interaction from the host requires `docker exec tmux ...`, a subprocess round-trip with string output that must be parsed.
- No agent-state awareness: tmux reports session names and windows; it has no concept of whether the process inside is blocked, working, or done.
- No event stream: the host cannot subscribe to “a session ended” without polling.
- tmux adds binary size to the image, startup overhead, and a dependency jackin’ does not control.
A purpose-built multiplexer in jackin-capsule gives jackin’ a clean control plane: structured socket, typed events, agent-state inference, and no external process to coordinate with.
Multiplexer architecture (Phase 3)
This section defines the multiplexer architecture in detail. The first Phase 3 attempt is being replaced wholesale (every module is either rewritten or deleted), so this section is the spec the rewrite implements rather than a description of code that exists today.
Why the first attempt is being rewritten
The first attempt produced a working PID 1 binary that opens PTYs, draws a status bar, manages a binary pane tree, and frames messages over a Unix socket. The shape was right; the execution was not. Five categories of defect make the result unusable for modern TUI agents and force a ground-up rewrite rather than incremental fixes:
- Hand-rolled VT emulator without alt-screen support. `crates/jackin-capsule/src/terminal.rs` (deleted in the Phase 3 rewrite and replaced by `session.rs`) implements the `vte::Perform` trait, but its `csi_dispatch` ignores the `h` and `l` actions and never inspects the `?` intermediate, so every DEC private mode is silently dropped, including `\x1b[?1049h` (alternate screen), `\x1b[?25h/l` (cursor visibility), `\x1b[?1` (application cursor keys), `\x1b[?2004` (bracketed paste), and `\x1b[?1000-1006` (mouse modes). Modern TUIs enter alt-screen, expect application cursor keys, and round-trip bracketed paste; without them the visible pane is at best garbled and at worst blank. Repairing the emulator would mean reimplementing most of `vt100` or `alacritty_terminal`. A real VT state library is the right answer.
- Ctrl+J = 0x0A = line feed. `crates/jackin-capsule/src/input.rs` intercepts byte `0x0A` and opens the command palette. That byte is identical to a literal newline; any LF in pasted text, in canonical-mode input, or from a TUI emitting `\n` opens the palette and is consumed. Operators effectively cannot send Enter to the agent on any terminal that doesn’t strictly send `\r` for the return key. The fix is a tmux-style prefix-key state machine, not a single hardcoded byte.
- Daemon exits when all sessions die. `crates/jackin-capsule/src/daemon.rs` calls `std::process::exit(0)` once `sessions.values().all(|s| !s.alive)` becomes true. Because the daemon is PID 1, that one call brings down every other process in the container and ends the container itself. An agent that crashes during the first second of launch (common while iterating on roles) looks to the operator like “jackin’ keeps exiting on me.” The daemon must persist until `SIGTERM`.
- Mouse passthrough is a literal no-op. The `InputEvent::MousePress { .. }` branch in `crates/jackin-capsule/src/daemon.rs` outside row 0 reads `let _ = session;` and returns `None`; the mouse event is discarded. The branch above it correctly handles row-0 (tab strip) clicks. The body for “anywhere else” was never written.
- Hot-path framing is full-screen redraw, base64, and JSON. Every PTY chunk runs `compose_frame`, which rebuilds the entire `rows × cols` cell grid as ANSI, encodes the result with a hand-rolled base64 (also a violation of the “prefer libraries” rule in `AGENTS.md`), wraps the base64 in a JSON `ServerMsg::Output`, and length-prefixes the result. An 80×24 TUI redrawing at 60 Hz amounts to megabytes per second of structurally pointless re-encoding between two processes that share a kernel. The right shape is: send PTY bytes through almost unchanged on the hot path, and only run the full-screen compose path on tab switch, pane switch, or client reattach.
The same module also has several smaller defects: single-tab borders are always drawn, status-bar tabs append ●/○ after the click-region width is computed (so click targets drift), the reattach path drops the outbound channel sender, the PTY writer task uses rt.block_on(input_rx.recv()) inside spawn_blocking, and the client never propagates SIGWINCH. The five categories above are the load-bearing ones, though. The rewrite addresses all of them by changing the architecture, not by patching the existing modules.
Architectural reference: zellij
Zellij is the closest architectural reference for the rewrite. It is Apache-2.0 (we may study and adapt its design freely), it is written in Rust, and it solves the same problem at much larger scope. The pieces of zellij’s design jackin adopts:
- Client-server split over a Unix domain socket. The server owns all PTYs, all VT state, all tabs/panes; the client owns user input and rendering. Multiple clients can attach to one server. Detach and reattach leave PTYs running.
- Typed instruction bus. Inside the server, dedicated threads (PTY I/O, screen state, plugins) talk over MPSC channels carrying typed enums (`PtyInstruction`, `ScreenInstruction`). jackin’ server is single-threaded enough that the bus is overkill, but the typed-enum-over-channel shape is exactly what the existing tokio `select!` loop in `crates/jackin-capsule/src/daemon.rs` should converge on.
- Per-pane VT state, replayed on switch. The server keeps a `vt100::Screen` for every pane, including non-visible ones. PTY output is fed to the screen regardless of focus; switching tabs replays the target pane from the saved screen rather than asking the application to redraw.
- Binary protocol on the hot path. Zellij uses protobuf; jackin’ uses simpler tag-plus-length-prefix binary framing. The point is not protobuf; it is “don’t put base64-inside-JSON on the hot path.”
Zellij carries scope jackin explicitly does not need (multi-client collaboration, plugins via WASM, scrollback search, copy-mode regex, etc.). The rewrite cherry-picks the structural pieces and leaves the rest.
Concepts to borrow from herdr (license-safe restatement)
The Herdr study under “Prior art: Herdr” below lists the conceptual borrows for the agent-state model, status roll-up, notification suppression, sound escalation, blocking wait semantics, foreground-process ownership, screen heuristics, and semantic integration reports. The multiplexer rewrite adds three structural borrows on top of those, drawn from Herdr’s UI shape, not its code:
- Top-of-screen chrome. The top chrome renders `jackin'` on the left followed by one tab per active jackin session. Active tab gets a distinct graphite background plus white underline/bold treatment; inactive tabs are dimmed; tab labels include the rolled-up state glyph. Operators sharing a screen recognise the brand pill at a glance without confusing it for the selected tab.
- Empty initial state when no agent is preselected. When no initial agent argv is provided at daemon launch, the multiplexer can come up with the brand header, zero tabs, and a centred hint listing the agents from `/jackin/run/agent.toml` plus `Shell`. The operator picks one with the prefix key and the first tab spawns into that selection. When the host passes an initial agent argv, the daemon spawns the first tab with that agent automatically, matching the historical direct-into-agent UX.
- Per-tab “most urgent” state roll-up. A tab containing any `blocked` pane is `blocked`; otherwise any `done` pane makes it `done`; otherwise `working`; otherwise `idle`. Same urgency order herdr uses. This drives the tab-strip glyph and feeds the future `agent-state-changed` event stream. Once a pane reaches `blocked`, the attention glyph stays visible until explicit operator keyboard input reaches that pane; incidental PTY output is not enough to clear it.
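The "most urgent" roll-up in the last item reduces to taking a maximum once the states are ordered by urgency. A minimal sketch, assuming a hypothetical `TabState` enum whose declaration order encodes that urgency; the sticky attention glyph is left out.

```rust
/// Declared in ascending urgency so the derived `Ord` matches the roll-up rule:
/// idle < working < done < blocked.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum TabState {
    Idle,
    Working,
    Done,
    Blocked,
}

/// Returns the most urgent state among a tab's panes; a tab with no panes is idle.
fn roll_up(panes: &[TabState]) -> TabState {
    panes.iter().copied().max().unwrap_or(TabState::Idle)
}
```

Relying on declaration order keeps the rule in one place: reordering the variants is the only way to change which state wins.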
These are interface borrows. Herdr’s internal layout (workspaces, sidebar, pane-focus suppression rules, ghostty embedding, kitty graphics passthrough, plugin hooks) is not part of jackin’ scope and is not copied.
Tab and pane model
The model is deliberately a strict subset of tmux:
- A session is one running `jackin-capsule` daemon. There is exactly one session per jackin instance and it lives for the lifetime of the container.
- A tab is a named top-level container. Each tab has a label (the agent’s display name when first launched) and a pane tree. The tab strip in row 0 lists every tab in creation order; the operator switches with prefix bindings or by clicking.
- A pane is one PTY plus the `vt100::Screen` that mirrors it. A tab starts as a single pane; splitting a pane creates a sibling. The pane tree is a binary tree of `HSplit { left, right, ratio }` / `VSplit { top, bottom, ratio }` / `Leaf(pane_id)`. The existing tree in `crates/jackin-capsule/src/layout.rs` already has the right shape; the rewrite keeps it and fixes border-math edge cases (no border when a tab has a single leaf; bottom row math respects the status-bar offset).
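To make the pane tree concrete, here is a sketch of how such a tree could be flattened into one rectangle per leaf. The variant names follow the text; `Rect`, `split`, `layout`, and the rounding rule are assumptions for illustration, and borders are ignored entirely.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Rect {
    row: u16,
    col: u16,
    rows: u16,
    cols: u16,
}

enum PaneTree {
    Leaf(u32),
    /// Children sit side by side; `ratio` is the share of columns given to `left`.
    HSplit { left: Box<PaneTree>, right: Box<PaneTree>, ratio: f32 },
    /// Children are stacked; `ratio` is the share of rows given to `top`.
    VSplit { top: Box<PaneTree>, bottom: Box<PaneTree>, ratio: f32 },
}

/// Splits `total` cells into two parts by `ratio`, never exceeding `total`.
fn split(total: u16, ratio: f32) -> (u16, u16) {
    let first = (f32::from(total) * ratio.clamp(0.0, 1.0)).round() as u16;
    let first = first.min(total);
    (first, total - first)
}

/// Walks the tree and records `(pane_id, rect)` for every leaf, left/top first.
fn layout(tree: &PaneTree, area: Rect, out: &mut Vec<(u32, Rect)>) {
    match tree {
        PaneTree::Leaf(id) => out.push((*id, area)),
        PaneTree::HSplit { left, right, ratio } => {
            let (l, r) = split(area.cols, *ratio);
            layout(left, Rect { cols: l, ..area }, out);
            layout(right, Rect { col: area.col + l, cols: r, ..area }, out);
        }
        PaneTree::VSplit { top, bottom, ratio } => {
            let (t, b) = split(area.rows, *ratio);
            layout(top, Rect { rows: t, ..area }, out);
            layout(bottom, Rect { row: area.row + t, rows: b, ..area }, out);
        }
    }
}
```

Passing an `area` that starts at row 1 is one way to respect the status-bar offset mentioned above; a real implementation would also subtract border cells when a tab has more than one leaf.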
The model has no window layer between tab and pane even though tmux does. Tmux’s window concept exists primarily because tmux predates the modern tabs-and-splits idiom; jackin’ starts from that idiom and does not need the historical layer. The model has no workspace layer above session even though herdr does. Workspaces only make sense when one server hosts many projects at once; jackin’ server hosts exactly one container.
VT state via vt100
The hand-rolled emulator in crates/jackin-capsule/src/terminal.rs (deleted in the Phase 3 rewrite and replaced by session.rs) gives way to vt100, a maintained Rust crate explicitly designed for “tmux-like” use cases. vt100 gives the rewrite all of the following for free:
- Full DEC private mode handling: alt-screen, cursor visibility, application cursor keys, bracketed paste, mouse modes.
- A `Screen` type with scrollback, attributes (fg, bg, bold, italic, underline, reverse, blink, strike, dim), and cursor state.
- `Screen::contents_formatted()` and `Screen::contents_diff()`, which emit ANSI for whole-screen origins. jackin’ studied them, but pane offsets, borders, dimming, selection, and overlays require a compositor-aware row serializer in `crates/jackin-capsule/src/render.rs`.
- `Screen::set_size(rows, cols)`, with correct resize semantics including alt-screen and reflow handling.
The Apache-2.0 license is compatible with jackin-capsule’s Apache-2.0. The crate is well-maintained, pulls in no surprising transitive dependencies, and is the canonical choice in the Rust ecosystem for this job, which is exactly the pattern AGENTS.md’s “Prefer libraries over hand-rolled parsers” rule demands.
jackin’ temporarily pins a forked vt100 commit while upstreaming Screen::clear_scrollback() and CSI 3J support. That behavior belongs in vt100 because the crate owns the in-memory grid and scrollback state, exposes Screen::set_scrollback, and advertises itself for screen / tmux-style applications. jackin’ should not fake scrollback erasure in the compositor or use RIS (ESC c) because both approaches conflate terminal state with pane rendering.
alacritty_terminal is a heavier alternative offering scrollback search and selection state. jackin’ first rewrite does not need either; vt100 is the right scope.
Render model: dirty pane bodies and full-frame fallback
The shipped renderer is a Zellij-style dirty-output layer on top of vt100, not a raw PTY-byte passthrough. Directly forwarding active-pane bytes was rejected because pane content must be offset into an inner rectangle, clipped against borders, dimmed under dialogs, and kept consistent with scrollback and selection state. vt100::Screen::contents_diff was also rejected for the hot path because it assumes a whole-screen origin and cannot express jackin’ pane chrome rules without wrapping every operation.
Hot path — dirty pane body. When any pane PTY emits bytes, the server:
- Feeds the bytes into the pane’s `vt100::Parser` so the `Screen` stays current. This is cheap and unavoidable; it is what makes tab switches work later.
- Marks that pane’s body dirty and lets the render ticker coalesce bursts at roughly 30 fps.
- Compares the current visible rows to `PaneBodyCache`’s last snapshot and emits only changed rows, each with an explicit cursor move and style reset.
- Repaints that pane’s border and scrollbar so scrollback count, title, and focus color stay synchronized without repainting unrelated pane bodies.
This is the path that runs during active agent output. It is not a byte-copy, but it avoids the old broad-pane/full-frame redraw and keeps the output surface bounded by rows that actually changed.
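The row-comparison step can be sketched as follows. `RowCache` is a hypothetical stand-in for `PaneBodyCache`, whose real shape is not shown on this page, and rows are assumed to be already serialized to ANSI strings.

```rust
use std::fmt::Write;

/// Hypothetical per-pane cache of the rows last sent to the client.
struct RowCache {
    rows: Vec<String>,
}

impl RowCache {
    /// Returns ANSI output that repaints only rows differing from the cached snapshot,
    /// then stores `current` as the new snapshot.
    /// `origin_row` and `origin_col` are the pane body's 1-based top-left host cell.
    fn emit_dirty(&mut self, current: &[String], origin_row: u16, origin_col: u16) -> String {
        let mut out = String::new();
        for (i, row) in current.iter().enumerate() {
            if self.rows.get(i) != Some(row) {
                // Explicit cursor move (CUP), style reset (SGR 0), then the row content.
                let _ = write!(
                    out,
                    "\x1b[{};{}H\x1b[0m{}",
                    origin_row as usize + i,
                    origin_col,
                    row
                );
            }
        }
        self.rows = current.to_vec();
        out
    }
}
```

The sketch does not clear rows that disappear when the pane shrinks; in the design above a resize is a full-frame redraw, so the partial path never has to handle it.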
Cold path — named full-frame redraw. When the operator switches tabs, switches panes, toggles zoom, opens the palette, splits a pane, attaches a fresh client, browses scrollback, paints selection, or hits a cache/overlay/style case where partial repaint is unsafe, the server runs the full compose path:
- Render the top chrome.
- For each visible pane in the active tab, serialize every visible row through `PaneBodyCache::render_full`.
- Render pane borders and scrollbars.
- Render dialog overlay if one is open.
- Position the host terminal cursor at the focused pane’s `Screen::cursor_position()` and honour `Screen::hide_cursor()`.
Every full redraw carries a FullRedrawReason in crates/jackin-capsule/src/daemon.rs (first-attach, resize, tab-switch, layout-change, split-close, zoom-change, scrollback-movement, pane-clear, dialog-change, selection-repaint, pane-cache-miss, unsafe-partial, and related chrome/style reasons). With JACKIN_DEBUG=1, the renderer logs full vs partial, reason, dirty panes, emitted rows, emitted bytes, and duration in microseconds.
Input model: prefix key
Input handling is a small state machine, not a hardcoded byte intercept:
- Idle state: every byte from the client is forwarded to the focused pane’s PTY, except the configured prefix (`Ctrl+B` by default, matching tmux; configurable via env var or future config field).
- Prefix-awaiting state: entered after the prefix byte arrives. The next key is consumed by the multiplexer and mapped to a command (see table below). If the next byte is the prefix again, a literal prefix byte is forwarded to the pane (`Ctrl+B Ctrl+B` → literal `Ctrl+B`). After one command key, state returns to Idle.
- Palette state: entered by `prefix + Space` or `prefix + :`; renders the centred command palette; up/down/enter/escape consumed locally; everything else forwarded to the palette controller.
- Dialog state: entered by palette actions that need a sub-pick (e.g. `New session` shows the agent picker). Same key handling as palette state.
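The first two states above fit in a few lines. This is a sketch only: `PrefixMachine` and `Action` are hypothetical names, the palette and dialog states are omitted, and keys are treated as single bytes.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Send this byte to the focused pane's PTY.
    Forward(u8),
    /// The multiplexer consumed this key as a command.
    Command(u8),
    /// The prefix was seen; nothing is emitted until the next key arrives.
    Pending,
}

#[derive(Clone, Copy)]
enum State {
    Idle,
    PrefixAwaiting,
}

struct PrefixMachine {
    prefix: u8,
    state: State,
}

impl PrefixMachine {
    fn new(prefix: u8) -> Self {
        PrefixMachine { prefix, state: State::Idle }
    }

    /// Feeds one input byte and reports what should happen to it.
    fn feed(&mut self, byte: u8) -> Action {
        match self.state {
            State::Idle if byte == self.prefix => {
                self.state = State::PrefixAwaiting;
                Action::Pending
            }
            State::Idle => Action::Forward(byte),
            State::PrefixAwaiting => {
                // One key after the prefix, then back to Idle.
                self.state = State::Idle;
                if byte == self.prefix {
                    Action::Forward(byte) // prefix twice means a literal prefix byte
                } else {
                    Action::Command(byte)
                }
            }
        }
    }
}
```

With `Ctrl+B` (byte `0x02`) as the prefix, a line feed (`0x0A`) is forwarded like any other byte in the Idle state, which is the property the hardcoded Ctrl+J intercept lacked.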
Default prefix bindings (subject to refinement during implementation):
| Key after prefix | Action |
|---|---|
| `c` | New tab (opens agent picker) |
| `n` / `p` | Next / previous tab |
| `0`–`9` | Jump to tab N |
| `"` | Split current pane horizontally (top / bottom) |
| `%` | Split current pane vertically (side by side) |
| `h` `j` `k` `l` or arrow keys | Move pane focus |
| `z` | Toggle zoom on focused pane |
| `x` | Kill focused pane (collapses tab if last pane) |
| `&` | Kill focused tab |
| `d` | Detach client (PTYs keep running) |
| `Space` or `:` | Open command palette |
| `Ctrl+L` | Clear focused pane scrollback and send form-feed to request redraw |
| `r` | Force redraw |
| `u` | (Phase 4) Open usage and quota overlay for the focused pane |
| Prefix again | Literal prefix byte to PTY |
Ctrl+J is not a default binding. It is reserved as an opt-in alternate prefix that experienced operators may enable when they accept the 0x0A collision risk; the default never assigns it.
Wire protocol
The protocol is split by purpose:
- Control channel: connection-per-request, newline-delimited JSON, for `status`, `session.create`, `session.kill`, `session.title`, `events`. This is what the host CLI and the future daemon use. Existing Phase 2 callers keep working.
- Attach channel: persistent connection per attached client, framed binary. Each frame is `[1-byte tag][4-byte big-endian length][payload]`. Tags cover `hello { rows, cols, spawn, env }`, `resize { rows, cols }`, `input_bytes`, `output_bytes`, `command { json }`, `welcome { session_count }`, `session_list { json }`, and `shutdown`. Payload of `input_bytes` and `output_bytes` is raw PTY bytes: no base64, no JSON nesting on the hot path. Payload of `command` is JSON only because the command vocabulary is open-ended; it is used at human keystroke rate, not 60 Hz.
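A sketch of the attach-channel frame layout with a decoder that tolerates partial reads. The numeric tag values used by any caller are hypothetical, since this page names the tags but does not assign numbers, and `encode_frame` / `try_decode` are illustrative names.

```rust
/// One decoded attach-channel frame.
#[derive(Debug, PartialEq, Eq)]
struct Frame {
    tag: u8,
    payload: Vec<u8>,
}

/// 1-byte tag plus 4-byte big-endian length.
const HEADER_LEN: usize = 5;

/// Encodes `[tag][len as u32, big-endian][payload]`. Payload bytes are copied as-is.
fn encode_frame(tag: u8, payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize, "payload exceeds u32::MAX");
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.push(tag);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Removes and returns the next complete frame from `buf`,
/// or returns `None` (leaving `buf` untouched) when more bytes are needed.
fn try_decode(buf: &mut Vec<u8>) -> Option<Frame> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    let total = HEADER_LEN.checked_add(len)?;
    if buf.len() < total {
        return None;
    }
    let tag = buf[0];
    let payload = buf[HEADER_LEN..total].to_vec();
    buf.drain(..total);
    Some(Frame { tag, payload })
}
```

A reader loop would append whatever the socket returns to `buf` and call `try_decode` until it yields `None`, so a frame split across two reads is handled without special cases.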
The hand-rolled base64 in crates/jackin-capsule/src/protocol.rs (split during the Phase 3 rewrite into protocol/attach.rs and protocol/control.rs) is removed. If any future feature ever needs base64 (e.g. embedding a binary blob inside a JSON event), the base64 crate is added — but no current code path needs it.
Resize, detach, reattach
- Resize. The client listens for `SIGWINCH` (via `tokio::signal::unix` or `signal-hook-tokio`), reads the new size with `crossterm::terminal::size`, and sends a `resize` frame. The server resizes every PTY in the affected layout via `MasterPty::resize`, calls `Screen::set_size` on each, and performs a named full-frame replay. The first attempt’s `Hello { rows, cols }` is preserved as the initial handshake; the missing piece is the runtime propagation.
- Detach. `prefix + d` makes the client send a `detach` frame and exit cleanly, restoring its own terminal. The server drops the client slot but keeps every PTY running. The host can `docker exec jackin-capsule` to spawn a fresh client which reattaches over the same socket.
- Reattach. A new client sending `hello` while one is already attached takes over; the server sends a `shutdown` frame to the old client (which restores its terminal and exits), then sends a named full replay of the active frame to the new client. Concurrent attach (multiple clients viewing the same session) is out of scope for the first rewrite; the takeover model is sufficient for jackin’ single-operator use case and avoids the multi-client cursor-and-input arbitration zellij has to solve.
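The takeover rule amounts to a single-occupancy slot. A minimal sketch, assuming clients are identified by a hypothetical numeric id; sending the `shutdown` frame and the full replay are left to the caller.

```rust
/// Holds at most one attached client.
#[derive(Default)]
struct AttachSlot {
    current: Option<u64>,
}

impl AttachSlot {
    /// Installs `new_client` and returns the displaced client, if any.
    /// The caller would send `shutdown` to the returned id, then replay to the new one.
    fn attach(&mut self, new_client: u64) -> Option<u64> {
        self.current.replace(new_client)
    }

    /// Clears the slot on detach, ignoring stale detaches from a client
    /// that has already been taken over.
    fn detach(&mut self, client: u64) {
        if self.current == Some(client) {
            self.current = None;
        }
    }
}
```

Checking the id in `detach` matters because the old client may still send its final frame after the new client has taken the slot.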
Status bar
The top rows of the host terminal are owned by the multiplexer. Layout left to right:

    jackin' [Claude ●] [Codex] [Shell]

The `jackin'` brand pill uses the bright-green background defined in crates/jackin-capsule/src/statusbar.rs (kept from the first attempt). Each tab is rendered as Label with the rolled-up state glyph appended before the click-region width is computed (the existing bug appends it after, so click regions drift on ●/○). The active tab uses bold white text on a lifted graphite background plus the white underline so it does not visually merge with the brand pill; inactive tabs use dimmed grey.
The palette and prefix shortcuts remain active, but the top chrome does not render a menu or keybinding hint; those columns stay available for tabs.
When the bar is wider than the host terminal, tabs past the right edge are clipped and an overflow indicator (›) is shown. Tab navigation via prefix bindings still works on clipped tabs.
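The click-region and overflow rules can be sketched together. This is an illustration under stated assumptions: `Tab` and `render_tabs` are hypothetical names, every character is assumed to occupy one terminal column, whole tabs are dropped rather than partially clipped, and styling is omitted.

```rust
struct Tab {
    label: String,
    glyph: Option<char>,
}

/// Renders the tab strip starting at 0-based column `start_col` on a terminal
/// `width` columns wide. Returns the text and each visible tab's click region
/// as `(start_col, end_col)` with the end exclusive.
fn render_tabs(tabs: &[Tab], start_col: usize, width: usize) -> (String, Vec<(usize, usize)>) {
    let mut out = String::new();
    let mut regions = Vec::new();
    let mut col = start_col;
    for (i, tab) in tabs.iter().enumerate() {
        // Build the full cell text first, glyph included, and only then measure it.
        let text = match tab.glyph {
            Some(g) => format!("[{} {}]", tab.label, g),
            None => format!("[{}]", tab.label),
        };
        let w = text.chars().count();
        let gap = if i == 0 { 0 } else { 1 };
        // Keep one column free for the overflow marker unless this is the last tab.
        let reserve = if i + 1 == tabs.len() { 0 } else { 1 };
        if col + gap + w + reserve > width {
            if col < width {
                out.push('›');
            }
            break;
        }
        if gap == 1 {
            out.push(' ');
        }
        col += gap;
        out.push_str(&text);
        regions.push((col, col + w));
        col += w;
    }
    (out, regions)
}
```

Measuring `text` after the glyph is appended is the point of the sketch: the click region and the painted cells come from the same string, so they cannot drift apart.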
Usage and quota overlay
Phase 4 work documented under Phase 3. The rendering contract is the status-chrome design (Phase 3 territory); the data source is the host daemon (Phase 4 territory). The subsection lives here so the status-bar layout decisions are colocated; nothing in this overlay ships in Phase 3.
Phase 4 should add a read-only usage overlay fed by the host daemon’s Token & Cost Telemetry cache. The Capsule does not poll providers and does not persist credentials. It asks the daemon only for the focused tab/pane’s provider account quota and workspace/session spend, then renders that focused-agent result inside the multiplexer. The compact signal belongs in the same low-noise status chrome as the branch/repository detail line, not as a second standalone widget.
Default shape:
- Status line: compact always-visible quota glyph for the focused tab/pane’s provider only, placed beside the branch/repository details when that line is present, e.g. `main · Codex 12%`, `feature/auth · Claude 80%`, `docs · Codex stale`, or `main · Amp login`. The multiplexer must not show a global account average or another agent’s quota while the operator is focused on one agent.
- Prefix `u` / palette `Usage`: modal dialog showing the focused agent’s provider, account label, quota window, remaining percentage, reset countdown, source confidence, cache freshness, current session tokens/cost, workspace tokens/cost, and burn-rate projection when available. The dialog may mention other accounts only through explicit navigation out of the focused-agent view; the default view is focused-provider only.
- Dialog rows must label provider-authoritative quota separately from locally estimated workspace spend. A local JSONL estimate must never be displayed as the enforced subscription limit.
- If the daemon is disconnected, the overlay shows `usage unavailable: daemon disconnected` rather than falling back to in-container polling.
- If the account needs authentication, the overlay shows `needs login` / `needs secret` with the repair action name, while the actual repair flow stays in `jackin console`, Desktop, CLI, or host bridge.
This keeps the multiplexer useful while the operator is focused inside an agent: before starting an expensive run in a Codex tab, they see Codex headroom only; before starting work in a Claude tab, they see Claude headroom only. The operator opens the dialog only when they need reset timing, confidence, or workspace spend detail.
Initial state
When the daemon starts:
- If the host passes an initial agent argv (the normal case driven by `jackin load`), the daemon spawns one tab running that agent. This matches today’s operator expectation that `jackin load the-architect .` immediately drops them into Claude Code.
- If no initial agent argv is provided and `/jackin/run/agent.toml` lists at least one agent, the daemon can come up empty (row 0 brand bar, no tabs, no panes) and render a centred picker hint listing each configured agent plus `Shell`. The first prefix-action or click chooses the first tab.
- If the launch config lists no agents, the daemon spawns a single `Shell` tab. This is the safe fallback for derived images that lack any agent CLI.
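The three startup cases above can be written as one decision function. A sketch under assumptions: `InitialState` and `initial_state` are hypothetical names, the agent list stands in for whatever is parsed from `/jackin/run/agent.toml`, and an empty argv is treated the same as no argv.

```rust
#[derive(Debug, PartialEq, Eq)]
enum InitialState {
    /// Spawn one tab running the given argv.
    SpawnAgent(Vec<String>),
    /// Come up empty and offer these choices in the centred picker hint.
    Picker(Vec<String>),
    /// No agents configured: fall back to a single Shell tab.
    ShellOnly,
}

fn initial_state(initial_argv: Option<Vec<String>>, configured_agents: &[String]) -> InitialState {
    match initial_argv {
        Some(argv) if !argv.is_empty() => InitialState::SpawnAgent(argv),
        _ if !configured_agents.is_empty() => {
            // The picker lists every configured agent plus Shell.
            let mut choices = configured_agents.to_vec();
            choices.push("Shell".to_string());
            InitialState::Picker(choices)
        }
        _ => InitialState::ShellOnly,
    }
}
```

Ordering the arms this way keeps the direct-into-agent path first, so a host that passes an argv never sees the picker.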
This pattern lets jackin load <role> . keep its existing direct-into-agent UX while leaving room for jackin console to launch the multiplexer with no preselected agent.
Module map
The rewrite produces this layout under `crates/jackin-capsule/src/`:
| Module | Role |
|---|---|
| `main.rs` | Mode selection (PID 1 → daemon, otherwise client subcommand) |
| `daemon.rs` | Server event loop: tokio select! over PTY events, attach-channel frames, control-channel requests, timer ticks |
| `client.rs` | Attach-client loop: raw mode, SIGWINCH, prefix-key state machine, attach-channel write |
| `pid1.rs` | SIGCHLD reaper thread (kept verbatim from the first attempt, which was correct) |
| `socket.rs` | Unix socket bind, control-channel accept, attach-channel accept |
| `protocol/control.rs` | Newline-delimited JSON request/response types for status, session.create, etc. |
| `protocol/attach.rs` | Tag-plus-length binary framing for the attach channel |
| `session.rs` | One PTY + one vt100::Screen + state-inference timer; Session::spawn, Session::feed_pty, mode/OSC passthrough state |
| `layout.rs` | Pane tree (kept, border math fixed) |
| `tab.rs` | Tab struct + tab strip ordering |
| `statusbar.rs` | Top-chrome renderer (brand pill, tabs, active-tab underline) |
| `input.rs` | Prefix-key state machine + mouse SGR re-encoder |
| `dialog.rs` | Command palette + agent picker (kept; revisit ergonomics in Phase 3b) |
Files deleted: terminal.rs (replaced by vt100 crate); base64 helpers in protocol.rs (replaced by binary attach channel).
Sub-phases of the Phase 3 rewrite
The rewrite lands as one cohesive PR (this one) because the defects are interlocked: fixing one without the others leaves the binary unusable for testing. The work still breaks into reviewable sub-phases. Each sub-phase passes `cargo nextest run` and `cargo clippy --all-targets --all-features -- -D warnings` on its own.
- Phase 3a: Replace VT. Drop `crates/jackin-capsule/src/terminal.rs`. Add `vt100` to `crates/jackin-capsule/Cargo.toml`. Wire each `Session` to its own `vt100::Parser` + `Screen`. Replace the old per-byte terminal emulator with a compositor-aware row serializer over `vt100` cells. Tab switch and reattach replay from the saved screen instead of `force_redraw`’s PTY-resize hack.
- Phase 3b: Prefix-key input. Replace the `0x0A` palette intercept and the standalone Alt+arrow shortcut with the prefix-key state machine described above. Default prefix `Ctrl+B`. The command palette still exists, opened by `prefix + Space`. Update `crates/jackin-capsule/src/input.rs` and the dialog dispatch in `crates/jackin-capsule/src/daemon.rs`.
- Phase 3c: Persistent server + binary attach channel. Remove the `process::exit(0)` on all-sessions-dead. On agent exit, mark the tab `exited`, keep the `Screen` for replay, allow `prefix + r` to respawn. Replace the JSON + base64 attach-channel output frames with the tag-plus-length binary frames defined above. Keep the existing control channel for `status`. Remove the hand-rolled base64 helper. Fix the reattach outbound-channel bug.
- Phase 3d: Resize, mouse, status bar polish. Add client-side `SIGWINCH` listener and `resize` frame. Fix the mouse passthrough no-op (re-encode in SGR mouse form and write to the focused PTY). Fix the status bar tab-label width drift. Keep command-palette shortcuts active without rendering a top-bar hint.
Phase 3a is the load-bearing one — it unblocks every TUI agent. Phase 3b removes the worst input bug (Enter being swallowed). Phase 3c makes the daemon robust enough to live inside a long-running container. Phase 3d is polish that the rewrite cannot ship without but that doesn’t change the architecture.
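To make the Phase 3c tag-plus-length framing concrete, here is a minimal sketch of an encoder and an incremental decoder. The real frame layout is defined in the wire-protocol section, not here, so everything specific in this block is an assumption: a 1-byte tag, a 4-byte big-endian length, and the tag values `TAG_OUTPUT` / `TAG_RESIZE`.

```rust
/// Hypothetical frame tags; the real values live in the wire-protocol definition.
pub const TAG_OUTPUT: u8 = 0x01;
pub const TAG_RESIZE: u8 = 0x02;

/// Assumed header: 1 tag byte followed by a 4-byte big-endian payload length.
const HEADER_LEN: usize = 5;

/// Encodes one frame as tag + length + raw payload (no base64, no JSON).
pub fn encode_frame(tag: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.push(tag);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Tries to split one complete frame off the front of `buf`.
/// Returns `(tag, payload, bytes_consumed)`, or `None` when more bytes are needed.
pub fn decode_frame(buf: &[u8]) -> Option<(u8, &[u8], usize)> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    let end = HEADER_LEN.checked_add(len)?;
    if buf.len() < end {
        return None;
    }
    Some((buf[0], &buf[HEADER_LEN..end], end))
}
```

A reader loop would append socket reads to a buffer, call `decode_frame` until it returns `None`, and drain the consumed bytes after each frame. PTY output travels as raw bytes, which is the point of dropping the base64 hot path.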
Tests required
- VT contract tests. Round-trip alt-screen entry and exit through `vt100::Screen` and assert that render replay after `\x1b[?1049h … \x1b[?1049l` reproduces the pre-alt-screen content. Smoke an entire Claude Code-style frame (alt-screen + bracketed paste + application cursor keys) and assert no information loss.
- Prefix state machine tests. Lone prefix byte → consumed, no PTY output. `prefix + prefix` → exactly one prefix byte forwarded. `prefix + c` → `NewSession` command emitted, no bytes forwarded. Plain LF and plain `Ctrl+L` in the input stream → forwarded to PTY (regression guard for the `Ctrl+J = 0x0A` bug and clear-pane passthrough). `prefix + Ctrl+L` → `ClearPane`.
- Persistence tests. Spawn a session, send it `exit\n`, observe `Exited` event, assert daemon is still running and the tab is marked `exited`. Send `SIGTERM` to the daemon, assert all PTYs are reaped and the daemon exits 0.
- Reattach tests. Connect a client, attach to a session, write some bytes that update the `Screen`, disconnect, reconnect. Assert the new client receives a full replay matching the screen state.
- Resize tests. Send a `resize` frame, assert every PTY is resized via `MasterPty::resize`, every `Screen` via `set_size`, and the active pane is replayed at the new dimensions.
- Mouse passthrough tests. Send an SGR mouse press in row 5 and assert it is re-encoded relative to the focused pane’s origin and arrives at the pane’s PTY. Send a press in row 0 and assert it is consumed by the tab-strip hit-test.
- Control-channel regression tests. The existing `status` command in `crates/jackin-capsule/src/client.rs` keeps working unchanged. The host-side smoke tests for the launch path continue to pass without modification.
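The prefix state machine cases in the list above can be exercised against a very small byte-level machine. This is a sketch under stated assumptions, not the code in `input.rs`: `PrefixMachine` and `Action` are illustrative names, the prefix is hard-coded to `Ctrl+B` (0x02), unbound keys after the prefix are swallowed, and escape-sequence boundary tracking is left out.

```rust
pub const PREFIX: u8 = 0x02; // Ctrl+B
pub const CTRL_L: u8 = 0x0C;

/// What the client should do with one input byte (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Write this byte to the focused pane's PTY.
    Forward(u8),
    NewSession,
    ClearPane,
}

#[derive(Default)]
pub struct PrefixMachine {
    /// True after a lone prefix byte, until the next byte arrives.
    armed: bool,
}

impl PrefixMachine {
    /// Feeds one byte; `None` means the byte was consumed with no output.
    pub fn feed(&mut self, byte: u8) -> Option<Action> {
        if !self.armed {
            if byte == PREFIX {
                self.armed = true;
                return None; // lone prefix: consumed, nothing reaches the PTY
            }
            // Everything else, including LF (0x0A) and Ctrl+L, passes through.
            return Some(Action::Forward(byte));
        }
        self.armed = false;
        match byte {
            PREFIX => Some(Action::Forward(PREFIX)), // prefix + prefix
            b'c' => Some(Action::NewSession),
            CTRL_L => Some(Action::ClearPane),
            _ => None, // unbound key after prefix: swallowed (assumption)
        }
    }
}
```

Because plain bytes are forwarded whenever the machine is not armed, LF can never be mistaken for a palette shortcut, which is the regression the `0x0A` guard protects against.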
Terminal compatibility: tmux setting parity
The first tmux integration carried five non-default option choices that the operator had pinned over time to keep agent TUIs working. The new multiplexer must reproduce the same observable behaviour from day one; without those equivalents, the rewrite would regress on real workflows the operator already depends on. Each tmux option below is paired with the explicit jackin-capsule behaviour that replaces it.
Extended keys (always-on, advertised to the agent).
`tmux set-option -s extended-keys always`
`tmux set-option -as terminal-features 'xterm*:extkeys'`
`extended-keys always` is required (not `on`) because agent TUIs do not emit the per-app activation escape that `on` waits for; without `always`, Shift+Enter and other extended-key combinations silently fail. `terminal-features 'xterm*:extkeys'` advertises extkeys support to anything reading terminfo via TERM=xterm*.
jackin’ equivalent. The attach channel transports raw bytes both directions, so the multiplexer never needs to negotiate extended keys at all — it forwards exactly what the outer terminal sends, including the kitty keyboard protocol (\x1b[>1u / \x1b[<u), the xterm modifyOtherKeys sequences (\x1b[>4;2m), and the CSI-u-encoded modifier-plus-key sequences. Pane PTYs use the stable TERM=xterm-256color baseline; the active attach client’s TERM, TERM_PROGRAM, and COLORTERM travel in the Hello handshake and gate outer-terminal enhancements separately. When a TUI queries kitty_keyboard or modifyOtherKeys via DECRQM, the response comes from the agent’s own behaviour, not from the multiplexer — the multiplexer must never strip or rewrite those sequences. The corresponding test asserts that a Shift+Enter keypress (CSI-u sequence \x1b[13;2u) round-trips unchanged from client stdin to the PTY.
Focus events.
`tmux set-option -g focus-events on`
Agents like Claude Code and Codex rely on focus-in / focus-out events (\x1b[I / \x1b[O) to pause animations, drop polling, and avoid stealing the operator’s attention when the operator switches to another window.
jackin’ equivalent. The client subscribes to focus-event reporting from the outer terminal on attach (\x1b[?1004h) and disables it on detach (\x1b[?1004l). Incoming focus events are routed to the focused pane in the active tab — and only to that pane; an unfocused tab’s agent should not believe it is focused. The server keeps a per-pane “outer-terminal-focused” flag and synthesises \x1b[I / \x1b[O when the operator switches tabs or splits panes, so each agent’s focus state matches what a human would expect when they look at that pane.
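The focus-routing rule can be sketched as two pure functions that compute which bytes to write to which pane. The names `PaneId`, `focus_switch`, and `outer_focus_event` are hypothetical, and two simplifications are assumptions: synthesised events are suppressed while the outer terminal itself is unfocused, and the check that a pane actually enabled focus reporting is omitted.

```rust
pub const FOCUS_IN: &[u8] = b"\x1b[I";
pub const FOCUS_OUT: &[u8] = b"\x1b[O";

/// Hypothetical pane identifier.
pub type PaneId = u32;

/// Bytes to write to pane PTYs when the focused pane changes from `old` to
/// `new` (tab switch or split). While the outer terminal is unfocused, no
/// pane should believe it is focused, so nothing is synthesised (assumption).
pub fn focus_switch(
    old: Option<PaneId>,
    new: Option<PaneId>,
    outer_focused: bool,
) -> Vec<(PaneId, &'static [u8])> {
    let mut writes = Vec::new();
    if old == new || !outer_focused {
        return writes;
    }
    if let Some(pane) = old {
        writes.push((pane, FOCUS_OUT));
    }
    if let Some(pane) = new {
        writes.push((pane, FOCUS_IN));
    }
    writes
}

/// Routes a focus event from the outer terminal to the focused pane only.
pub fn outer_focus_event(
    focused: Option<PaneId>,
    gained: bool,
) -> Option<(PaneId, &'static [u8])> {
    focused.map(|pane| (pane, if gained { FOCUS_IN } else { FOCUS_OUT }))
}
```

Returning the writes instead of performing them keeps the rule testable without a PTY.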
OSC passthrough (desktop notifications, progress bars, clipboard, titles).
`tmux set-option -g allow-passthrough on`
Without OSC passthrough, agent desktop notifications (\x1b]9;…\x07, and OSC 99 for kitty-style notifications), progress reports (OSC 9;4, the ConEmu progress sequence), clipboard writes (OSC 52), and window titles (OSC 0/1/2) are silently dropped at the multiplexer boundary.
jackin’ equivalent. OSC sequences emitted by an agent’s PTY are forwarded to the attached client only when the originating pane is the focused pane in the active tab — same rule herdr uses, and the right rule because two backgrounded agents both writing the window title would otherwise fight. The multiplexer parses each OSC payload enough to (a) capture OSC 0/1/2 into the per-pane title field (used by the tab strip and the session.title API), (b) forward OSC 52 unconditionally for the focused pane (clipboard intent is explicit and the user issued it), and (c) forward OSC 9/9;4/99 (notifications, progress) only for the focused pane to avoid notification spam from backgrounded agents. Non-recognised OSC sequences are passed through unchanged. The vt100 crate exposes an osc_dispatch hook that the multiplexer wires into this routing layer.
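As a rough illustration of the routing rule described above, the sketch below classifies an OSC payload (the bytes between `ESC ]` and the terminator) and decides what happens to it. `OscRoute`, `osc_code`, and `route_osc` are illustrative names, and the real routing layer may distinguish more cases than this.

```rust
/// Routing decision for one OSC sequence (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum OscRoute {
    /// OSC 0/1/2: always record as the pane title; forward only when focused.
    CaptureTitle { forward: bool },
    /// Pass the sequence to the attached client unchanged.
    Forward,
    /// Pane is in the background: do not emit anything to the client.
    Drop,
}

/// Extracts the numeric OSC code from a payload such as `52;c;aGk=`.
pub fn osc_code(payload: &[u8]) -> Option<u32> {
    let head = payload.split(|&b| b == b';').next()?;
    std::str::from_utf8(head).ok()?.parse().ok()
}

/// Applies the focused-pane gate: titles are captured for every pane, and
/// everything else (52, 9, 9;4, 99, unrecognised codes) is forwarded only
/// when the originating pane is focused.
pub fn route_osc(payload: &[u8], pane_focused: bool) -> OscRoute {
    match osc_code(payload) {
        Some(0..=2) => OscRoute::CaptureTitle { forward: pane_focused },
        _ if pane_focused => OscRoute::Forward,
        _ => OscRoute::Drop,
    }
}
```

Capturing titles even for background panes keeps the tab strip current without letting two agents fight over the outer window title.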
Zero escape disambiguation delay.
`tmux set-option -sg escape-time 0`
By default tmux waits after ESC to tell a bare ESC apart from the start of an escape sequence, and that delay causes misfires in vi-mode navigation inside agent TUIs (especially in the agent’s command palette).
jackin’ equivalent. The input parser keeps a short, configurable JACKIN_ESCAPE_TIME deadline for a bare ESC while preserving complete escape sequences across chunk boundaries. This matches tmux’s disambiguation model while avoiding the original failure mode where ESC [ split across socket reads became a literal Esc plus stray text.
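One way to picture the bare-ESC deadline is a small holder that withholds a trailing ESC until either more input arrives or the deadline passes. This is a sketch, not the actual input parser: `EscHold` is a hypothetical name, the timeout is passed in rather than read from `JACKIN_ESCAPE_TIME`, and only the split between ESC and the rest of the sequence is handled (a complete parser would also hold partially received CSI sequences).

```rust
use std::time::{Duration, Instant};

const ESC: u8 = 0x1b;

/// Withholds a trailing ESC so `ESC [` split across reads is not misparsed.
pub struct EscHold {
    timeout: Duration,
    held_since: Option<Instant>,
}

impl EscHold {
    pub fn new(timeout: Duration) -> Self {
        Self { timeout, held_since: None }
    }

    /// Feeds one read chunk and returns the bytes that are safe to parse now.
    /// A held ESC is re-joined with the new chunk before parsing.
    pub fn feed(&mut self, chunk: &[u8], now: Instant) -> Vec<u8> {
        if chunk.is_empty() {
            return Vec::new();
        }
        let mut out = Vec::with_capacity(chunk.len() + 1);
        if self.held_since.take().is_some() {
            out.push(ESC);
        }
        out.extend_from_slice(chunk);
        if out.last() == Some(&ESC) {
            // Might be the start of a sequence: hold it until the deadline.
            out.pop();
            self.held_since = Some(now);
        }
        out
    }

    /// Called on a timer tick; yields the held ESC as a literal Escape key
    /// once the deadline has passed.
    pub fn poll(&mut self, now: Instant) -> Option<u8> {
        match self.held_since {
            Some(since) if now.duration_since(since) >= self.timeout => {
                self.held_since = None;
                Some(ESC)
            }
            _ => None,
        }
    }
}
```

Passing `now` explicitly keeps the timing deterministic in tests; the event loop would supply `Instant::now()` on each read and tick.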
Mouse support, including motion.
`tmux set-option -g mouse on`
Mouse clicks, drags, and scroll events must reach agent TUIs that opt into mouse reporting via \x1b[?1000h through \x1b[?1006h.
jackin’ equivalent. The client sets the outer terminal to “any-event tracking + SGR mouse” (\x1b[?1003h\x1b[?1006h) on attach and disables alternate-scroll translation (\x1b[?1007l) so wheel gestures arrive as mouse input instead of prompt cursor keys. The multiplexer parses each mouse event: top-chrome events hit-test against the tab strip / menu button, bottom context-row events open GitHub or container details, pane-border events drive resize, and pane-content events are re-encoded relative to the focused pane’s rect origin and written to that pane’s PTY only when the pane opted into mouse reporting. Wheel events in mouse-disabled normal-screen panes drive jackin’ scrollback when vt100 or inline-scroll capture has retained history, including rows preserved before normal-screen clear/redraw and top-anchored scroll-region movement; that same retained history is the only source for pane scroll chrome, for both agent-spawned and shell-spawned panes. Empty-history wheel input stays local to jackin’ rather than becoming cursor-key input. Wheel events in mouse-disabled alternate-screen panes use cursor-key fallback so full-screen programs keep owning their interaction surface. Mouse mode and encoding are tracked with vt100::Screen::mouse_protocol_mode() and mouse_protocol_encoding(), so SGR, default xterm/X10, and UTF-8 mouse modes are distinct on the PTY side.
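The pane-relative re-encoding step for SGR mouse events can be sketched as follows. `Rect`, `parse_sgr_mouse`, and `reencode_for_pane` are hypothetical names, the rect origin is assumed to be 0-based screen cells, and the check that the pane opted into mouse reporting is left to the caller.

```rust
/// Pane content rectangle in 0-based screen cells (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct Rect {
    pub x: u16,
    pub y: u16,
    pub cols: u16,
    pub rows: u16,
}

/// Parses `ESC [ < b ; x ; y M` (press) or `... m` (release).
/// Returns `(button, x, y, is_press)` with 1-based coordinates.
pub fn parse_sgr_mouse(seq: &[u8]) -> Option<(u16, u16, u16, bool)> {
    let body = seq.strip_prefix(b"\x1b[<")?;
    let (&last, params) = body.split_last()?;
    let press = match last {
        b'M' => true,
        b'm' => false,
        _ => return None,
    };
    let text = std::str::from_utf8(params).ok()?;
    let mut fields = text.split(';').map(|p| p.parse::<u16>().ok());
    let (button, x, y) = (fields.next()??, fields.next()??, fields.next()??);
    if fields.next().is_some() {
        return None; // more than three parameters is not an SGR mouse report
    }
    Some((button, x, y, press))
}

/// Re-encodes a screen-relative event so it is relative to the pane origin.
/// Returns `None` when the event falls outside the pane (for example on the
/// top chrome), so the caller can hit-test it elsewhere.
pub fn reencode_for_pane(seq: &[u8], pane: Rect) -> Option<Vec<u8>> {
    let (button, x, y, press) = parse_sgr_mouse(seq)?;
    // SGR coordinates are 1-based; convert to 0-based, then to pane-relative.
    let col = x.checked_sub(1)?.checked_sub(pane.x)?;
    let row = y.checked_sub(1)?.checked_sub(pane.y)?;
    if col >= pane.cols || row >= pane.rows {
        return None;
    }
    let final_byte = if press { 'M' } else { 'm' };
    Some(format!("\x1b[<{};{};{}{}", button, col + 1, row + 1, final_byte).into_bytes())
}
```

With the pane starting on screen row 1 (directly below the top chrome), a press on screen row 5 becomes pane row 5 in 1-based SGR terms, while a press on row 0 yields `None` and falls through to the tab-strip hit-test.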
The setting parity above is a regression contract: each tmux behaviour the operator pinned has a matching jackin-capsule behaviour. The tests under “Tests required” assert each contract separately so future refactors cannot drop one silently.
Ghostty compatibility
The operator runs jackin inside Ghostty, the outer terminal. Ghostty’s feature set is wider than xterm’s, so the multiplexer must transport Ghostty’s richer keyboard, color, and graphics protocols intact rather than fall back to a lowest-common-denominator subset.
Kitty keyboard protocol is the load-bearing one. Ghostty implements the kitty keyboard protocol (CSI-u sequences with disambiguated modifiers, full key release reporting, all-event mode). Modern agent TUIs query for it on startup with \x1b[?u. The multiplexer must:
- Forward the agent’s `\x1b[>{flags}u` and `\x1b[<{n}u` push/pop sequences to the outer terminal unchanged, so Ghostty switches the keyboard protocol level to what the agent asked for.
- Forward Ghostty’s responses (encoded keys, including modifier-plus-key combinations that legacy terminals cannot encode at all, such as `Ctrl+Shift+Tab` and `Alt+/`) to the agent’s PTY unchanged.
- Not intercept any byte inside a CSI-u sequence. The prefix-key state machine recognises the configured prefix byte at sequence boundaries only.
True-color and color queries. Ghostty supports 24-bit color and replies to \x1b]10 / \x1b]11 (foreground/background queries) and \x1b]4;<n>;?\x07 (palette queries). Agents query these to theme themselves to the outer terminal’s palette. The multiplexer forwards both directions verbatim and uses vt100::Screen’s 24-bit color cell model so attribute fidelity is preserved through tab-switch replay.
Kitty graphics protocol. Ghostty supports the kitty graphics protocol (APC escape sequences for inline images). Agents that render diagrams or screenshots inline emit \x1b_G…\x1b\\. The multiplexer treats APC sequences exactly like OSC: forward unchanged for the focused pane only; ignore for backgrounded panes (an image emitted by a background pane should not appear in another pane’s region). The vt100 parser delivers APC payloads via the hook / put / unhook callbacks; the multiplexer captures the bytes and re-emits them to the client when that pane is the focused pane being rendered. APC sequences are explicitly not “replayed” on tab switch — only what the agent currently has on screen matters, and Ghostty handles redraw of already-emitted images itself.
Bracketed paste. Ghostty wraps pasted text in \x1b[200~ / \x1b[201~ when the agent has enabled bracketed paste (\x1b[?2004h). The multiplexer must forward these wrappers untouched to the pane’s PTY. This is one of the DEC private modes the first hand-rolled VT silently dropped; vt100::Screen::bracketed_paste_mode() tracks the agent’s intent and the multiplexer respects it on the input direction.
Synchronised output (BSU/ESU). Ghostty supports the “synchronised output” mode (\x1b[?2026h/l — also called BSU/ESU). Agents that draw in many small writes wrap each frame in synchronised-output markers so the outer terminal renders the frame atomically and the operator does not see partial paints. The multiplexer must forward both directions of these markers when the focused pane is the active one. For non-focused panes the markers are still consumed by vt100’s parser (so the Screen updates atomically internally) but no client output is emitted, which is the desired behaviour anyway — backgrounded panes have no client surface to paint.
OSC 52 clipboard. Ghostty supports OSC 52 clipboard writes. The multiplexer forwards them per the OSC passthrough rules above (focused pane only). This gives agents that copy code blocks or commands a real clipboard hand-off — without it, the “copy to clipboard” button on Claude Code’s output silently does nothing.
OSC 8 hyperlinks. Ghostty renders \x1b]8;;<url>\x1b\\ clickable. Forwarded unchanged for the focused pane.
Theme query (OSC 10/11/4) and dark/light mode (DCS $q… for tmux is irrelevant here; Ghostty exposes mode via OSC). Forwarded unchanged. The multiplexer does not interpret the outer terminal’s theme; agents that want to adapt query directly.
Sixel graphics. Not currently used by the in-scope agent TUIs but Ghostty supports it. Treated identically to kitty graphics: forwarded for the focused pane, ignored for backgrounded panes, not replayed on switch.
The unifying rule across every Ghostty feature is: the multiplexer’s job is to be a transparent pipe for whatever escape sequences the agent and the outer terminal negotiate, while owning the top chrome and the prefix-key boundary. Anything else it tries to interpret will eventually break a feature Ghostty supports that jackin’ does not yet know about. The vt100 crate’s job is to keep enough state to replay panes on switch and detect activity for the agent-state heuristic — it is not an oracle for what to pass through, and the multiplexer should not gate features on vt100 understanding them.
A regression test runs an end-to-end smoke against Ghostty’s reported features: emit \x1b[?u (kitty keyboard query) from a synthetic PTY, capture the client output, assert the byte sequence reaches the outer-terminal-facing stream unchanged. Repeat for OSC 52, OSC 9, OSC 8, BSU/ESU, and the kitty graphics protocol header. The test does not require an actual Ghostty instance — it only asserts the bytes survive the multiplexer round-trip.
What does not change
- The host-side `docker run` mount of `/jackin/run/<instance-id>.sock:/jackin/run/jackin.sock`. The socket location, ownership, and lifetime are unchanged.
- The derived image build pipeline that downloads or builds `jackin-capsule` and installs it at `/jackin/runtime/jackin-capsule`. `src/bin/build_jackin_capsule.rs` stays as-is.
- The Phase 2 `status` API on the control channel. Existing host callers do not need to change.
- The `/jackin/run/agent.toml` launch-config contract between the host launcher and the daemon, plus the initial-agent argv contract from the host launcher into PID 1.
- The PID 1 zombie-reaping logic in `crates/jackin-capsule/src/pid1.rs`.
- The four-state agent-state model (`working` / `blocked` / `done` / `idle`) and its roll-up rules.
Host-side changes per phase
Phase 2: add the socket mount to docker run; update inspect_agent_sessions in src/runtime/attach.rs to connect to the socket instead of running docker exec tmux list-sessions.
Phase 3: remove tmux from the derived image (docker/construct/Dockerfile); replace all docker exec tmux ... call sites in src/runtime/attach.rs and src/runtime/launch.rs with socket API calls.
Prior art: Herdr
Herdr (ogulcancelik/herdr) is the closest public reference for the multiplexer server concept. It is a single Rust binary, built for exactly the same problem: managing multiple AI coding agents with per-project workspace grouping, four-state status tracking, a Unix socket API, and session persistence across client detach.
Herdr cannot be embedded in or linked from jackin’: it is AGPL-3.0, which conflicts with jackin’ Apache-2.0 license. Any reimplementation in jackin-capsule must be written from scratch. Ideas, concepts, and algorithms are not protected by copyright; the code that expresses them is.
The deeper architectural difference: Herdr wraps bare host processes. jackin’ wraps Docker containers. Herdr’s status heuristics read foreground process state and PTY output from the agent’s own terminal. When the agent runs inside a container and Herdr sees docker attach rather than the agent, the heuristics degrade. jackin-capsule runs inside the container — it sees the agent’s PTY output directly, making the same heuristic approach reliable.
Concepts to implement independently
Two-stage done state. Herdr distinguishes Done (work finished, not yet reviewed) from Idle (reviewed or empty). jackin’ AgentStatus model should adopt the same split for the same reasons: the autonomous task queue must not refill a done slot until the operator acknowledges, and the desktop app needs a distinct “ready for review” surface. Whether this lands as a ReadyForReview enum variant or as Idle + a separate acknowledgement flag is a design detail for Phase 3.
Workspace-level status roll-up. Herdr’s workspace sidebar shows the most urgent child status at the workspace level — one blocked agent makes the whole workspace show Blocked. The attention-priority order jackin’ should use is blocked > done > working > idle > unknown: an unseen finished pane is ready for the operator now and should outrank merely background working panes. The console Instances panel should adopt this roll-up rule once agent-state events from Phase 3 are available.
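The attention-priority order translates directly into a "most urgent child wins" reduction. In this sketch `AgentState`, `urgency`, and `roll_up` are illustrative names, and treating an empty workspace as `Unknown` is an assumption.

```rust
/// Agent states used for roll-up (hypothetical enum mirroring the text).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentState {
    Blocked,
    Done,
    Working,
    Idle,
    Unknown,
}

impl AgentState {
    /// Higher value = more urgent: blocked > done > working > idle > unknown.
    fn urgency(self) -> u8 {
        match self {
            Self::Blocked => 4,
            Self::Done => 3,
            Self::Working => 2,
            Self::Idle => 1,
            Self::Unknown => 0,
        }
    }
}

/// Rolls child states up to one workspace-level state.
/// An empty workspace is reported as `Unknown` (assumption).
pub fn roll_up(children: &[AgentState]) -> AgentState {
    children
        .iter()
        .copied()
        .max_by_key(|state| state.urgency())
        .unwrap_or(AgentState::Unknown)
}
```

Keeping the ranking in one function means the console, the tab strip, and the host side cannot drift apart on what "most urgent" means.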
Notification suppression when already looking. Herdr suppresses sound and toast notifications when the relevant pane is focused. jackin’ attention prompts design should include the equivalent: if the console currently has that workspace row focused, downgrade or skip the OS notification. The precise condition needs design (console selection is coarser than Herdr’s pane-level focus).
Sound escalation as opt-in. Herdr ships silent toast by default, with opt-in sound escalation after N seconds. This pattern is validated by Herdr’s live usage and should stay in the attention prompts V1 design.
Blocking wait semantics on the socket. Herdr’s socket API lets callers block until a status transition (herdr wait agent-status 1-1 --status done). This is a better interface than polling for automation scripts and for the daemon’s event subscription. The events stream in Phase 3 should support the same pattern: a subscriber blocks on the stream and receives the event when the transition occurs.
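Blocking-wait semantics can be illustrated in-process with a mutex and condition variable. This is only a sketch of the idea: the real interface is a socket event stream, and `StatusCell`, `Status`, and `wait_for` are hypothetical names. The wait is level-triggered, so a waiter observes the current state and may miss a transient state that was set and replaced before it woke.

```rust
use std::sync::{Arc, Condvar, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Working,
    Blocked,
    Done,
    Idle,
}

/// Shared status that callers can block on (hypothetical type).
#[derive(Clone)]
pub struct StatusCell {
    inner: Arc<(Mutex<Status>, Condvar)>,
}

impl StatusCell {
    pub fn new(initial: Status) -> Self {
        Self { inner: Arc::new((Mutex::new(initial), Condvar::new())) }
    }

    /// Publishes a new status and wakes every blocked waiter.
    pub fn set(&self, status: Status) {
        let (lock, cvar) = &*self.inner;
        *lock.lock().unwrap() = status;
        cvar.notify_all();
    }

    /// Blocks the calling thread until the status equals `target`.
    /// Returns immediately if it already does.
    pub fn wait_for(&self, target: Status) -> Status {
        let (lock, cvar) = &*self.inner;
        let guard = cvar
            .wait_while(lock.lock().unwrap(), |current| *current != target)
            .unwrap();
        *guard
    }
}
```

A subscriber thread calls `wait_for(Status::Done)` and sleeps without polling; the daemon side calls `set` on each transition.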
Layered state authority. Herdr detects agent state by combining foreground process state, visible terminal-screen signals, and semantic integration reports. jackin-capsule will use the same concept — with the advantage that it runs inside the container and reads the agent’s real PTY and process tree directly rather than through a docker attach wrapper. Output activity remains a weak working signal, but silence alone must not mean blocked; the dedicated agent runtime status authority owns the full arbitration model.
What not to borrow from Herdr
- Herdr as the session substrate. Herdr manages bare-host PTY processes. `jackin-capsule` needs a PTY multiplexer that runs inside a container, is statically linked, and has no host-side dependencies. Herdr’s internals are the wrong shape for this.
- Herdr’s theme system. jackin’ TUI uses a fixed palette. Themes are not a priority.
- Herdr’s SSH remote tunneling. Herdr’s transparent PTY forwarding over SSH has a different threat model than the explicit operator-controlled tunneling planned for jackin-remote.
- Herdr’s wire protocol verbatim. The socket API vocabulary should be designed for jackin’ data model. Herdr is inspiration for the `wait` semantics, not a drop-in spec.
Implementation plan
Phase 1 — Cleanup gap closed (shipped — bash supervisor + host-side teardown)
Historical pre-Capsule state: containers used tmux as the session layer. That temporary arrangement used docker exec tmux new-session to create sessions, docker exec tmux attach-session to reconnect, and docker exec tmux list-sessions to query what was running. The bash supervisor wrapped tmux from the outside and had no understanding of what happened inside sessions. It worked as a minimal path, but gave jackin’ no visibility into agent state, no structured event stream, and no way to manage sessions without shelling into the container.
What shipped in Phase 1: docker/runtime/supervisor.sh (since deleted) polled tmux list-sessions every second rather than watching the socket file disappear. Using tmux list-sessions was robust to stale socket files left behind when tmux crashed or was killed without cleanup — a plain [ -S socket ] file-existence check would loop forever on a dead socket. When tmux list-sessions returned non-zero (server gone or no sessions), the supervisor exited 0. A 60-second startup grace period waited for the socket file to appear before entering the monitor loop, preventing a premature exit before the first docker exec tmux new-session created it.
Two companion fixes in the host-side Rust code were required to make the cleanup path actually fire. src/isolation/finalize.rs (finalize_foreground_session) now calls has_tmux_sessions when the container is still Running after docker exec returns: an empty session list means supervisor lag, not a detach, so the code falls through to finalize_clean_exit and sweeps isolation worktrees normally instead of returning Preserved. src/runtime/launch.rs now tears down the DinD sidecar and Docker network in every branch where the container has exited — including crashes (non-zero exit) — rather than preserving them for a jackin hardline reconnect that cannot work without a live DinD. When docker exec returns and the container is still Running with an empty session list, teardown fires immediately without a redundant re-query (the session check in finalize_foreground_session already confirmed no sessions are present). Only a genuinely live detach (container still Running AND sessions confirmed present) keeps the DinD alive for reconnect.
Why a purpose-built multiplexer made sense: researching Herdr — a Rust terminal multiplexer purpose-built for AI coding agents — confirmed that the missing piece was a session layer that understands AI agents natively. Herdr tracks four agent states (blocked / working / done / idle) with zero configuration, exposes a Unix socket API for session control, and streams status-change events. It is designed for bare host processes (not containers) and is AGPL-3.0, so it cannot be embedded in jackin’. But the core concept is exactly right: a multiplexer that knows about agents, not just terminals. jackin-capsule is that in-container multiplexer/control plane, where it can see the agent’s PTY output directly and expose it through a Unix socket that the host and desktop app both drive.
Phase 2 — Rust binary skeleton + Unix socket status command (structured session inventory)
- Create `crates/jackin-capsule/` workspace member.
- Implement PID 1 bootstrap: zombie reaping via `SIGCHLD`, `SIGTERM` / `SIGINT` → clean exit.
- Implement tmux socket watch via `inotify` (replaces bash polling). Exit 0 when socket deleted.
- Implement Unix socket listener and `status` command.
- Add CI job: cross-compile for `linux/amd64` and `linux/arm64`, publish to GitHub Releases.
- Add socket mount to `docker run` in `src/runtime/launch.rs`.
- Update `inspect_agent_sessions` in `src/runtime/attach.rs` to query socket instead of `docker exec tmux list-sessions`.
- Update `hardline --inspect` and console session inventory to use socket path.
- Remove `docker/runtime/supervisor.sh` (pre-release; no migration shim needed).
Phase 3 — In-container multiplexer (replace tmux — rewrite in flight)
The first Phase 3 attempt landed PID 1 ownership, the PTY layer, a binary pane tree, a status bar, a Unix socket attach channel, and the Dockerfile changes that drop tmux from the derived image. That work is preserved where it is sound (the layout tree, the SIGCHLD reaper, the host-side mount changes, the build pipeline). The pieces that are unusable for modern TUI agents (the hand-rolled VT, the 0x0A palette shortcut, the daemon’s exit-on-last-session, the mouse no-op, the JSON-plus-base64 hot path) are rewritten against the architecture defined in “Multiplexer architecture (Phase 3)” above. The rewrite lands as one cohesive change in the active Phase 3 pull request, broken into four reviewable sub-phases:
- Phase 3a: Replace the VT. Delete `crates/jackin-capsule/src/terminal.rs` (deleted in the Phase 3 rewrite and replaced by `session.rs`). Add `vt100` to `crates/jackin-capsule/Cargo.toml`. Hold one `vt100::Parser` plus `vt100::Screen` per `Session`. Switch the tab/pane switch and reattach paths to full replay from the saved screen. Honour `Screen::hide_cursor` and `Screen::cursor_position` when positioning the host cursor.
- Phase 3b: Prefix-key input. Replace the `0x0A` palette intercept with the prefix-key state machine. Default prefix `Ctrl+B`. Default bindings as listed in “Input model: prefix key” above. Add the “Escape forwards immediately” rule (zero-delay disambiguation).
- Phase 3c: Persistent server + binary attach channel. Remove the `std::process::exit(0)` on all-sessions-dead. On agent exit, mark the tab `exited`, keep the screen for replay, allow respawn via prefix bindings. Replace the JSON-plus-base64 attach-channel output with the tag-plus-length binary format defined in “Wire protocol”. Drop the hand-rolled base64 from `crates/jackin-capsule/src/protocol.rs` (split during the Phase 3 rewrite into `protocol/attach.rs` and `protocol/control.rs`). Fix the reattach outbound-channel drop.
- Phase 3d: Resize, mouse, terminal compatibility, status bar polish. Add the client-side `SIGWINCH` listener and `resize` frames. Fix the mouse passthrough no-op. Wire focus events, OSC/APC passthrough rules, BSU/ESU, kitty-keyboard pass-through, and OSC 52 routing per “Terminal compatibility: tmux setting parity” and “Ghostty compatibility” above. Reassert a multiplexer-owned outer terminal title from the workspace and branch / pull request context. Poll local Git branch state separately from cached GitHub PR/check lookups so branch changes update the status bar quickly without aggressive `gh` calls. Fix the status bar tab-label width drift. Keep command-palette shortcuts active without rendering a top-bar hint.
Phase 3 also keeps the control-channel work the first attempt completed:
- Implement the agent runtime status authority: raw `working` / `blocked` / `idle` / `unknown`, derived `done`, foreground-process detection, semantic runtime reports, visible-screen signals, stale-report arbitration, and stuck diagnostics. Silence alone is not a blocker signal.
- Implement the two-stage `done` / `idle` split and the operator-acknowledgement mechanism.
- Expand control-channel API: `session.create`, `session.kill`, `session.title`, `events`.
- Implement event stream: `session-started`, `session-ended`, `agent-state-changed`.
- Implement per-tab status roll-up (`blocked > done > working > idle > unknown`) on the host side, consuming events from the control channel.
- Remove `tmux` from the derived image (`docker/construct/Dockerfile`).
- Replace all `docker exec tmux …` call sites in the host CLI with control-channel calls.
- Update console session panel to consume `agent-state-changed` events rather than polling.
- Update attention prompts to subscribe to `blocked` state events rather than doing PTY polling from the host.
session.attach is intentionally not on the control channel — attach is the persistent binary attach channel defined in “Wire protocol”. The first attempt accidentally collapsed the two into one socket; the rewrite separates them.
Phase 4 — Daemon integration and desktop app bridge (deferred)
- Daemon connects to each running container’s socket and maintains a live session index across all containers.
- Desktop app reads from the daemon’s aggregated view; each `jackin-capsule` socket is the per-container data source. The multiplexer IS the server; the desktop companion reads what is happening inside the container through it.
- Advanced commands: session snapshots, resource usage per session, log streaming.
- See jackin’ daemon and jackin’ Desktop Agent Hub.
Relationship to other roadmap items
- Agent Orchestrator Research Program: Herdr is evaluated there as the strongest prior art for the multiplexer vision. The full comparative table (Herdr vs. jackin’ values) lives in the research overview.
- Console agent session control: Phase 4 of that item unblocks once Phase 2 of this item ships: the binary exposes live session state, eliminating manifest-snapshot reconciliation.
- Agent runtime status authority: the `agent-state-changed` event stream from Phase 3 is the delivery mechanism for blocked/working/done/idle/stuck indicators in the console and hardline.
- Agent attention prompts: `blocked` events replace PTY polling from the host; `done` events trigger the “ready for review” notification path.
- jackin’ daemon: `jackin-capsule` is the per-container endpoint the daemon subscribes to. Phase 4 of this item and the daemon’s container-watch phase are designed together.
- jackin’ Desktop Agent Hub: Phase 3 of this item is the prerequisite: the desktop app’s live session view is driven by `jackin-capsule` socket events aggregated by the daemon.
- jackin-remote: remote containers will need a `jackin-capsule` socket accessible over the tunnel; the socket mount and auth model need to account for the remote path.