
Orca ADE research

Status: Open — proposed research item (benchmarking reference for UX and worktree-native agent orchestration)

Document what Orca gets right as a desktop Agent Development Environment so jackin’ can borrow its strongest UX ideas without adopting its weaker isolation model.

Orca and jackin’ are solving adjacent problems. Orca is a native GUI app that treats the git worktree as the unit of agent isolation and wraps it in a rich multi-agent interface. jackin’ is a CLI/TUI orchestrator that treats the Docker container as the isolation unit and wraps that in a role-manifest system. The overlap is large enough that Orca’s design decisions are directly relevant to the jackin’ fleet-operations and operator-surface roadmap.

Orca (by Stably AI) is a desktop Agent Development Environment (ADE) whose core premise is: AI-assisted development means running many agents in parallel, and the tooling should be designed from the ground up for that. Its tagline is “Ship 100x With The Agent IDE.”

  • Language: TypeScript (97.6%) + Swift (macOS-specific native layer)
  • Architecture: Electron-style desktop app with WebGL terminal rendering, embedded Chromium per worktree, and a VS Code editor component
  • Agent runtimes: 30+ built-in CLI agents — Claude Code, Codex, Gemini, Grok, Amp, OpenCode, Cursor CLI, GitHub Copilot CLI, and others
  • License: MIT — fully open source, no SaaS account required, no data pipeline
  • Distribution: Homebrew cask, AUR, direct download (.dmg, AppImage, .exe), GitHub Releases
  • Companion: iOS + Android apps for remote monitoring and task management
  • Activity: ~3,100 GitHub stars as of May 2026; multiple releases per week; v1.4.x is the active series

Worktree-per-task isolation. Every task gets its own git worktree. Agents run in those worktrees independently, with no branch juggling, no stashing, and no cross-contamination between concurrent tasks. Multiple agents can work simultaneously on different features.
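The worktree-per-task pattern can be sketched with plain git. The example below is a minimal illustration, not Orca's actual implementation: it only builds the `git worktree add -b` command line for each task. The `Task` type, the `agent/<slug>` branch naming, and the directory layout are assumptions made for the example.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Task:
    slug: str            # short task identifier, e.g. "fix-login" (hypothetical)
    base: str = "main"   # commit-ish the task branch starts from


def worktree_argv(repo: Path, task: Task, root: Path) -> list[str]:
    """Build the `git worktree add` command for one task.

    Each task gets its own branch and its own directory under `root`,
    so concurrent agents never share a working tree.
    """
    branch = f"agent/{task.slug}"  # hypothetical naming scheme
    path = root / task.slug
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", branch,      # create a new branch for this task
        str(path),         # directory of the new worktree
        task.base,         # starting point for the branch
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for t in (Task("fix-login"), Task("add-search", base="develop")):
        print(" ".join(worktree_argv(Path("."), t, Path("../worktrees"))))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would create the worktree; an agent process would then be started with that directory as its working directory.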

Unified terminal management. Tabbed and paned WebGL-rendered terminals, one per agent session. Visual indicators for active/interrupted/stalled agents. Infinite scrollback with persistence across restarts and full-text search.

Diff annotation feedback loop. The single most distinctive feature. Operators add Markdown comments to specific diff lines, batch them, and send the batch back to the agent as structured feedback. Turns code review into a directed feedback queue rather than a copy-paste loop.

Embedded Chromium per worktree (Design Mode). A real browser window is associated with each worktree’s dev server. Agents can be asked to look at the live UI from inside Orca without the operator switching contexts.

Orca CLI (agent-driven control). Agents running inside Orca can call back into Orca programmatically — open a new pane, navigate, surface results. This inverts the standard model where the IDE drives the agent; here the agent can drive the IDE surface.

Built-in source control. Inline diff review, line-level annotation, commit workflow, and branch comparison without leaving Orca.

GitHub and Linear integration. PRs, issues, GitHub Actions checks, and Linear tasks are surfaced directly inside the worktree view, giving agents immediate access to linked context.

SSH remote worktrees. Agents can run on a remote machine with file editing, git support, port forwarding, and auto-reconnect. Surfaced ports appear in a ports panel.

Multi-account management. Claude Pro, Codex, Gemini, and other subscriptions tracked in one place with usage monitoring and rate-limit visibility.

Quick Commands surface. Palette of shortcuts for common actions (introduced in v1.4.21).

Mobile companion apps. iOS and Android for monitoring agents and managing tasks remotely.

  1. Right abstraction level. The worktree-per-task model matches how experienced git users want to work with parallel agents — complete isolation, zero noise between tasks.
  2. Desktop-native ambition. Most “AI coding tools” are VS Code extensions or web apps. Orca is a full native application with its own terminal stack, embedded browser, and mobile companion.
  3. No lock-in. Users bring their own subscriptions. MIT license. No data routed through Stably’s servers. This matters strongly in the “parallel agent fleet” use case.
  4. Active maintenance. Three releases in two days (May 22–23, 2026) with 30–50 changes each; issues get fixed quickly; release notes are detailed with contributor attribution.
  5. The diff annotation loop lands. Users who try it find it hard to give up because it turns code review into a directed, structured workflow.

Every borrowed idea must survive the values filter from the Agent Orchestrator Research program.

| Orca idea | Passes values filter? | Notes |
| --- | --- | --- |
| Worktree-per-task visual orchestration | Yes | jackin’ already has worktree/clone per-mount isolation; the UX around naming, status, and post-session review is the gap |
| Diff annotation → agent feedback batch | Yes | Pure UX; no isolation model change needed |
| Embedded Chromium per worktree | Partial | GUI-only; useful as a roadmap target for a future desktop companion, not a CLI feature |
| Orca CLI (agent-driven pane control) | Yes | Analogous to the operator handler system and agent tag protocol; agents signalling state rather than driving UI |
| 30+ built-in agent runtimes, zero config | Yes | Aligns with multi-runtime support goal; jackin’ should match this runtime breadth |
| Mobile companion (monitor, manage) | Yes | Lower priority; relevant to jackin’ Desktop Agent Hub |
| Host-native worktrees as isolation unit | No | The core isolation model is the reason jackin’ uses containers. Host-native agents run as the host user; the hard rule against host-side mutations blocks this as a default |
| Same-process agent and IDE surface | No | Orca runs agents directly on the host machine. The Capsule model is the reason jackin’ exists |

| Dimension | Orca | jackin’ |
| --- | --- | --- |
| Primary isolation unit | Git worktree | Docker container |
| What is isolated | Git state (working tree, branch) | Full runtime environment (filesystem, network, tools, credentials) |
| Agent execution context | Host machine, host user | Container; agent cannot reach host filesystem without explicit mount |
| Host mutation risk | High: agents run as host user | Low by design: host-side writes are hard-blocked |
| Security boundary | Process-level | Container-level |
| Configuration surface | GUI-first, no role/manifest concept | Role manifests (jackin.role.toml), config.toml, per-workspace config |
| Reproducibility | Worktrees share host-installed tools | Container image pins the full toolchain |
| Target workflow | Developer who trusts agents and wants parallel throughput | Operator who wants a clean, reproducible, auditable agent environment |
| Primary UI | Full desktop GUI with terminal embeds | CLI with jackin console as TUI |
| Agent runtime breadth | 30+ pre-configured, zero config | 5 built-in (Claude Code, Codex, Amp, Kimi, OpenCode); see multi-runtime |
| Diff review loop | Annotation → batch feedback to agent | Not yet in jackin’ |
| Mobile companion | iOS + Android | Not yet; see jackin’ Desktop Agent Hub |
| Port / service visibility | Ports panel per worktree | Not yet explicit; tracked in network egress policy |
| License | MIT, no SaaS | Apache 2.0, no SaaS |

These Orca patterns are UX-transferable to jackin’ without adopting its isolation model.

The workflow of marking up diff lines with comments and sending the batch to the agent as structured feedback is pure UX — it does not depend on where the agent runs. The operator handler system and the console’s session control surface are the right home for this. A future operator action could batch annotated diff hunks into a follow-up agent prompt inside the same container.
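One possible shape for the annotation batch is a list of per-line comments rendered into a single follow-up prompt. The sketch below is an assumption about how such a batch could look, not a description of Orca's format or of any existing jackin’ interface; `Annotation` and `build_feedback_prompt` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    path: str      # file the annotated diff line belongs to
    line: int      # line number in the new version of the file
    comment: str   # the operator's Markdown comment


def build_feedback_prompt(annotations: list[Annotation]) -> str:
    """Render a batch of annotations as one prompt, grouped by file and sorted by line."""
    by_file: dict[str, list[Annotation]] = {}
    for a in annotations:
        by_file.setdefault(a.path, []).append(a)

    parts = ["Review feedback (address every item):"]
    for path in sorted(by_file):
        parts.append(f"\n## {path}")
        for a in sorted(by_file[path], key=lambda item: item.line):
            parts.append(f"- L{a.line}: {a.comment}")
    return "\n".join(parts)


if __name__ == "__main__":
    batch = [
        Annotation("src/b.py", 10, "Handle the empty-list case."),
        Annotation("src/a.py", 7, "Rename `tmp` to something descriptive."),
        Annotation("src/a.py", 3, "This import is unused."),
    ]
    print(build_feedback_prompt(batch))
```

Grouping by file keeps the prompt stable regardless of the order in which the operator added comments, which makes the batch easier to diff and to replay.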

Orca’s main view is a table of worktrees with live agent status. The jackin’ console workspace tree is the analog. The gap is status richness: Orca shows active/interrupted/blocked; jackin’ shows limited instance state. This is exactly the agent runtime status gap.

Orca surfaces which ports a dev server published inside each worktree session. jackin’ should do the same for container-exposed ports, and it belongs as a section in the session contract and explain mode — “which ports are exposed and to what host address.”
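As a rough sketch of what a port-visibility section could consume, the parser below turns text in the `container_port/protocol -> host_address:host_port` form, which is the form `docker port <container>` prints, into structured records. How jackin’ would actually collect this data is not specified in this document; `PortMapping` and `parse_docker_port` are hypothetical names, and the sample text is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortMapping:
    container_port: int
    protocol: str
    host_address: str
    host_port: int


def parse_docker_port(output: str) -> list[PortMapping]:
    """Parse lines such as '3000/tcp -> 127.0.0.1:3000' into PortMapping records."""
    mappings = []
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        container_side, host_side = line.split(" -> ", 1)
        port, protocol = container_side.split("/", 1)
        # rsplit keeps IPv6 host addresses such as "[::]" intact.
        host_address, host_port = host_side.rsplit(":", 1)
        mappings.append(PortMapping(int(port), protocol, host_address, int(host_port)))
    return mappings


if __name__ == "__main__":
    sample = "3000/tcp -> 127.0.0.1:3000\n5432/tcp -> 0.0.0.0:55432\n"  # illustrative
    for m in parse_docker_port(sample):
        print(f"{m.container_port}/{m.protocol} exposed at {m.host_address}:{m.host_port}")
```

The `host_address` field is what answers the "to what host address" half of the question: a mapping bound to `0.0.0.0` is reachable from other machines, while `127.0.0.1` is local only.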

The Orca CLI gives agents programmatic control of the UI surface. In the jackin’ model, the equivalents are the agent tag protocol (structured tags that agents emit) and agent attention prompts (OS notifications when agents stall). These are the container-compatible analogs.

Orca’s zero-config 30+ agent support is the user expectation jackin’ should match in the built-in runtime list. Multi-runtime support tracks this.

The iOS/Android companion for remote monitoring and task management is an idea for the jackin’ Desktop Agent Hub roadmap item, which uses the daemon as its state/event backend.

Current gaps visible in Orca’s issue tracker and release notes (May 2026):

  • Self-hosted relay server for mobile-desktop connections — multiple open requests; critical for security-sensitive users
  • SSH Include directive support in SSH config import
  • Multi-select / batch operations on worktrees
  • Agent Teams / subagent visibility — when Claude Code spawns subagents (Agent Teams), they create hidden tmux sessions that don’t appear as native Orca panes
  • Subagent permission request UI — when a subagent awaits a permission prompt, the workspace status does not update; operators miss stalled agents
  • LSP integration — community-requested but not yet designed
  • Windows edge cases — Antigravity CLI key input does not propagate correctly inside Orca’s terminal on Windows
  • Workspace/worktree decoupling — some users want to create a workspace without immediately creating a worktree

Several of these gaps — subagent visibility, permission-stall detection — are exactly what the Capsule model can solve cleanly because the container boundary gives jackin’ a uniform observation point for agent lifecycle events.

Orca’s core bet is that worktrees are sufficient isolation. That works for a developer who trusts their agents and wants parallel throughput on one machine. It does not work for the jackin’ model where:

  • The operator’s host machine must not be mutated silently (hard rule in AGENTS.md)
  • Credentials must not leak between agents or sessions without explicit forwarding
  • The agent environment must be reproducible from a role manifest across machines
  • Isolation must be explained and auditable before launch (see session contract and explain mode)

Orca is richer in UX and lighter in isolation. jackin’ is the reverse. The UX ideas above are the ones that survive the transplant; the isolation model cannot be borrowed without collapsing the core product value.

  1. Diff annotation UX. Is the annotation-batch-to-agent pattern implementable as a console action without a GUI? What is the minimal operator surface — a file with annotated hunks, a structured prompt template, or something else?
  2. Runtime breadth. What is the blocking work to reach 10+ supported runtimes in jackin’? Does it require per-runtime launch config or can a generic exec-based launcher cover most cases?
  3. Subagent visibility. How does Orca’s hidden-tmux-session problem map to the Capsule model? jackin’ can observe container process trees; is that sufficient to detect and surface subagent sessions?
  4. Port panel. What is the minimum viable port-visibility surface for the console? A ports column in the workspace table? A dedicated pane? A session contract section?
  5. Self-hosted relay. Orca’s mobile-companion relay is a separate infrastructure concern. Does the jackin’ daemon design (see jackin’ daemon) make self-hosted relay a natural Phase 2 feature or an unrelated problem?
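For question 3, the tree walk itself is simple once a process listing is available. The sketch below assumes a list of `(pid, ppid, command)` rows, for example as could be gathered from `ps -eo pid,ppid,comm` inside the container, and returns everything running beneath a given agent process. Whether this is sufficient to identify subagent sessions is exactly the open question; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def descendants(
    processes: list[tuple[int, int, str]], root_pid: int
) -> list[tuple[int, str]]:
    """Return (pid, command) for every process below root_pid, depth-first."""
    children: dict[int, list[int]] = defaultdict(list)
    commands: dict[int, str] = {}
    for pid, ppid, command in processes:
        children[ppid].append(pid)
        commands[pid] = command

    found = []
    seen = {root_pid}  # guards against malformed listings that contain cycles
    stack = sorted(children[root_pid], reverse=True)
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue
        seen.add(pid)
        found.append((pid, commands[pid]))
        stack.extend(sorted(children[pid], reverse=True))
    return found


if __name__ == "__main__":
    # Illustrative listing: an agent (pid 10) that spawned tmux, which runs two children.
    listing = [
        (1, 0, "init"),
        (10, 1, "agent"),
        (11, 10, "tmux"),
        (12, 11, "agent-child-a"),
        (13, 11, "agent-child-b"),
        (20, 1, "sleep"),
    ]
    for pid, command in descendants(listing, 10):
        print(pid, command)
```

A process that daemonizes and re-parents itself would fall outside this walk, which is one reason the process tree alone may not be enough.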

This item extends Agent Orchestrator Research with a UX-first desktop tool. Orca does not fit neatly into Track A (fleet ops, multicode-style) or Track B (containment, Hazmat/Docker Sandboxes-style). It is a third reference point: what the user-facing layer looks like when the isolation concern is handed off to git instead of containers.

The strongest cross-references:

| This item | Related roadmap item |
| --- | --- |
| Diff annotation feedback loop | Operator handler system |
| Worktree/session status table | Agent runtime status, Console agent session control |
| Port panel per session | Session contract and explain mode, Network egress policy |
| Agent-driven callback | Agent tag protocol, Agent attention prompts |
| Runtime breadth | Multi-runtime support |
| Mobile companion | jackin’ Desktop Agent Hub |
| Subagent visibility | jackin-capsule: in-container control plane |

Research snapshot: May 2026.

  • stablyai/orca — active reference implementation
  • Orca releases — release cadence and feature log (v1.4.x active series as of May 2026)