
Declarative Resource Limits per Agent

Status: Open — design proposal (Phase 1, Agent Orchestrator Research Program)

jackin’ runs every agent in a Docker container with whatever resource allocation the host gives it. On a developer laptop, six parallel agents can each spawn a cargo build, exhaust memory, OOM-kill the desktop, or starve each other for CPU. There’s no operator-facing control for this today; the operator’s only knob is “launch fewer agents.”

Docker exposes the right primitives (--memory, --cpus, --ulimit nofile=N, plus --memory-reservation for soft limits), but jackin’ doesn’t plumb any of them through.
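For illustration (values and image name hypothetical), the raw Docker invocation these flags map to looks like:

Terminal window
docker run --memory 16g --memory-reservation 12g --cpus 3.0 --ulimit nofile=16384:16384 <agent-image>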

multicode addresses this directly with four declarative fields: memory-high (soft limit, triggers reclaim), memory-max (hard limit, triggers OOM kill), cpu (quota as percentage), and nofile (FD ceiling on Apple-container backends).

Why this matters:

  • Parallel agents are unsafe today on resource-constrained hosts. This is a literal correctness gap: a runaway agent can take down the operator’s whole machine.
  • The autonomous queue (Phase 4) is unusable without it. Five queued agents with default-unlimited memory and CPU is a recipe for OOM kills the moment two of them happen to be running cargo build simultaneously.
  • Cross-backend resource translation is the right home for this design. Docker, Apple container, and the planned selectable sandbox backends each express limits differently — a declarative layer means each backend translates once.

Sources:

[isolation]
memory-high = "12 GiB" # soft limit; triggers cgroup memory.high
memory-max = "16 GiB" # hard limit; triggers OOM at this point
cpu = "300%" # 3 CPU cores worth of quota
nofile = 16384 # FD ceiling (Apple container only)

multicode parses these via the size crate (decimal 12 GB and binary 16 GiB both supported), expands shell variables, then maps them onto systemd-run --property MemoryHigh=... etc. — each backend has its own translator, but the config surface is uniform.
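To give a sense of that mapping (illustrative only; the exact properties multicode sets are its own business), the systemd-run side of the translation looks roughly like:

Terminal window
systemd-run --scope --property MemoryHigh=12G --property MemoryMax=16G --property CPUQuota=300% -- <agent-command>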

multicode also tracks runtime metrics that complement the limits: current RAM, CPU %, and, crucially, an OOM kill count (sampled from systemd’s memory pressure counters). When an agent gets OOM-killed, the operator sees it.

The right level for these fields is the role manifest, not the operator config or workspace config. Reasoning: limits scale with the toolchain (a Rust agent running cargo build needs more headroom than a Go agent), and the role is where toolchain choices live. Operator/workspace overrides can come later if a use case surfaces.

jackin.role.toml
version = "v1alpha2"
dockerfile = "Dockerfile"
[runtime.limits]
memory_high = "12 GiB" # soft (Docker --memory-reservation)
memory_max = "16 GiB" # hard (Docker --memory)
cpus = "3.0" # Docker --cpus (string for "1.5", "300%")
nofile = 16384 # Docker --ulimit nofile=N:N
[runtime.limits.oom]
preserve_state = true # don't auto-clean an OOM-killed instance
notify = true # surface OOM in console (depends on Phase 2 status)

memory_high is optional; when absent, the soft limit defaults to memory_max. nofile is optional and defaults to the host’s limit. cpus accepts both fractional and percentage forms ("3.0" and "300%" are equivalent).
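A minimal sketch of that parse/normalize step, assuming a Rust core (function names are hypothetical, not jackin’s actual API):

/// Parse "16 GiB" / "12 GB" / "16384" into bytes.
/// Binary suffixes (KiB/MiB/GiB) are powers of 1024; decimal (KB/MB/GB) powers of 1000.
/// e.g. parse_mem_size("12 GiB") == Some(12_884_901_888)
fn parse_mem_size(s: &str) -> Option<u64> {
    let s = s.trim();
    // Split at the first alphabetic character: "12 GiB" -> ("12 ", "GiB").
    let split = s.find(|c: char| c.is_ascii_alphabetic()).unwrap_or(s.len());
    let (num, unit) = s.split_at(split);
    let value: f64 = num.trim().parse().ok()?;
    let factor: u64 = match unit.trim().to_ascii_lowercase().as_str() {
        "" | "b" => 1,
        "kb" => 1_000,
        "mb" => 1_000_000,
        "gb" => 1_000_000_000,
        "kib" => 1 << 10,
        "mib" => 1 << 20,
        "gib" => 1 << 30,
        _ => return None,
    };
    Some((value * factor as f64) as u64)
}

/// Normalize "3.0" or "300%" to a fractional core count.
/// e.g. parse_cpus("300%") == Some(3.0) == parse_cpus("3.0")
fn parse_cpus(s: &str) -> Option<f64> {
    let s = s.trim();
    if let Some(pct) = s.strip_suffix('%') {
        pct.trim().parse::<f64>().ok().map(|p| p / 100.0)
    } else {
        s.parse::<f64>().ok()
    }
}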

Terminal window
jackin load <agent> --memory-max 8GiB --cpus 2.0

Operator override is a V1 nicety, not a config-file substitute. Useful for “this one launch is on a smaller machine.”

Each backend implements a ResourceLimits translator:

  • Docker (today): --memory, --memory-reservation, --cpus, --ulimit nofile=N:N. oom_score_adj if needed for preserve-state.
  • Apple container (when selectable backends ships): direct per-allocation limits.
  • systemd-run / bwrap (if it ever lands): cgroups properties.

A backend that can’t honor a declared limit (e.g. nofile on a container runtime that doesn’t expose it) emits a warning at launch and proceeds — not a hard error. Operators see the gap; the agent still runs.
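A minimal sketch of that seam, again assuming a Rust core (type and method names are hypothetical, not the shipped API):

/// Limits as parsed from [runtime.limits]; every field optional, absent = unlimited.
#[derive(Default)]
pub struct ResourceLimits {
    pub memory_high: Option<u64>, // bytes
    pub memory_max: Option<u64>,  // bytes
    pub cpus: Option<f64>,        // fractional cores
    pub nofile: Option<u64>,
}

/// Each backend turns the declarative limits into whatever its runtime
/// accepts, returning warnings for anything it cannot honor.
pub trait LimitTranslator {
    fn translate(&self, limits: &ResourceLimits) -> (Vec<String>, Vec<String>);
}

pub struct DockerTranslator;

impl LimitTranslator for DockerTranslator {
    fn translate(&self, l: &ResourceLimits) -> (Vec<String>, Vec<String>) {
        let mut args = Vec::new();
        if let Some(b) = l.memory_max {
            args.push(format!("--memory={b}b"));
        }
        if let Some(b) = l.memory_high {
            args.push(format!("--memory-reservation={b}b"));
        }
        if let Some(c) = l.cpus {
            args.push(format!("--cpus={c}"));
        }
        if let Some(n) = l.nofile {
            args.push(format!("--ulimit=nofile={n}:{n}"));
        }
        // Docker can honor all four fields, so no warnings here; a backend
        // without an nofile knob would push a warning instead of an arg.
        (args, Vec::new())
    }
}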

V1 scope:

  • [runtime.limits] block in jackin.role.toml with the four fields above.
  • --memory-max / --memory-high / --cpus / --nofile flags on jackin load for one-shot overrides.
  • Docker translator only in V1 — that’s the only backend.
  • size-crate-style parsing (binary and decimal); reuse a small parser rather than pulling in the dependency.
  • Defaults: no limits applied if the field is absent (matches today).
  • [runtime.limits.oom] block: preserve_state defaults to true, notify defaults to true (no-op until Phase 2 lands).

Deferred:

  • Per-workspace overrides. Manifest-level only in V1.
  • Disk I/O limits (--blkio-weight). Useful but harder to reason about; defer to user request.
  • Network bandwidth limits. Defer indefinitely.
  • Auto-pausing OOM-killed agents instead of killing the container. Docker doesn’t expose a clean way; revisit per-backend later.

Open questions:

  • Should cpus accept percentages explicitly? "3.0" is unambiguous for Docker. "300%" matches multicode but maps awkwardly to Docker (which doesn’t accept the percent sign). Recommended default: accept both at parse time, normalize to fractional core count internally.
  • Should manifest limits be inheritable across roles? If org/base declares memory_max=16GiB and org/derived extends it, does the derived role inherit it? Recommended default: yes, with override semantics — but role inheritance is a separate, larger design question and probably out of scope for V1.
  • OOM preserve_state interaction with worktree cleanup (the shipped per-branch safety policy described in Per-mount isolation). An OOM-killed instance should always be preserved. The cleanup helper already handles non-zero exits; OOM is a special case of that. Confirm the existing helper sees OOM as non-zero (one way to check on the Docker backend is sketched below).
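For that last check, the Docker backend can inspect the container after exit: an OOM-killed container typically exits 137 (SIGKILL), and State.OOMKilled flags the kernel OOM kill specifically (container name illustrative):

Terminal window
docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' <instance-container>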