Idle Runtime Cleanup Hooks
Status: Open — design proposal (Phase 4, Agent Orchestrator Research Program)
Problem
A long-running agent container accumulates state outside the agent’s own working set: a Java agent leaves Gradle daemons holding onto a few GB of heap; a Node agent leaves npm caches with thousands of file descriptors; the runtime itself caches build artifacts indefinitely. After a few hours of an idle agent sitting open, that’s real resource cost, both on disk and in long-lived processes the operator never asked for.
multicode addresses this with declarative idle hooks: after N seconds of
inactivity, run a cleanup command (gradle --stop, etc.); optionally
recycle the container.
Why It Matters
- The autonomous queue (Phase 4) keeps containers warm waiting for work. Without cleanup, those warm containers accumulate state every cycle.
- It’s a small feature with disproportionate impact for long-running fleets — exactly the workflow the program targets.
- It generalizes naturally to anything a role wants done while idle (commit checkpoint snapshots, push WIP branches, evict caches).
Inspiration in multicode
Sources:
- README: Autonomous queueing (the idle-runtime-cleanup, idle-runtime-cleanup-delay-seconds, idle-runtime-cleanup-interval-seconds, and idle-runtime-restart knobs are documented here)
- Config: config.toml [autonomous] block

    [autonomous]
    idle-runtime-cleanup = true
    idle-runtime-cleanup-delay-seconds = 300     # idle for 5 min before first cleanup
    idle-runtime-cleanup-interval-seconds = 900  # re-run every 15 min while still idle
    idle-runtime-restart = false                 # also recycle the container?

multicode runs gradle --stop and (optionally) terminates remaining Gradle daemon/worker processes inside Apple-container workspaces. Then, if idle-runtime-restart = true, it recycles the runtime entirely (stops the container, starts a new one).
The implementation watches the workspace status (their equivalent of
agent runtime status) and
fires when status has been Idle for the configured delay.
Recommended Shape
Generalize the concept: roles declare what to run when idle, the operator decides whether to enable it, jackin’s runtime supervisor fires the hook based on observed status.
Role config
    version = "v1alpha2"

    [runtime.idle]
    commands = [
      "gradle --stop",
      "find /home/agent/.cache -atime +1 -delete"
    ]
    delay_seconds = 300
    interval_seconds = 900
    restart_after_cleanup = false

commands is a list; each entry runs in sequence inside the agent container via docker exec. The entries are the role’s recommendation; the operator chooses whether to enable them.
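The sequential docker exec execution described above could look roughly like the following. This is a minimal sketch, not jackin’s implementation: the names HookOutcome, docker_exec_argv, wait_with_timeout, and run_idle_commands are hypothetical, wrapping each entry in sh -c is an assumption, and so is the policy of continuing after a failed command.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of one idle-hook command (hypothetical type).
#[derive(Debug, PartialEq)]
enum HookOutcome {
    Succeeded,
    Failed(Option<i32>), // exit code, if the process reported one
    TimedOut,
}

/// Build the argv that runs `cmd` through a shell inside `container`.
/// Using `sh -c` is an assumption; it lets entries be plain strings.
fn docker_exec_argv(container: &str, cmd: &str) -> Vec<String> {
    ["docker", "exec", container, "sh", "-c", cmd]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Wait for `child` by polling; kill and reap it once `timeout` elapses.
/// Returns Ok(None) on timeout.
fn wait_with_timeout(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed client process
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(50));
    }
}

/// Run every command in order, one at a time, recording each outcome.
/// Continuing past a failure is an assumed policy, not a stated one.
fn run_idle_commands(container: &str, commands: &[String], timeout: Duration) -> Vec<HookOutcome> {
    commands
        .iter()
        .map(|cmd| {
            let argv = docker_exec_argv(container, cmd);
            match Command::new(&argv[0]).args(&argv[1..]).spawn() {
                Err(_) => HookOutcome::Failed(None), // docker binary missing, etc.
                Ok(mut child) => match wait_with_timeout(&mut child, timeout) {
                    Ok(Some(status)) if status.success() => HookOutcome::Succeeded,
                    Ok(Some(status)) => HookOutcome::Failed(status.code()),
                    Ok(None) => HookOutcome::TimedOut,
                    Err(_) => HookOutcome::Failed(None),
                },
            }
        })
        .collect()
}
```

One caveat with this approach: killing the local docker exec client on timeout may not stop the process it started inside the container, so a real supervisor would likely need an additional in-container kill step.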
Operator opt-in
    # operator config
    [roles."the-architect"]
    enable_idle_hooks = true # default false

Defaults to off because:
- The hooks run inside someone else’s container based on the agent class’s declarations.
- Some operators want the warm state preserved.
- Misconfigured commands could break the running agent.
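The two-party gate (the role declares, the operator enables) can be expressed as a small pure function. The sketch below assumes hypothetical types IdleConfig and OperatorRoleConfig that mirror the two config blocks; only the field names come from this proposal.

```rust
/// Idle block declared by a role; fields mirror `[runtime.idle]`.
#[derive(Debug, Clone, PartialEq)]
struct IdleConfig {
    commands: Vec<String>,
    delay_seconds: u64,
    interval_seconds: u64,
    restart_after_cleanup: bool,
}

/// Per-role operator settings. `Default` yields `enable_idle_hooks = false`,
/// matching the "default false" behavior described above.
#[derive(Debug, Clone, Default, PartialEq)]
struct OperatorRoleConfig {
    enable_idle_hooks: bool,
}

/// Returns the idle config to act on, or None when hooks must not run:
/// either the role declared nothing or the operator has not opted in.
fn effective_idle_config<'a>(
    role: Option<&'a IdleConfig>,
    operator: &OperatorRoleConfig,
) -> Option<&'a IdleConfig> {
    role.filter(|_| operator.enable_idle_hooks)
}
```

Keeping the decision in one function means the supervisor never has to inspect the operator flag itself; it only arms a timer when this returns Some.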
Trigger
The supervisor watches the agent runtime status bus. When an instance has been Idle for delay_seconds continuously:
- Run commands[0], commands[1], … in sequence via docker exec. Each command gets a 60-second timeout (configurable).
- If restart_after_cleanup = true, eject and re-load the instance afterward (this preserves the data dir and only recycles the container).
- Record the cleanup in the persistent storage layer’s tool_history table (or a dedicated cleanup_history).
- Re-arm: the next cleanup fires interval_seconds later if the instance is still idle.
A status transition out of Idle cancels the pending cleanup.
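The arm, fire, re-arm, and cancel behavior can be modeled as a small timer that is fed status observations. This is a sketch under assumptions: Status and IdleTimer are hypothetical names, time is passed in as plain seconds from any monotonic clock, and the re-arm interval is measured from the moment the cleanup fires rather than from when it finishes.

```rust
/// Status values relevant to the trigger; the real status bus may carry more.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Idle,
    Busy,
}

/// Tracks when the next cleanup is due for one instance.
#[derive(Debug)]
struct IdleTimer {
    delay_seconds: u64,
    interval_seconds: u64,
    next_due: Option<u64>, // None while the instance is not idle
}

impl IdleTimer {
    fn new(delay_seconds: u64, interval_seconds: u64) -> Self {
        IdleTimer { delay_seconds, interval_seconds, next_due: None }
    }

    /// Feed a status observation taken at time `now` (seconds).
    fn observe(&mut self, status: Status, now: u64) {
        match status {
            // Arm only on entry to Idle, so repeated Idle reports
            // do not push the deadline back.
            Status::Idle => {
                if self.next_due.is_none() {
                    self.next_due = Some(now + self.delay_seconds);
                }
            }
            // Any transition out of Idle cancels the pending cleanup.
            Status::Busy => self.next_due = None,
        }
    }

    /// Returns true when a cleanup should run now, and re-arms the timer
    /// so the next one fires `interval_seconds` later if still idle.
    fn poll(&mut self, now: u64) -> bool {
        match self.next_due {
            Some(due) if now >= due => {
                self.next_due = Some(now + self.interval_seconds);
                true
            }
            _ => false,
        }
    }
}
```

With delay 300 and interval 900, an instance that goes idle at t=0 is cleaned at t=300 and again at t=1200; a Busy observation in between clears the deadline, and the full delay applies again on the next entry to Idle.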
Console visibility
The console resource panel (when open) shows “Last cleanup: 4m ago”
in the per-agent row. CLI: jackin status <selector> includes the
last-cleanup time.
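For illustration, the label could be derived from the seconds elapsed since the last recorded cleanup. The function name last_cleanup_label, the unit thresholds, and the "never" wording for an instance with no recorded cleanup are assumptions; only the "Last cleanup: 4m ago" form appears in this proposal.

```rust
/// Render the per-agent "Last cleanup" label from the elapsed seconds.
/// `None` means no cleanup has been recorded for this instance.
fn last_cleanup_label(elapsed_seconds: Option<u64>) -> String {
    match elapsed_seconds {
        None => "Last cleanup: never".to_string(),
        Some(s) if s < 60 => format!("Last cleanup: {}s ago", s),
        Some(s) if s < 3600 => format!("Last cleanup: {}m ago", s / 60),
        Some(s) => format!("Last cleanup: {}h ago", s / 3600),
    }
}
```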
Scope (V1)
- [runtime.idle] block on jackin.role.toml.
- Operator opt-in via enable_idle_hooks per role.
- Idle detection from the status bus.
- Sequential command execution via docker exec with per-command timeout.
- Optional restart_after_cleanup toggle.
- Console rendering of last-cleanup time.
- Cleanup history written to the persistent storage layer.

Deferred beyond V1:
- Idle hooks based on resource thresholds (“cleanup when memory > X”) in addition to time-based. Defer.
- Hooks for other states (busy, question). Idle-only in V1.
- Per-workspace override of idle config. Agent-class-level only in V1.
- Operator-defined ad-hoc cleanup commands (not in role). Defer; use jackin exec instead if needed.
- Notification on cleanup failure beyond a console toast. Defer.
Open Questions
- Default delay/interval values. multicode uses 300s/900s. Are those sensible jackin defaults, or should roles be more conservative? Recommended: match multicode for V1; tune from feedback.
- Cleanup during foreground operator attach. If the operator is actively jackin hardline’d into an idle session, should cleanup fire? multicode runs it regardless. Recommended: suppress when attached, because the operator’s about-to-type-something signal isn’t visible to the status adapter.
- Recovery from cleanup-broken-container. If restart_after_cleanup is true and the new container fails to come up, the operator’s session is unrecoverable from cleanup alone. Recommended: one retry, then mark instance failed and surface for operator intervention.
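The recommended recovery policy for the last question (one retry, then mark the instance failed) might be sketched as follows. RestartOutcome and restart_with_one_retry are hypothetical names, and the start closure stands in for whatever actually brings the new container up.

```rust
/// Result of a post-cleanup restart attempt (hypothetical type).
#[derive(Debug, PartialEq)]
enum RestartOutcome {
    Restarted { attempts: u32 },
    /// Caller should mark the instance failed and surface it to the operator.
    Failed { last_error: String },
}

/// Call `start` once, retry exactly once on failure, then give up.
fn restart_with_one_retry<F>(mut start: F) -> RestartOutcome
where
    F: FnMut() -> Result<(), String>,
{
    const MAX_ATTEMPTS: u32 = 2; // first try plus one retry
    let mut last_error = String::new();
    for attempt in 1..=MAX_ATTEMPTS {
        match start() {
            Ok(()) => return RestartOutcome::Restarted { attempts: attempt },
            Err(e) => last_error = e,
        }
    }
    RestartOutcome::Failed { last_error }
}
```

Returning the last error alongside the failure gives the console something concrete to show when it asks the operator to intervene.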
Related Files
- New module (e.g. src/runtime/idle.rs): supervisor
- src/manifest/mod.rs: [runtime.idle] schema
- src/runtime/launch.rs: wires supervisor into instance lifecycle
- The persistent-storage module: cleanup history table
- src/console/manager/state.rs: last-cleanup display
See Also
- Agent Orchestrator Research Program
- Agent runtime status — required idle signal
- Autonomous task queue — primary beneficiary (queue-warm containers cleaned regularly)
- Console resource panel — visibility surface