OpenClaw + tmux on Linux (Governed Orchestration)

This tutorial documents a practical version of the "agent swarm" workflow often shown on social media, adapted for AEEF controls and for small Linux servers.

It is useful when you want:

  • persistent long-running agent sessions over SSH
  • lightweight orchestration across multiple tasks
  • PR/CI-based monitoring without watching terminals
  • controlled concurrency on a resource-constrained VM

It is not a requirement for AEEF compliance. If you mainly run one task at a time, direct use of Codex or Claude Code may remain the better option.

What You Gain vs Direct Agent Use

The benefits of pairing an orchestrator (such as OpenClaw) with tmux are mainly operational; it does little for model quality:

  • Persistence -- tasks continue if your SSH connection drops
  • Parallel task handling -- run multiple isolated worktrees (within VM limits)
  • Mid-task steering -- inject corrective instructions into an active tmux session
  • Deterministic monitoring -- check PR/CI status from a task registry instead of polling model sessions
  • Repeatability -- the same launcher, task metadata, and "definition of done" for every task

Direct use of Codex/Claude Code is still better when:

  • the task is high-risk and needs close supervision
  • the repository is small and changes are infrequent
  • you do not want to operate background orchestration scripts
  • your infrastructure cannot safely isolate agent permissions

AEEF Control Mapping (Why This Tutorial Is Safe to Adopt)

This workflow is only acceptable under AEEF when combined with explicit controls:

| Concern | Required Control | AEEF Reference |
| --- | --- | --- |
| Autonomous changes merging silently | Human review before merge | PRD-STD-002 |
| Agent sprawl / hidden permissions | Agent contracts and bounded permissions | PRD-STD-009 |
| Broken code shipping faster | CI quality gates define completion | PRD-STD-007, PRD-STD-003 |
| Unsafe dependencies or scans skipped | Security + dependency scanning in CI | PRD-STD-004, PRD-STD-008 |
| Runaway cost and compute usage | Concurrency caps, retries, and cost controls | PRD-STD-012 |

For a typical small Linux VM (for example 4 vCPU, 16 GB RAM, limited swap), use this pattern:

  • 1 active coding agent as the default
  • 2 agents max only after measuring RAM, swap, and CI runtime
  • No production secrets in the agent host environment
  • CI as source of truth for done/not done
  • Worktrees for task isolation instead of multiple clones

Do not copy social-media examples that disable approvals/sandboxing on a shared server.

Architecture (Minimal, Practical)

You (SSH / Telegram / Slack)
  -> Orchestrator (OpenClaw or equivalent)
    -> Task Registry (JSON)
    -> Launcher Script
      -> git worktree + branch
      -> tmux session per task
      -> Codex / Claude Code process
    -> Monitor Loop (cron or systemd timer)
      -> tmux session alive?
      -> PR exists?
      -> CI status passed?
      -> Notify only when action needed

Step 1: Create a Local Orchestration Workspace

Add a repo-local folder for orchestration metadata and scripts:

mkdir -p .clawdbot scripts/agent-runner
# Initialise the registry as an empty JSON array, but never overwrite an existing one
[ -f .clawdbot/active-tasks.json ] || printf '[]\n' > .clawdbot/active-tasks.json

Recommended repo-local artifacts:

  • .clawdbot/active-tasks.json -- task registry (machine-readable state)
  • scripts/agent-runner/launch-task.sh -- worktree + tmux launcher
  • scripts/agent-runner/check-tasks.sh -- deterministic monitor
  • scripts/agent-runner/cleanup-tasks.sh -- worktree/session cleanup

Step 2: Define a "Definition of Done" That the Orchestrator Enforces

A PR is not "done" when it is merely opened. Require all of the following:

  1. Branch pushed and PR created
  2. Branch is mergeable / rebased onto current main
  3. CI passes (build, tests, lint, type checks)
  4. Security and dependency scans pass
  5. A human reviewer approves the PR before merge

If your repo has UI changes, add screenshot evidence as a PR requirement.

This avoids the common failure mode where orchestration increases output volume but lowers merge quality.
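
The conditions above could be collapsed into a single gate. The sketch below is illustrative only: the function name `is_done` is hypothetical, and it assumes the three inputs come from `gh pr view <pr-number> --json mergeable,reviewDecision` and from the exit status of `gh pr checks <pr-number>`. It also assumes security and dependency scans run as CI checks, so they are covered by the same status.

```shell
# is_done MERGEABLE CHECKS_EXIT REVIEW_DECISION
# Hypothetical helper: succeeds only when every "definition of done" condition holds.
#   MERGEABLE        e.g. MERGEABLE / CONFLICTING / UNKNOWN (gh pr view --json mergeable)
#   CHECKS_EXIT      exit status of `gh pr checks <pr-number>` (0 = all checks passed)
#   REVIEW_DECISION  e.g. APPROVED / REVIEW_REQUIRED (gh pr view --json reviewDecision)
is_done() {
  [ "$1" = "MERGEABLE" ] || { echo "blocked: branch not mergeable"; return 1; }
  [ "$2" -eq 0 ]         || { echo "blocked: CI or scans not green"; return 1; }
  [ "$3" = "APPROVED" ]  || { echo "blocked: awaiting human review"; return 1; }
  echo "done"
}
```

Because the gate takes plain values rather than calling GitHub itself, the monitor script can log exactly which condition blocked a task.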

Step 3: Launch Each Task in a Worktree + tmux Session

Use one worktree and one tmux session per task:

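# Example launcher flow
# Create an isolated worktree on a new branch cut from the latest origin/main
git fetch origin
git worktree add ../wt-doc-nav-fix -b feat/doc-nav-fix origin/main

# Start a detached tmux session rooted in the worktree; the session name doubles as the task ID
tmux new-session -d -s "codex-doc-nav-fix" \
  -c "/path/to/repo/../wt-doc-nav-fix" \
  "codex --model gpt-5.3-codex 'Implement task from prompt file and open PR when CI passes'"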

Why this pattern works:

  • worktrees isolate file changes and branch state
  • tmux keeps the session alive after disconnects
  • session names become stable task IDs for monitoring
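
One way to package the flow above into `launch-task.sh` is sketched here. The function name `launch_task` and its two arguments are assumptions for illustration; the branch, session, and worktree naming simply follows the example. It assumes the script runs from the repository root and that the task ID is a simple slug.

```shell
# launch_task TASK_ID AGENT_CMD
# Hypothetical wrapper around the flow above: one worktree, one branch, one tmux session.
# Assumes it is run from the repository root and that TASK_ID is a simple slug.
launch_task() {
  local id="$1" agent_cmd="$2"
  local branch="feat/$id" session="codex-$id"
  local worktree
  worktree="$(dirname "$(pwd)")/wt-$id"   # sibling of the repo, as an absolute path

  # Refuse to reuse a session name, since it doubles as the task ID.
  if tmux has-session -t "$session" 2>/dev/null; then
    echo "session $session already exists" >&2
    return 1
  fi

  git fetch origin || return 1
  git worktree add "$worktree" -b "$branch" origin/main || return 1
  tmux new-session -d -s "$session" -c "$worktree" "$agent_cmd" || return 1
  echo "$session"
}
```

On success the function prints the session name, which the caller can record in the task registry.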

Step 4: Track Task State in a Registry (Not in Human Memory)

Example .clawdbot/active-tasks.json entry:

[
  {
    "id": "doc-nav-fix",
    "agent": "codex",
    "tmuxSession": "codex-doc-nav-fix",
    "repo": "aeef-standards",
    "branch": "feat/doc-nav-fix",
    "worktreePath": "../wt-doc-nav-fix",
    "status": "running",
    "startedAt": 1740268800000,
    "notifyOnComplete": true,
    "retries": 0
  }
]

Store only operational metadata here. Do not store secrets, customer PII, or full prompts if they contain sensitive context.
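
A launcher could write such an entry automatically instead of relying on hand edits. The sketch below is one possible shape, not a prescribed one: `new_task_entry` and `register_task` are hypothetical names, the `repo` field is omitted for brevity, and `register_task` assumes `jq` is installed on the host.

```shell
# new_task_entry TASK_ID AGENT
# Hypothetical helper: prints one registry entry as JSON.
# Assumes TASK_ID and AGENT are simple slugs (no quotes or backslashes to escape).
new_task_entry() {
  local id="$1" agent="$2"
  # startedAt is epoch milliseconds, matching the example entry.
  printf '{"id":"%s","agent":"%s","tmuxSession":"%s-%s","branch":"feat/%s","worktreePath":"../wt-%s","status":"running","startedAt":%s000,"notifyOnComplete":true,"retries":0}\n' \
    "$id" "$agent" "$agent" "$id" "$id" "$id" "$(date +%s)"
}

# register_task TASK_ID AGENT [REGISTRY]
# Appends the entry to the registry array. Assumes jq is installed.
register_task() {
  local registry="${3:-.clawdbot/active-tasks.json}" tmp
  tmp="$(mktemp)" || return 1
  # Write to a temp file first so a failed jq run never truncates the registry.
  if jq --argjson t "$(new_task_entry "$1" "$2")" '. + [$t]' "$registry" > "$tmp"; then
    mv "$tmp" "$registry"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, `register_task doc-nav-fix codex` would append an entry whose session name is `codex-doc-nav-fix`.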

Step 5: Monitor Deterministically (cron or systemd timer)

Your monitor loop SHOULD inspect state using platform tools rather than asking the model for status.

Checks to run:

  • tmux has-session -t <session> to confirm the session still exists (a session started with the agent command ends when that command exits)
  • gh pr list --head <branch> to find the PR
  • gh pr checks <pr-number> to inspect CI state
  • retry limit handling for failed tasks
  • notification only when human intervention is required
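
The first three checks can be combined into one status function, sketched below. The name `check_task` and the four status words are assumptions made for illustration; the sketch also assumes an authenticated `gh` CLI running inside the repository. Retry handling and notification are left to the caller.

```shell
# check_task SESSION BRANCH
# Hypothetical monitor step: prints one status word based on tmux and GitHub state only.
# Assumes an authenticated gh CLI and that it runs inside the repository.
check_task() {
  local session="$1" branch="$2" pr
  # Look up the PR for this branch; empty output means none exists yet.
  pr="$(gh pr list --head "$branch" --json number --jq '.[].number' | head -n 1)"

  if [ -z "$pr" ]; then
    if tmux has-session -t "$session" 2>/dev/null; then
      echo "running"            # agent still working, nothing to report
    else
      echo "needs-attention"    # session ended without opening a PR
    fi
    return 0
  fi

  # gh pr checks exits non-zero while any check is failing or still pending.
  if gh pr checks "$pr" >/dev/null 2>&1; then
    echo "ready-for-review"
  else
    echo "ci-not-green"
  fi
}
```

Only `needs-attention` and `ready-for-review` call for a human, so a monitor built this way can stay silent for the other two states.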

Example cron schedule (every 10 minutes):

*/10 * * * * cd /path/to/repo && ./scripts/agent-runner/check-tasks.sh >> /var/log/agent-runner.log 2>&1

For production reliability on Linux, prefer a systemd timer over cron if available.

Step 6: Use tmux for Mid-Task Corrections (Without Restarting)

The most useful tmux feature in this workflow is redirecting a running agent with send-keys:

tmux send-keys -t codex-doc-nav-fix "Stop. Fix the broken sidebar links first. Do not change page copy yet." Enter
tmux send-keys -t codex-doc-nav-fix "Use sidebarsProduction.ts as the source of truth for ordering." Enter

Redirecting a running session lets the agent keep its context and partial work instead of restarting the task from scratch.
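
If corrections are sent often, a small wrapper can make them safer. The sketch below uses a hypothetical function name, `steer_task`; it checks that the session exists and sends the message with `-l` so that tmux types it literally.

```shell
# steer_task SESSION MESSAGE
# Hypothetical helper: types MESSAGE into the agent's tmux session and presses Enter.
steer_task() {
  local session="$1" message="$2"
  if ! tmux has-session -t "$session" 2>/dev/null; then
    echo "no such session: $session" >&2
    return 1
  fi
  # -l sends the text literally, so words like "Enter" or "Up" inside the
  # message are typed as text instead of being interpreted as key names.
  tmux send-keys -t "$session" -l "$message" || return 1
  tmux send-keys -t "$session" Enter
}
```

Usage would look like `steer_task codex-doc-nav-fix "Stop. Fix the broken sidebar links first."`.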

Step 7: Apply AEEF Guardrails to Agent Permissions

If you adopt OpenClaw (or any orchestrator), separate orchestration privileges from coding-agent privileges:

  • Orchestrator may read task metadata, CI status, and project notes
  • Coding agents should only have repo-scoped credentials needed for their task
  • No agent should have production admin APIs by default
  • No coding agent should have direct write access to production databases
  • Credentials should be injected as least-privilege tokens and rotated regularly

For higher assurance, run coding agents in containers or isolated runners rather than directly on the same host that stores sensitive operational credentials.
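
Short of containers, one inexpensive step is to start the agent with a scrubbed environment so it does not inherit whatever the operator's shell has exported. The sketch below is an illustration under assumptions: `run_agent_scoped` is a hypothetical name, and the token is assumed to be a repo-scoped credential issued for the task. `GH_TOKEN` is the variable the `gh` CLI reads.

```shell
# run_agent_scoped TOKEN COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND with a minimal environment so that secrets
# exported in the operator's shell are not inherited by the coding agent.
# TOKEN is assumed to be a repo-scoped token issued for this task.
run_agent_scoped() {
  local token="$1"
  shift
  # env -i starts from an empty environment; only the listed variables pass through.
  env -i HOME="$HOME" PATH="$PATH" TERM="${TERM:-dumb}" GH_TOKEN="$token" "$@"
}
```

This limits environment inheritance only. It does not restrict file or network access, and the token is briefly visible in the process arguments, so it complements isolation rather than replacing it.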

Step 8: Start with a Single-Agent Baseline

Do this before attempting "swarm" behavior:

  1. Implement one launcher script.
  2. Implement one monitor script.
  3. Prove a single task can go from prompt to PR to green CI.
  4. Measure RAM, swap, and completion time.
  5. Add a second concurrent agent only if the host remains stable.

This stepwise rollout aligns with AEEF's risk-based adoption approach and prevents infrastructure instability from being mistaken for agent failure.
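
For the measurement step, a snapshot taken before and after a task run is usually enough. The helper below is a sketch with a hypothetical name, `mem_snapshot`; it assumes a Linux host and reads `/proc/meminfo`, where values are reported in kB.

```shell
# mem_snapshot [MEMINFO_FILE]
# Hypothetical helper: prints available RAM and used swap in MiB.
# Reads Linux's /proc/meminfo by default; values there are reported in kB.
mem_snapshot() {
  awk '
    /^MemAvailable:/ { avail = $2 }
    /^SwapTotal:/    { swap_total = $2 }
    /^SwapFree:/     { swap_free = $2 }
    END {
      printf "mem_available_mib=%d swap_used_mib=%d\n",
             avail / 1024, (swap_total - swap_free) / 1024
    }
  ' "${1:-/proc/meminfo}"
}
```

Appending this output to the monitor log on each run gives a simple record of whether swap use grows while an agent is active.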

Capacity Guidance for Small Linux VMs (Conservative)

For a small documentation or web app repo on a 4 vCPU / 16 GB RAM Linux VM:

  • 1 agent (recommended default) -- stable for most teams
  • 2 agents (conditional) -- only if builds/tests are lightweight and swap use stays low
  • 3+ agents -- usually not worth it without more RAM and stronger isolation

Watch:

  • swap growth over time
  • disk space consumed by multiple worktrees and dependencies
  • CI duplication (running heavy local validation plus remote CI)
  • token/cost spikes from retries
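
A concurrency cap can be enforced in the launcher rather than by convention. The sketch below assumes agent sessions share a name prefix (`codex-` in the earlier examples); the function name `can_launch` and its defaults are hypothetical.

```shell
# can_launch [MAX_AGENTS] [PREFIX]
# Hypothetical gate: succeeds only while fewer than MAX_AGENTS agent sessions exist.
# Assumes agent sessions share a name prefix (codex- in the examples above).
can_launch() {
  local max="${1:-1}" prefix="${2:-codex-}" running
  # list-sessions fails when no tmux server is running; treat that as zero sessions.
  running="$(tmux list-sessions -F '#{session_name}' 2>/dev/null | grep -c "^$prefix" || true)"
  [ "${running:-0}" -lt "$max" ]
}
```

A launcher could call `can_launch 1 || exit 1` before creating a worktree, raising the limit to 2 only after the host has been measured under load.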

When to Stay Direct (Use Codex/Claude Without Orchestration)

Stay direct when:

  • you are debugging a critical production issue
  • the change touches auth, billing, or data deletion logic
  • you need active reasoning and fast back-and-forth with the model
  • your team has not yet implemented CI gates and PR review discipline

Orchestration multiplies both good and bad process quality. Put controls in place first.

Implementation Checklist

  • git worktree + tmux launcher script
  • task registry JSON
  • monitor loop with gh checks
  • retry policy with max attempts
  • CI-backed definition of done
  • human review before merge
  • least-privilege tokens
  • concurrency cap based on host RAM/swap