OmniClaw Container Debugging
This guide covers debugging the containerized agent execution system.
Architecture Overview
```
Host (macOS / Linux)                    Container (Linux VM / Docker)
─────────────────────────────────────────────────────────────────────
src/backends/local-backend.ts           container/agent-runner/
    │                                       │
    │ spawns container via backend          │ runs Claude Agent SDK
    │ with volume mounts                    │ with MCP servers
    │                                       │
    ├── data/env/env ──────────────> /workspace/env-dir/env
    ├── groups/{folder} ───────────> /workspace/group
    ├── data/ipc/{folder} ────────> /workspace/ipc
    ├── data/sessions/{folder}/.claude/ ──> /home/node/.claude/ (isolated per-group)
    └── (main only) project root ──> /workspace/project
```
Important: The container runs as user node with HOME=/home/node. Session files must be mounted to /home/node/.claude/ (not /root/.claude/) for session resumption to work.
Log Locations
| Log | Location | Content |
|---|---|---|
| Main app logs | logs/omniclaw.log | Host-side messaging, routing, container spawning |
| Main app errors | logs/omniclaw.error.log | Host-side errors |
| Container run logs | groups/{folder}/logs/container-*.log | Per-run: input, mounts, stderr, stdout |
| Claude sessions | ~/.claude/projects/ | Claude Code session history |
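Because every run writes its own file under groups/{folder}/logs/, the newest file is usually the one worth reading first. The helper below is a minimal sketch for finding it; `latest_container_log` is a hypothetical name, and it assumes the log layout from the table above and that it is run from the project root unless a root path is passed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified container run log
# for one group folder. Assumes the groups/{folder}/logs/container-*.log
# layout; the optional second argument is the project root (default: ".").
latest_container_log() {
  local group="$1" root="${2:-.}"
  # ls -t sorts by modification time, newest first. Errors are silenced so
  # a missing folder simply yields no output.
  ls -t "$root/groups/$group/logs"/container-*.log 2>/dev/null | head -n 1
}
```

For example, `tail -n 50 "$(latest_container_log test)"` would show the end of the latest run for a group folder named test.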
Enabling Debug Logging
Set LOG_LEVEL=debug for verbose output:
```bash
# For development
LOG_LEVEL=debug bun run dev

# For systemd service (Linux), add to unit override:
systemctl --user edit omniclaw
# Add under [Service]: Environment=LOG_LEVEL=debug

# For launchd service (macOS), add to plist EnvironmentVariables:
# <key>LOG_LEVEL</key>
# <string>debug</string>
```
Debug level shows:
- Full mount configurations
- Container command arguments
- Real-time container stderr
Common Issues
1. "Claude Code process exited with code 1"
Check the container log file in groups/{folder}/logs/container-*.log
Common causes:
Missing Authentication
Invalid API key · Please run /login
Fix: Ensure the .env file exists and contains either an OAuth token or an API key:
```bash
grep -E '^(CLAUDE_CODE_OAUTH_TOKEN|ANTHROPIC_API_KEY)=' .env
# Should show one of:
# CLAUDE_CODE_OAUTH_TOKEN=sk-ant-oat01-... (subscription)
# ANTHROPIC_API_KEY=sk-ant-api03-... (pay-per-use)
```
Root User Restriction
--dangerously-skip-permissions cannot be used with root/sudo privileges
Fix: Container must run as non-root user. Check Dockerfile has USER node.
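One way to check the Dockerfile without reading it by hand is to print the user set by its last USER instruction. This is a sketch under assumptions: `dockerfile_final_user` is a hypothetical helper, the Dockerfile path is whatever your build uses, and for a multi-stage build the result is only meaningful if the last USER line belongs to the final stage.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the user named by the last USER instruction in
# a Dockerfile. Prints nothing when there is no USER instruction, in which
# case the image runs as root by default.
dockerfile_final_user() {
  # Dockerfile instructions are case-insensitive, so compare in upper case.
  awk 'toupper($1) == "USER" { user = $2 } END { if (user != "") print user }' "$1"
}
```

If the output is anything other than node (or is empty), the root-user error above is the likely result.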
2. Environment Variables Not Passing
The system extracts only authentication variables (CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_API_KEY) from .env and mounts them for sourcing inside the container.
To verify env vars are reaching the container:
```bash
# Docker
docker run --rm \
  -v "$(pwd)/data/env:/workspace/env-dir:ro" \
  --entrypoint /bin/bash omniclaw-agent:latest \
  -c 'export $(cat /workspace/env-dir/env | xargs); echo "OAuth: ${#CLAUDE_CODE_OAUTH_TOKEN} chars, API: ${#ANTHROPIC_API_KEY} chars"'

# Apple Container
echo '{}' | container run -i \
  --mount "type=bind,source=$(pwd)/data/env,target=/workspace/env-dir,readonly" \
  --entrypoint /bin/bash omniclaw-agent:latest \
  -c 'export $(cat /workspace/env-dir/env | xargs); echo "OAuth: ${#CLAUDE_CODE_OAUTH_TOKEN} chars, API: ${#ANTHROPIC_API_KEY} chars"'
```
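The extraction step described at the top of this section (only the two authentication variables leave .env) can be approximated in shell when you want to rebuild data/env/env by hand. This is a sketch, not the code in local-backend.ts; `extract_auth_env` is a hypothetical name, and it assumes plain `NAME=value` lines without an `export` prefix.

```bash
#!/usr/bin/env bash
# Sketch only: copy just the two authentication variables from a .env file
# into the file that is mounted at /workspace/env-dir/env.
# extract_auth_env is a hypothetical name, not the function OmniClaw uses.
extract_auth_env() {
  local src="$1" dst="$2"
  mkdir -p "$(dirname "$dst")"
  # Keep only lines that assign one of the two auth variables; everything
  # else in .env stays on the host. grep exits 1 when nothing matches, so
  # "|| true" keeps an empty result from being treated as a failure.
  grep -E '^(CLAUDE_CODE_OAUTH_TOKEN|ANTHROPIC_API_KEY)=' "$src" > "$dst" || true
}
```

Running `extract_auth_env .env data/env/env` and then one of the verification commands above shows whether a non-empty token reaches the container.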
3. Mount Issues
To check what's mounted inside a container:
```bash
# Docker
docker run --rm --entrypoint /bin/bash omniclaw-agent:latest -c 'ls -la /workspace/'

# Apple Container (quirks: only mounts directories, -v doesn't support :ro)
container run --rm --entrypoint /bin/bash omniclaw-agent:latest -c 'ls -la /workspace/'
```
Expected structure:
/workspace/
├── env-dir/env # Environment file (CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_API_KEY)
├── group/ # Current group folder (cwd)
├── project/ # Project root (main channel only)
├── global/ # Global CLAUDE.md (non-main only)
├── ipc/ # Inter-process communication
│ ├── messages/ # Outgoing messages
│ ├── tasks/ # Scheduled task commands
│ ├── current_tasks.json # Read-only: scheduled tasks visible to this group
│ └── available_groups.json # Read-only: groups for activation (main only)
└── extra/ # Additional custom mounts
4. Permission Issues
The container runs as user node (uid 1000). All of /workspace/ and /app/ should be owned by node.
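To check ownership from a shell inside the container, it is enough to list entries that are not owned by uid 1000. The helper below is a minimal sketch; `not_owned_by` is a hypothetical name and it relies only on the standard `find -uid` test.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list up to five entries under a directory that are
# NOT owned by the given uid. No output means ownership looks right.
not_owned_by() {
  local uid="$1" dir="$2"
  find "$dir" ! -uid "$uid" -print 2>/dev/null | head -n 5
}
```

Inside the container, `not_owned_by 1000 /workspace` and `not_owned_by 1000 /app` should both print nothing.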
5. Session Not Resuming / "Claude Code process exited with code 1"
Root cause: The SDK looks for sessions at $HOME/.claude/projects/. Inside the container, HOME=/home/node, so it looks at /home/node/.claude/projects/.
Fix: Ensure local-backend.ts mounts to /home/node/.claude/:
```typescript
mounts.push({
  hostPath: claudeDir,
  containerPath: '/home/node/.claude', // NOT /root/.claude
  readonly: false,
});
```
6. Service Stops After SSH Disconnect (Linux)
Systemd user services are killed when the last login session ends unless lingering is enabled.
Diagnose:
```bash
loginctl show-user "$(whoami)" | grep Linger
# Linger=no means services die on logout
```
Fix:
```bash
loginctl enable-linger "$(whoami)"
```
This persists across reboots. The setup script (step 10) enables this automatically. If it fails with a permissions error, sudo or a polkit rule may be needed.
7. MCP Server Failures
If an MCP server fails to start, the agent may exit. Check the container logs for MCP initialization errors.
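A quick way to narrow a container run log down to MCP-related failures is a case-insensitive grep. The sketch below is an assumption-heavy starting point: `mcp_error_lines` is a hypothetical helper, and the keyword list is a guess because the exact error wording depends on the MCP server involved.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print numbered lines from a container run log that
# mention MCP together with a failure keyword. The keyword list is a guess;
# adjust it to the wording your MCP servers actually emit.
mcp_error_lines() {
  grep -inE 'mcp.*(error|fail|timeout|refused)|(error|fail|timeout|refused).*mcp' "$1" || true
}
```

If this prints nothing, search the same log for the server's name instead, since some servers report startup problems without mentioning MCP.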
8. OpenCode Session Looks Healthy But Replies Are Empty
Symptom pattern in logs/omniclaw.log:
```text
[opencode-runtime] Created new session: ...
[opencode-runtime] Injected system context
[opencode-runtime] Sending prompt ...
[opencode-runtime] extractResponseText: 0 parts, types:
[opencode-runtime] waitForAssistantText: 3 messages, last role: assistant, last parts: 0, types:
```
Important: if this happens on a brand new session, the root cause is not stale session resume alone.
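To see how often the symptom occurs, count the log lines where zero response parts were extracted. This is a minimal sketch; `count_empty_replies` is a hypothetical helper and the marker string is copied from the log excerpt above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count log lines where the OpenCode runtime extracted
# zero response parts.
count_empty_replies() {
  # grep -c still prints 0 when nothing matches but exits 1; "|| true"
  # hides the non-zero exit and keeps the printed count.
  grep -c 'extractResponseText: 0 parts' "$1" || true
}
```

For example, `count_empty_replies logs/omniclaw.log` gives a single number that can be compared before and after a recovery attempt.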
Known real-world case:
- Remote Linux OpenCode agent worked normally
- Local macOS OpenCode agent created fresh sessions but still returned empty assistant parts
- The local base OpenCode auth store contained mixed providers (anthropic + openai)
- The working remote auth store contained only openai
What to check:
```bash
# Session persistence
sqlite3 store/messages.db "SELECT group_folder, session_id, created_at FROM sessions WHERE group_folder LIKE 'ocpeyton-discord__dispatch__%';"

# Base OpenCode auth/data
python3 - <<'PY'
from pathlib import Path
import json
p = Path('data/opencode-data/ocpeyton-discord')
for name in ['auth.json','mcp-auth.json','opencode.db']:
    fp = p / name
    print(name, fp.exists())
    if fp.exists() and name.endswith('.json'):
        print(sorted(json.loads(fp.read_text()).keys()))
PY
```
Recovery order:
```bash
# 1. Stop omniclaw so files are not recreated mid-delete
launchctl bootout gui/$(id -u)/com.omniclaw # macOS
systemctl --user stop omniclaw # Linux

# 2. Clear persisted dispatch sessions for the broken agent
sqlite3 store/messages.db "DELETE FROM sessions WHERE group_folder LIKE 'ocpeyton-discord__dispatch__%';"

# 3. Clear all dispatch OpenCode runtime stores for that agent
find data/opencode-data -maxdepth 1 -type d -name 'ocpeyton-discord__dispatch__*' -print -exec rm -rf {} +

# 4. Clear the agent's base OpenCode store if auth/provider state looks wrong
rm -f data/opencode-data/ocpeyton-discord/auth.json \
  data/opencode-data/ocpeyton-discord/mcp-auth.json \
  data/opencode-data/ocpeyton-discord/opencode.db \
  data/opencode-data/ocpeyton-discord/opencode.db-shm \
  data/opencode-data/ocpeyton-discord/opencode.db-wal

# 5. Start omniclaw again
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.omniclaw.plist # macOS
systemctl --user start omniclaw # Linux
```
Notes:
- If the remote agent works and the local one does not, compare auth.json provider keys first.
- Do not assume session corruption if the failure reproduces on a fresh session.
- OpenCode reasoning should not be sent to users; only user-facing text parts should be forwarded.
Service Management
```bash
# Check status
launchctl list | grep omniclaw # macOS
systemctl --user status omniclaw # Linux

# Restart
launchctl kickstart -k gui/$(id -u)/com.omniclaw # macOS
systemctl --user restart omniclaw # Linux

# Stop
launchctl bootout gui/$(id -u)/com.omniclaw # macOS
systemctl --user stop omniclaw # Linux

# View live logs
tail -f logs/omniclaw.log

# Check running agent containers
docker ps --filter name=omniclaw # Docker
launchctl list | grep 'container-runtime-linux.omniclaw' # Apple Container

# Rebuild after code changes
bun run build
# Then restart service (see above)
```
Manual Container Testing
Test the full agent flow:
```bash
# Set up env file
mkdir -p data/env groups/test

# Run test query (Docker)
echo '{"prompt":"What is 2+2?","groupFolder":"test","chatJid":"test@g.us","isMain":false}' | \
  docker run -i --rm \
  -v "$(pwd)/data/env:/workspace/env-dir:ro" \
  -v "$(pwd)/groups/test:/workspace/group" \
  -v "$(pwd)/data/ipc:/workspace/ipc" \
  omniclaw-agent:latest
```
Interactive shell in container:
```bash
docker run --rm -it --entrypoint /bin/bash omniclaw-agent:latest # Docker
container run --rm -it --entrypoint /bin/bash omniclaw-agent:latest # Apple Container
```
SDK Options Reference
The agent-runner uses these Claude Agent SDK options:
```typescript
query({
  prompt: input.prompt,
  options: {
    cwd: '/workspace/group',
    allowedTools: ['Bash', 'Read', 'Write', ...],
    permissionMode: 'bypassPermissions',
    allowDangerouslySkipPermissions: true, // Required with bypassPermissions
    settingSources: ['project'],
    mcpServers: { ... }
  }
})
```
Important: allowDangerouslySkipPermissions: true is required when using permissionMode: 'bypassPermissions'. Without it, Claude Code exits with code 1.
Rebuilding After Changes
```bash
# Rebuild main app
bun run build

# Rebuild container
./container/build.sh

# Force clean rebuild (Docker)
docker builder prune -f && ./container/build.sh

# Force clean rebuild (Apple Container: flush builder cache)
container builder stop && container builder rm && container builder start
./container/build.sh
```
Session Persistence
Claude sessions are stored per-group in data/sessions/{group}/.claude/ for security isolation.
Critical: The mount path must match the container user's HOME directory:
- Container user: node
- Container HOME: /home/node
- Mount target: /home/node/.claude/ (NOT /root/.claude/)
To clear sessions:
```bash
# Clear all sessions for all groups
rm -rf data/sessions/

# Clear sessions for a specific group
rm -rf data/sessions/{groupFolder}/.claude/

# Also clear the session ID from OmniClaw's tracking (stored in SQLite)
sqlite3 store/messages.db "DELETE FROM sessions WHERE group_folder = '{groupFolder}'"
```
Discord Multi-Bot Routing
Internal bot key vs Discord snowflake ID
OmniClaw uses two completely different identifiers for Discord bots — they must not be confused:
| Identifier | Where it lives | Example | Used for |
|---|---|---|---|
| Internal bot key | DISCORD_BOT_IDS in .env | PRIMARY, OCPEYTON | OmniClaw routing |
| Discord snowflake ID | Discord Developer Portal → App → General | 1476396931709276191 | Discord's own API |
The internal bot key must match exactly across:
- DISCORD_BOT_IDS=PRIMARY,OCPEYTON
- DISCORD_BOT_<KEY>_TOKEN=<token> (e.g., DISCORD_BOT_OCPEYTON_TOKEN)
- channel_subscriptions.discord_bot_id in SQLite
Common mistake: Using the numeric Discord snowflake ID in channel_subscriptions.discord_bot_id instead of the human-readable key. This breaks routing silently.
Diagnosing wrong-bot-sending issues
```bash
# Check what keys are configured in env
grep DISCORD_BOT_IDS .env

# Check what's stored in the DB (should match keys above, NOT numeric IDs)
sqlite3 store/messages.db "SELECT DISTINCT discord_bot_id FROM channel_subscriptions WHERE discord_bot_id IS NOT NULL"

# Fix if DB contains numeric IDs instead of keys
sqlite3 store/messages.db "UPDATE channel_subscriptions SET discord_bot_id = 'OCPEYTON' WHERE discord_bot_id = '1476396931709276191'"
```
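The comparison between the configured keys and the values stored in the database can also be scripted. The sketch below uses hypothetical helper names and assumes DISCORD_BOT_IDS is a comma-separated list without spaces. It flags values that look like snowflake IDs (all digits) and values that are not in the configured list.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for auditing channel_subscriptions.discord_bot_id.

# True if the value is all digits, i.e. it looks like a Discord snowflake
# ID rather than an internal bot key.
looks_like_snowflake() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# True if key $1 appears in the comma-separated list $2.
key_is_configured() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read one discord_bot_id per line on stdin and report problem values.
# $1 is the DISCORD_BOT_IDS value, e.g. "PRIMARY,OCPEYTON".
audit_bot_ids() {
  local list="$1" id
  while IFS= read -r id; do
    if [ -z "$id" ]; then
      continue
    elif looks_like_snowflake "$id"; then
      echo "snowflake: $id"
    elif ! key_is_configured "$id" "$list"; then
      echo "unknown: $id"
    fi
  done
}
```

A possible invocation is to pipe the SELECT DISTINCT query above into `audit_bot_ids "$(grep '^DISCORD_BOT_IDS=' .env | cut -d= -f2)"`; no output means every stored value is a configured key.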
IPC Debugging
```bash
# Check pending messages
ls -la data/ipc/messages/

# Check pending task operations
ls -la data/ipc/tasks/

# Check available groups (main channel only)
cat data/ipc/main/available_groups.json

# Check current tasks snapshot
cat data/ipc/{groupFolder}/current_tasks.json
```
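When the question is only whether messages or task files are piling up, a count is quicker to read than a full listing. The helper below is a small sketch; `count_pending` is a hypothetical name, and it counts regular files directly inside whichever IPC directory you pass to it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count regular files directly inside a directory.
# Prints 0 when the directory is empty or does not exist.
count_pending() {
  # tr strips the padding some wc implementations add around the number.
  find "$1" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' '
}
```

A count that keeps growing while the service is running suggests the host side is not consuming the files.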
Quick Diagnostic Script
```bash
echo "=== OmniClaw Diagnostic ==="

echo -e "\n1. Service running?"
if command -v systemctl &>/dev/null; then
  systemctl --user is-active omniclaw 2>/dev/null && echo "OK (systemd)" || echo "NOT RUNNING"
  echo -n " Linger: "; loginctl show-user $(whoami) 2>/dev/null | grep Linger || echo "unknown"
elif command -v launchctl &>/dev/null; then
  launchctl list 2>/dev/null | grep -q com.omniclaw && echo "OK (launchd)" || echo "NOT RUNNING"
fi

echo -e "\n2. Authentication configured?"
[ -f .env ] && (grep -q "CLAUDE_CODE_OAUTH_TOKEN=sk-" .env || grep -q "ANTHROPIC_API_KEY=sk-" .env) && echo "OK" || echo "MISSING"

echo -e "\n3. Container runtime?"
if command -v docker &>/dev/null; then
  docker info &>/dev/null && echo "OK (Docker)" || echo "Docker installed but not running"
elif command -v container &>/dev/null; then
  container system status &>/dev/null && echo "OK (Apple Container)" || echo "Apple Container not running"
else
  echo "No container runtime found"
fi

echo -e "\n4. Container image?"
docker images omniclaw-agent:latest --format "OK ({{.Size}})" 2>/dev/null || \
  (echo '{}' | container run -i --entrypoint /bin/echo omniclaw-agent:latest "OK" 2>/dev/null) || \
  echo "MISSING: run ./container/build.sh"

echo -e "\n5. Recent errors?"
grep -E '"level":"error"' logs/omniclaw.log 2>/dev/null | tail -3 || echo "No error log"

echo -e "\n6. Groups loaded?"
grep 'groupCount' logs/omniclaw.log 2>/dev/null | tail -1 || echo "No log data"
```