# Mesh Builder

Build meshes (agent workflows) for TX V4.
## Core Philosophy: Earn Your Keep
Start with the minimum. Add only what the mesh actually needs.
Every config option added is complexity that can break. The default question for each field is: "Does this mesh fail without it?" If no — leave it out.
## Config Options and When They're Warranted

| Option | Warranted when... | Default |
|---|---|---|
| `dev_mode: true` | Testing a new mesh end-to-end before committing to full model costs | Omit |
| `fsm:` | Routing depends on computed state/counters/file presence — NOT agent judgment | Omit |
| `parallelism:` | Agents truly run in parallel and need a sync gate | Omit |
| `routing_mode: dispatcher` | Fan-out to N parallel workers is the core mechanic | Omit |
| `type: persistent` / `auto_despawn: false` | Mesh must survive indefinitely (daemon pattern) | Omit |
| `continuation: false` | You explicitly need cold starts for isolation | Omit (continuation is default-on) |
| `lifecycle:` hooks | Quality gates or auto-commits are genuinely required | Omit |
| `workspace:` | Agents need structured file workspace management | Omit |
| `checkpoint:` / `fork_from:` | Multiple agents need shared prior context | Omit |
| `load:` | Files must be in context before any work starts | Omit |
| `ensemble:` | Same task, multiple perspectives, aggregated output | Omit |
| `injectOriginalMessage:` | Downstream agents truly need the original task | Omit |
| `rearmatter:` | FSM routing depends on self-assessment scores | Omit |
| `guardrails:` | Custom limits differ from system defaults | Omit |

The minimal working mesh:

```yaml
mesh: my-mesh
description: "What it does"
agents:
  - name: worker
    model: sonnet
    prompt: worker.md
entry_point: worker
```
This is complete. Everything else is optional and should be justified.
## Common Over-Engineering Patterns to Avoid

- Orchestrator doing implementation work — add `orchestrator: true` to enforce routing-only (Read + Write msgs-only). System-level enforcement beats prompt instructions.
- FSM + orchestrator that handles all routing — pick one. If the orchestrator routes, drop the FSM.
- `type: persistent` on a mesh that runs once — persistent is for daemons only.
- `lifecycle:` hooks "just in case" — add them when quality gates are actually required.
- `parallelism:` on 2 agents — just route them sequentially; the parallelism overhead isn't worth it.
- `checkpoint:`/`fork_from:` when agents don't share context — preloading context agents don't need wastes tokens.
- Ensemble with 1 agent — that's just a regular agent.
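To make the trimming concrete, here is a sketch (mesh name, agent names, and prompts are illustrative) of a one-shot, two-agent mesh before and after removing options that don't earn their keep:

```yaml
# ❌ Over-engineered: a one-shot, two-agent mesh
mesh: review-mesh
type: persistent             # runs once, so persistent is unwarranted (daemons only)
lifecycle:
  post:
    - quality:checklist      # added "just in case", no real quality gate required
injectOriginalMessage: true  # downstream agent doesn't actually need the original task
agents:
  - name: writer
    model: sonnet
    prompt: writer.md
  - name: reviewer
    model: sonnet
    prompt: reviewer.md
entry_point: writer

# ✅ Minimal: the same mesh after trimming unearned options
mesh: review-mesh
agents:
  - name: writer
    model: sonnet
    prompt: writer.md
  - name: reviewer
    model: sonnet
    prompt: reviewer.md
entry_point: writer
```

Each removed option fails the default question: the mesh does not fail without it.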
## Dev Mode — Cheap Workflow Testing

```yaml
dev_mode: true  # Forces ALL agents to haiku regardless of config
```
Enable when: You've built a new mesh and want to test the routing, workflow, and agent coordination end-to-end before paying for sonnet/opus runs. Haiku is fast and cheap — use it to validate the plumbing works before the real thing.
Enable for:
- First run of any new mesh (always test with dev_mode first)
- Debugging routing issues (agent A → B → C flow)
- Validating FSM state transitions
- Testing ask-human and HITL flows
- Any mesh with 4+ agents where a full run would be expensive
Disable when:
- The workflow is confirmed working and you need quality output
- Haiku's reasoning is insufficient for the task (complex synthesis, conflict resolution, architectural decisions)
- Running in production
Never commit `dev_mode: true` — it's a testing flag. Remove it before the mesh is considered production-ready. If you see it in a mesh config, that mesh hasn't been signed off yet.
```yaml
# ✅ Testing a new mesh
dev_mode: true
agents:
  - name: synthesizer
    model: opus  # ignored — all agents become haiku in dev_mode

# ✅ Production — remove dev_mode entirely
agents:
  - name: synthesizer
    model: opus  # now respected
```
## Quick Start

```bash
# Test prompt output before deploying
tx prompt <mesh> <agent>             # View built prompt with injected protocol
tx prompt narrative-engine narrator  # Example
tx prompt dev --raw                  # Raw output, no metadata
```
## Documentation
| Topic | Location |
|---|---|
| Config fields | docs/mesh-config.md |
| FSM (state tracking) | .ai/docs/mesh-fsm-config.md |
| Available meshes | docs/meshes.md |
| Message format | docs/message-format.md |
## Minimal Config

```yaml
mesh: example
description: "What this mesh does"

agents:
  - name: worker
    model: sonnet   # opus | sonnet | haiku
    prompt: prompt.md

entry_point: worker
```
## Command Agents
Agents can invoke slash commands instead of (or in addition to) prompt files. The command is prepended to the user prompt when processing messages.
```yaml
agents:
  - name: builder
    model: opus
    command: "/know:build"
    prompt: builder/prompt.md  # optional extra context

  - name: reviewer
    model: sonnet
    command: "/know:review"
    # no prompt needed — command expands to full workflow
```
Precedence:
- Message frontmatter `command:` (highest)
- Agent config `command:` (default)
- No command (just prompt as system prompt)

Requires `settingSources: ['project']` (already enabled by default).
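A sketch of the highest-precedence case (mesh and agent names are hypothetical; frontmatter fields follow the message examples used elsewhere in this document), where a message-level `command:` overrides the agent's configured one for that message only:

```markdown
---
to: my-mesh/builder
from: my-mesh/planner
command: "/know:hotfix"   # overrides the builder agent's configured command
msg-id: task-{timestamp}
headline: Hotfix request
timestamp: {iso-timestamp}
---

Patch the failing build.
```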
### Command Template Interpolation

Commands support `{key}` template tokens that resolve from the message payload at runtime. Use this to pass dynamic values (like feature names) through the mesh pipeline.

```yaml
agents:
  - name: prebuild
    model: haiku
    command: "/know:prebuild {feature}"  # {feature} replaced from payload

  - name: builder
    model: opus
    command: "/know:build {feature}"     # same token, resolved per-message
```
Resolution rules:
- `{key}` matches `msg.payload[key]` — if present, replaced with the string value
- Unresolved tokens stay as literal text (no silent failures, no crashes)
- Payload values come from message frontmatter (e.g., `feature: auth-system`)
Propagation: Upstream agents must include the key in their completion message frontmatter for downstream agents to receive it. The consumer maps frontmatter fields to payload automatically.
User message: `feature: auth` → prebuild gets `/know:prebuild auth`
Prebuild msg: `feature: auth` → builder gets `/know:build auth`
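The propagation step can be sketched as prebuild's completion message (mesh name hypothetical) repeating the key in its frontmatter so the consumer maps it into builder's payload:

```markdown
---
to: my-mesh/builder
from: my-mesh/prebuild
outcome: complete
feature: auth             # propagated so builder's {feature} token resolves
msg-id: report-{timestamp}
headline: Prebuild complete
timestamp: {iso-timestamp}
---
```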
## Writing Prompts
Focus on workflow only.
System Auto-Injects (DO NOT WRITE IN PROMPTS):
- ❌ Message protocol (frontmatter schema, message types, paths format)
- ❌ Routing instructions (how to write messages to other agents)
- ❌ Rearmatter format (success_signal, grade, confidence fields)
- ❌ Workspace structure and paths (auto-injected from config.yaml)
- ❌ Message file naming conventions
- ❌ Tool availability and usage instructions (system provides)
### Dispatcher-Mode Prompt Examples (CRITICAL)

In `routing_mode: dispatcher` meshes, prompt examples must use the sentinel address (`mesh/dispatch`), never direct agent addresses. The system auto-injects routing instructions, but if your prompt includes message examples, they must match the dispatcher protocol or agents will bypass the sentinel and trigger routing errors.

Fan-out discuss examples — always use the sentinel + `outcome: discuss` + `route_to:`:

```markdown
---
to: my-mesh/dispatch
from: my-mesh/reader-a
outcome: discuss
route_to: reader-b
msg-id: discuss-{timestamp}
headline: Question for reader-b
timestamp: {iso-timestamp}
---
```

Never write `to: my-mesh/reader-b` directly — this bypasses the dispatcher and the message gets dropped with a routing-error nudge.
Completion examples — same pattern:
```markdown
---
to: my-mesh/dispatch
from: my-mesh/reader-a
outcome: complete
msg-id: report-{timestamp}
headline: Domain report
timestamp: {iso-timestamp}
---
```
Write ONLY:
- ✅ Agent role and mandate
- ✅ Workflow steps (what to do, when)
- ✅ Decision trees and logic
- ✅ Domain-specific guidance
- ✅ Quality gates and success criteria
```markdown
# {Agent Name}

You are the {role} agent.

## Workflow
1. Read incoming task
2. {Work steps}
3. Signal completion when finished
```
## Prompt Template Tokens
Prompts can embed {key} template tokens that are replaced with resolved values at runtime, before any section injection. This lets agents reference dynamic paths inline rather than relying on injected context sections.
Built-in tokens (always available when workspace is resolved):
- `{workspace}` → absolute path to the resolved workspace directory
Example usage in prompt:
```markdown
## Phase 0: Inventory
ls {workspace}/prose-draft.md
cat {workspace}/context.yaml
```

At runtime, if the workspace resolves to `/project/.ai/games/my-game/campaigns/campaign-1/turns/turn-35`, the prompt becomes:

```markdown
## Phase 0: Inventory
ls /project/.ai/games/my-game/campaigns/campaign-1/turns/turn-35/prose-draft.md
cat /project/.ai/games/my-game/campaigns/campaign-1/turns/turn-35/context.yaml
```
Rules:
- Tokens that don't appear in the prompt are no-ops (safe for all meshes)
- Unresolved tokens (no matching key) are left as-is (no silent failures)
- Replacement happens via `PromptInjector.replaceTemplateTokens()` before workspace section injection
- The `injectWorkspace()` method automatically replaces `{workspace}` — no caller changes needed
Dynamic workspace resolution via `workspace.variables` + `workspace.locations`:

When the workspace config declares variables and locations, the dispatcher resolves template variables from a source file (e.g., session.yaml) and uses the resolved workspace location as the workspace directory. This enables per-turn or per-session dynamic paths.

```yaml
workspace:
  path: ".ai/games/"                       # Static fallback
  variables:
    source: ".ai/tx/my-mesh/session.yaml"  # Fixed path (no chicken-and-egg)
    mapping:
      game-id: game_id          # {game-id} → session.game_id
      campaign-id: campaign_id  # {campaign-id} → session.campaign_id
      N: turn                   # {N} → session.turn
  locations:
    session: ".ai/tx/my-mesh"
    game: ".ai/games/{game-id}"
    campaign: ".ai/games/{game-id}/campaigns/{campaign-id}"
    workspace: ".ai/games/{game-id}/campaigns/{campaign-id}/turns/turn-{N}"
```
Resolution priority (dispatcher):
- FSM context `$workspace` variable (highest — gates use this)
- Resolved `workspace` location from manifest variables (per-turn path)
- Static workspace config (`workspace.path`)
- Default: `.ai/tx/workspaces/<mesh-name>`
Falls back gracefully: if the source file is missing or variables don't resolve, unresolved `{tokens}` remain and the static fallback is used instead.
## Agent Boundaries (CRITICAL for Coordinators)

Haiku agents are eager helpers. Without explicit boundaries, they'll do work meant for other agents. Use `<boundaries>` blocks to constrain behavior.
Problem: A haiku coordinator sees domain context (file formats, workflow goals) and decides to "help" by doing the creative work itself instead of routing.
Solution: Explicit DO NOT / ONLY lists that name WHO does each task.
```markdown
<role>
Route tasks. Validate state. Forward to specialists.
You are a ROUTER. You do NOT create content.
</role>

<boundaries>
DO NOT:
- Write output files (worker does that)
- Analyze input data (analyst does that)
- Make domain decisions (specialist does that)
- Read file contents beyond checking existence

ONLY:
- Read session state for routing decisions
- Check file EXISTENCE (ls), never CONTENTS (cat)
- Write routing messages to other agents
- Write ask-human when blocked
</boundaries>
```
Key principles:
- State WHO does the forbidden work: "(worker does that)"
- Separate existence checks from content reads
- Add "If you find yourself doing X, STOP" guardrails
- Keep domain knowledge minimal — coordinators route; they don't need to understand the domain
## Phase Coordinators Pattern
For complex pipelines, use one haiku coordinator per phase instead of one monolithic coordinator.
Problem: A single coordinator managing many phases accumulates too much context and state. It becomes complex, error-prone, and harder to debug.
Solution: Split into discrete phase coordinators, each with single responsibility.
Before (monolithic):
```yaml
agents:
  - name: coordinator
    model: haiku
    prompt: coordinator/prompt.md  # 400 lines, manages 6 phases
```
After (phase-based):
```yaml
agents:
  - name: entry
    model: haiku
    prompt: coordinator/entry.md       # Routes based on state

  - name: init-coord
    model: haiku
    prompt: coordinator/init-coord.md  # Sets up workspace, routes to prep

  - name: prep-coord
    model: haiku
    prompt: coordinator/prep-coord.md  # Fan-out/fan-in for prep agents

  - name: work-coord
    model: haiku
    prompt: coordinator/work-coord.md  # Dispatches workers, routes to validate
```
Benefits:
- Each coordinator has ~50-80 lines (vs 400+)
- Single responsibility per agent
- Easier to debug (which phase failed?)
- State validation at phase boundaries
- Boundaries are clearer per-phase
Pattern:

```
entry → phase-1-coord → phase-2-coord → ... → completion-coord
             ↓                ↓
        specialists      specialists
```
Each phase coordinator:
- Receives task from previous coordinator
- Does its ONE job (setup, dispatch, validate, etc.)
- Updates shared session state
- Routes to next coordinator
Shared state: Use a `session.yaml` that all coordinators read/write. Each coordinator preserves ALL fields when updating.
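A sketch of such a session file (the field names are illustrative, not a fixed schema); each coordinator reads the whole file, updates only its own fields, and writes everything back:

```yaml
# session.yaml shared by all phase coordinators (illustrative fields)
phase: prep              # current phase, advanced by the coordinator that owns it
turn: 3
init:
  status: done           # preserved verbatim by later coordinators
prep:
  status: in_progress    # owned by prep-coord
```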
## Multi-Agent Routing

```yaml
routing:
  agent-a:
    complete:
      agent-b: "Handoff reason"
    blocked:
      core: "Need intervention"
```
See docs/mesh-config.md for full routing reference.
## Dispatcher Routing (Opt-in)
Centralized routing where agents write to a sentinel address and the dispatcher resolves targets from config.
```yaml
routing_mode: dispatcher
routing:
  agent-a: agent-b  # linear — always routes to agent-b
  agent-b:          # branch — outcome determines target
    approved: agent-c
    needs_work: agent-a
    default: agent-c
  # agent-c: (absent) = terminal agent → routes to core/core on complete
```
Fan-out / Fan-in: Array value with trailing options object for parallel dispatch:
```yaml
routing_mode: dispatcher
routing:
  planner: [reviewer-a, reviewer-b, reviewer-c, { discuss: true, complete: synthesizer, fan_in: batch }]
```
- `complete: agent` — join agent, gated until all fan-out members send `outcome: complete`
- `discuss: true` — members can peer-message via `outcome: discuss` + `route_to: peer`
- `fan_in: batch|queue|drain` — controls how messages are delivered to the join agent (default: `batch`)
- `transform: summarize` — optional haiku pre-pass to compress responses before delivery
- Fan-out members get implicit routing (no individual entries needed)
- Members send `outcome: complete` to signal done, `outcome: discuss` + `route_to:` for peer chat
Fan-in delivery modes (`fan_in`):

| Mode | Behavior |
|---|---|
| `batch` (default) | Gate until all complete, deliver all responses in one combined message |
| `queue` | Current OAOM serial delivery (N cold worker starts) |
| `drain` | Deliver immediately; inject into running join worker via session resume |
Transform (`transform`):

| Value | Behavior |
|---|---|
| `summarize` | Haiku pre-pass compresses response(s) before delivery to join agent |
| fan_in | transform | Result |
|---|---|---|
| batch | — | Gate until all complete, deliver all in one worker |
| batch | summarize | Gate, haiku-compress all responses into one, deliver |
| queue | — | Serial OAOM (N cold starts) |
| queue | summarize | Each message haiku-compressed before its worker run |
| drain | — | Inject each response into running join worker |
| drain | summarize | Each response haiku-compressed then injected |
Agents receive prompt instructions to write `to: mesh/dispatch` with `outcome:` in frontmatter. Override with `route_to:` for explicit targeting. The reserved `outcome: escalate` routes to a human.

Fan-out members with `discuss: true` receive a peer list in their prompt. They use `outcome: discuss` + `route_to: peer-name` for peer-to-peer messaging within the group.
Type detection: string value = linear, object value = branch, array value = fan-out, absent = terminal.
## Common Patterns
Session reuse (default behavior): `continuation: true` is the default — sessions persist naturally. Set `continuation: false` to force cold starts (needed for checkpoint/fork_from isolation).
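A minimal sketch of forcing cold starts (agent names illustrative; this assumes `continuation:` sits at the mesh level, as the description above suggests):

```yaml
continuation: false     # force cold starts for checkpoint/fork_from isolation
agents:
  - name: setup
    model: haiku
    prompt: setup.md
    checkpoint: true    # forks start from this clean checkpoint, not a reused session
  - name: worker
    model: sonnet
    prompt: worker.md
    fork_from: setup
entry_point: setup
```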
Persistent mesh (no shutdown on complete): For meshes that loop perpetually and report status without dying:
```yaml
completion_agents:
  - weaver
stop_on_first_complete: false  # Completion signal is informational, mesh continues
check_queue_on_complete: true  # (default) Queue-aware for future use
```
| stop_on_first_complete | check_queue_on_complete | Behavior |
|---|---|---|
| true (default) | true (default) | Stop on complete, wait for queue to drain first |
| true | false | Stop immediately on complete (legacy behavior) |
| false | true | Informational complete, mesh continues running |
| false | false | True daemon mode, mesh never stops on complete |
MCP tools only: `toolRestriction: mcp-only`

Quality hooks: Use explicit `lifecycle:` hooks for quality evaluation:

```yaml
lifecycle:
  pre:
    - quality:preflight
  post:
    - quality:checklist
    - quality:rubric
```
FSM state tracking: `fsm:` block for system-managed state variables and logic. Only use when routing depends on computed state, counters, or file presence — not agent judgment. If an orchestrator handles all routing anyway, FSM is redundant. See the FSM decision guide below.

Parallel execution: `parallelism:` block for fork/join semantics (see Parallel Execution section below), or `ensemble: { type: parallel }` for FSM states.
CRITICAL - FSM Entry Routing: Entry agents in FSM ensemble meshes MUST fan out to ALL ensemble workers. FSM observes these messages to track state, but explicit routing triggers the workers.
```yaml
routing:
  entry:
    complete:
      worker-1: "Spawn worker 1"  # ✅ CORRECT — fan out to all workers
      worker-2: "Spawn worker 2"
      worker-3: "Spawn worker 3"
      # core: "..."               # ❌ WRONG — workers never spawn!
```
Parallel Mesh Instances: Spawn isolated, named instances of the same mesh for concurrent execution:
```markdown
---
to: dev/worker
from: core/core
parallel: true
mesh-id: auth-system
---

Implement user authentication.
```
- `parallel: true` — spawn a new instance or route to an existing one
- `mesh-id: <name>` — unique identifier for this instance
- Each instance is isolated with its own state and session
- Use `mesh-id` in follow-up messages to route to the same instance
- Instance marked complete when the completion agent sends `status: complete`
- View instances: `tx status` shows running and completed instances
Isolation guarantees:
- Session isolation: Each instance gets a unique session key (`meshName:meshId`) that persists across follow-up messages
- State isolation: Independent worker tracking, metrics, and FSM state per instance
- Workspace isolation: Agents see the same filesystem but track state separately
- Cross-instance communication: Not supported — instances cannot message each other directly
Cleanup and lifecycle:
- Instances persist in SQLite (`parallel_instances` table) with status `running` or `completed`
- No automatic garbage collection — completed instances remain queryable via `tx status`
- No instance limit enforced (guardrails planned but not yet implemented)
- Routing to completed instances returns an error to the sender
When to use:
- Multiple features being built in parallel by the same mesh
- Concurrent tasks that shouldn't share state
- Same workflow applied to different inputs (e.g., the `dev` mesh building feature-a and feature-b simultaneously)
Example workflow:
```bash
# Core sends task to dev with unique mesh-id
echo "---
to: dev/worker
from: core/core
parallel: true
mesh-id: feature-auth
---
Build authentication feature." > .ai/tx/msgs/$(date +%s)-core-core--dev-worker-$(date +%s%N | tail -c 6).md

# Later, send follow-up to the same instance
echo "---
to: dev/worker
from: core/core
mesh-id: feature-auth
---
Update authentication to use JWT." > .ai/tx/msgs/$(date +%s)-core-core--dev-worker-$(date +%s%N | tail -c 6).md
```
Original task injection: `injectOriginalMessage: true` — injects the original task into downstream agents

Design documentation: `playbook_notes:` — embeds architectural rationale in config (replaces separate READMEs)

Self-assessment metadata: `rearmatter:` — agent outputs self-assessment fields (grade, confidence, status) for FSM routing decisions

Lifecycle hooks: Auto-commit, brain insights, quality gates

```yaml
lifecycle:
  post:
    - commit:auto   # Auto-commit changes
    - brain-update  # Document insights
```

Available hooks: `worktree:create`, `commit:auto`, `brain-update`, `quality:*`. See `docs/mesh-config.md`.
## File Preload
Dump files into agent context before execution. Useful for preloading context without manual reads.
```yaml
agents:
  - name: preloader
    model: haiku        # Model defaults to haiku when load is set
    prompt: prompt.md
    load:
      - "package.json"  # Exact file
      - "*.md"          # Glob pattern
      - "src/**/*.ts"   # Recursive glob
```
Behavior:
- Files matched by glob patterns are read and injected into system prompt
- Files over 200KB are skipped with warning
- `node_modules/` and `.git/` are auto-excluded
- Model defaults to `haiku` when `load` is set (cheap preloaders)
Use cases:
- Virtual "setup" agents that preload project context
- Checkpoint entry points that establish shared context
- Cheap haiku agents that read files before expensive opus agents work
## Session Forking
Share conversation context between agents via checkpoints.
```yaml
agents:
  - name: setup
    model: haiku
    prompt: setup.md
    load: ["package.json"]
    checkpoint: true  # Save session for forking

  - name: worker-a
    model: sonnet
    prompt: worker.md
    fork_from: setup  # Fork from setup's checkpoint

  - name: worker-b
    model: opus
    prompt: worker.md
    fork_from: setup  # Same checkpoint, different agent
```
Behavior:
- `checkpoint: true` saves the agent's sessionId on completion
- `fork_from: agent-name` loads that checkpoint as the starting session
- Forked agents continue from the checkpoint's conversation history
- Works across models (haiku checkpoint → opus fork)
Use cases:
- Skip redundant prework (preload once, fork many)
- Share established context across parallel workers
- Model escalation with preserved context
## Parallel Execution
Fork from entry, run agents concurrently, join at exit.
```yaml
agents:
  - name: preload
    model: haiku
    prompt: preload.md
    load: ["package.json"]
    # checkpoint: true auto-added

  - name: analyst
    model: sonnet
    prompt: analyst.md
    # fork_from: preload auto-added

  - name: reviewer
    model: sonnet
    prompt: reviewer.md

  - name: critic
    model: sonnet
    prompt: critic.md

  - name: synthesizer
    model: sonnet
    prompt: synthesizer.md

parallelism:
  - agents: [analyst, reviewer, critic]
    entry: preload        # Fork point (gets checkpoint: true)
    exit: synthesizer     # Sync gate (waits for all)
    timeout: 300000       # Optional: 5 min timeout
    on_partial: continue  # continue | abort on partial failure
```
Flow:

```
preload (entry)
   │ checkpoint
   ├─────┼─────┐
   ▼     ▼     ▼
analyst reviewer critic   (parallel, forked from preload)
   │     │     │
   └─────┼─────┘
         ▼
   synthesizer (exit, gated until all complete)
```
Auto-wiring:
- Entry agent gets `checkpoint: true` automatically
- Parallel agents get `fork_from: entry` automatically
- Exit agent is gated until ALL parallel agents complete
Routing: Parallel agents must route to exit agent:
```yaml
routing:
  preload:
    complete:
      analyst: "Ready for analysis"
  analyst:
    complete:
      synthesizer: "Analysis done"
  reviewer:
    complete:
      synthesizer: "Review done"
  critic:
    complete:
      synthesizer: "Critique done"
  synthesizer:
    complete:
      core: "Synthesis complete"
```
vs FSM Ensemble:
| Feature | parallelism: | FSM ensemble: |
|---|---|---|
| Fork context | Yes (checkpoint) | No |
| Result aggregation | No (just sync) | Yes (concat/vote/etc) |
| Gating | Exit gated | FSM state transition |
| Use case | Parallel work, shared context | Same task, multiple perspectives |
## FSM (State Tracking)
Add fsm: block to track state and provide context to agents.
IMPORTANT: If you use FSM, you must also define routing: configuration. Routes can exist without FSM, but FSM cannot exist without routes.
### When to Use FSM — Decision Guide
Default assumption: do NOT use FSM. Pure message routing handles the majority of meshes cleanly. Only add FSM when the routing itself cannot be handled by agent judgment.
Use FSM when ALL of these are true:
- Routing decisions depend on computed state, counters, or file presence — not agent judgment
- State must survive between completely separate TX sessions (cold restart)
- You need arithmetic operations to drive routing (e.g., `turn: turn + 1`, loop N times then exit)
- Or you need deterministic gate conditions independent of agent output (e.g., "only proceed if file X exists")
Real examples that warrant FSM:
- Narrative engine tracking turn number across days-long campaigns
- Loop mesh that runs exactly N iterations before exiting (counter drives routing)
- Ensemble with a gate that only fires when ALL N output files are present on disk
- Multi-session workflow that must resume in the exact right state after a cold restart
Do NOT use FSM when:
- An orchestrator agent already handles all routing decisions — if every state routes back to orchestrator anyway, the orchestrator IS the state machine. Drop the FSM.
- The workflow is linear or near-linear: `A → B → C → D`
- Loops can be tracked by the agent itself (a validator counting its own attempts in the message body)
- Fan-out/fan-in is the need — use dispatcher routing with fan-out array syntax instead
- HITL is the need — use `ask-human` messages directly, no FSM state required
- "What phase are we in?" can be answered by reading the last message — that's not a state machine problem
The key test: If you'd trust an orchestrator agent to route correctly based on incoming messages, you don't need FSM. FSM is for when routing must be mechanical and cannot rely on agent judgment.
Red flags that you're over-engineering with FSM:
- Every FSM state has only one non-error exit that always fires (just use routing)
- All states route back to an orchestrator agent anyway (orchestrator is doing the state management, FSM is redundant)
- The FSM context tracks values the orchestrator could just pass in message bodies
- You added FSM because the workflow "felt complex" — complexity alone is not a reason
Sequential workflow:
```yaml
fsm:
  initial: init

  context:
    turn: 0
    workspace: null

  states:
    init:
      agents: [coordinator]
      entry:
        set:
          turn: "$((turn + 1))"
          workspace: "/path/to/turn-$turn"
      exit:
        default: awaiting_work

    awaiting_work:
      agents: [worker]
      exit:
        when:
          - condition: signal == "PASS"
            target: complete
        default: awaiting_work

  scripts: {}
```
Parallel workflow (ensemble):
```yaml
routing:
  # Ensemble agents need explicit routing
  rev-1:
    complete:
      synthesizer: "Review 1 complete"
  rev-2:
    complete:
      synthesizer: "Review 2 complete"
  rev-3:
    complete:
      synthesizer: "Review 3 complete"

fsm:
  initial: parallel_review

  states:
    parallel_review:
      ensemble:
        type: parallel  # Required: type inside ensemble block
        agents: [rev-1, rev-2, rev-3]
        aggregation: concat
      exit:
        set:
          results: "$ENSEMBLE_OUTPUT"
        default: synthesize

  scripts: {}
```
Agents receive injected context:
```markdown
## FSM Context
state: awaiting_work
turn: 5
workspace: /path/to/turn-5
```
See docs/mesh-fsm-config.md for:
- Exit-based routing (when/run/default)
- Ensemble states (parallel execution)
- Self-loops and iteration tracking
- Gates and validation
## Documentation

`playbook_notes` in `config.yaml` (for maintainers)
- Design rationale and architectural decisions
- WHY the mesh is built this way
- Alignment with methodologies/patterns
- Not injected into prompts
Example:
```yaml
playbook_notes: |
  This mesh implements the Ralph pattern from ClaytonFarr/ralph-playbook.
  Uses layered quality refinement: haiku drafts, sonnet reviews, opus finalizes.
```
## Task Distribution Pattern
Alternative to ensemble for splitting work across agents:
```yaml
task_distribution:
  spawner: coordinator          # Agent that splits the task
  subagents: [worker-1, worker-2, worker-3]
  reviewer: synthesizer         # Agent that combines results
  distribution_strategy: equal  # equal | weighted | adaptive | custom
  subtask_count: 5              # Optional fixed count
  timeout_ms: 300000            # 5 minute timeout
  allow_partial_failure: true
```
When to use task_distribution vs ensemble:
| Pattern | Task Distribution | Ensemble |
|---|---|---|
| Task | Split into parts | Same task |
| Agents | Different subtasks | Same analysis |
| Output | Combined portions | Aggregated views |
## Aggregation Strategies

For the ensemble `aggregation` field:
| Strategy | Description | Use Case |
|---|---|---|
| `concat` | Join all outputs | Comprehensive review |
| `deduplicate` | Remove duplicate findings | Code analysis |
| `voting` | Majority opinion wins | Consensus decisions |
| `consensus` | Require agreement | High-stakes choices |
| `custom` | Use custom prompt | Domain-specific |
## Deprecated Patterns
AVOID these patterns:
| Pattern | Replacement | Reason |
|---|---|---|
| `state.type: ensemble` | `state.ensemble: { type: parallel }` | Old FSM syntax |
| `state.subtask: true` | Explicit ensemble routing | Implicit behavior |
| `workspace: "string"` | `workspace: { path: "..." }` | Object format preferred |
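The workspace entry migrates by wrapping the string in an object with a `path` key (the path here is illustrative):

```yaml
# Deprecated string form
workspace: ".ai/tx/workspaces/my-mesh"

# Preferred object form
workspace:
  path: ".ai/tx/workspaces/my-mesh"
```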
## Agent Config Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Agent identifier |
| `model` | string | yes* | opus / sonnet / haiku (*defaults to haiku if `load` set, else sonnet) |
| `prompt` | string | one of prompt/command | Path to prompt file |
| `command` | string | one of prompt/command | Slash command (e.g., `/know:build`). Supports `{key}` interpolation from payload. |
| `workspace` | object | no | Per-agent workspace config |
| `mcpServers` | object | no | MCP server configurations |
| `description` | string | no | Agent documentation |
| `load` | array | no | Files to preload into context (globs supported) |
| `checkpoint` | boolean | no | Save session state on completion for forking |
| `fork_from` | string | no | Fork from another agent's checkpoint |
| `thinking` | boolean | no | Extended thinking (default: true). Set false to disable. |
| `max_turns` | number | no | API round-trip limit per invocation |
| `max_messages` | number | no | Outbound message limit per invocation |
| `orchestrator` | boolean | no | Restrict to Read + Write (msgs only). For coordinator agents that route, not implement. |
## Additional Config Fields
| Field | Type | Description |
|---|---|---|
| `dev_mode` | boolean | Force all agents to haiku for cheap workflow testing. Remove before production. |
| `brain` | boolean | Enable brain-update insights |
| `capabilities` | array | Agent capability tags |
| `config` | object | Custom mesh-specific settings |
| `idle_timeout_minutes` | number/false | Idle timeout (false = disabled) |
| `clear-before` | boolean | Clear state before run |
| `turn_workspace` | object | Turn-based game workspace |
| `parallelism` | array | Parallel execution blocks (see Parallel Execution) |
| `persistence` | boolean/array | Session persistence across mesh runs |
| `routing_fallback` | string | DEPRECATED — use `guardrails.routing_error.routing_fallback` |
| `routing_retry_max` | number | DEPRECATED — use `guardrails.routing_error.routing_retry_max` |
| `manifest_enforcement` | object | Artifact validation settings |
| `max_mesh_messages` | number/object | Mesh-wide message cap (guardrail) |
| `autoInjectManifestFiles` | boolean | Auto-preload manifest reads (default: true) |
## Route Validation
Verify all ask relationships have matching ask-response routes back.
Rule: If agent A asks agent B, then B must have an ask-response route back to A.
Manual check:
- List all `ask` relationships: `A → asks → B`
- List all `ask-response` routes: `B → responds-to → [X, Y, Z]`
- For each ask, verify the target can respond to the sender
Common mistakes:
- Coordinator asks worker, but worker only responds to a different coordinator
- Agent added to `ask` list but `ask-response` not updated
- Indirect flows (A → B → C → A) mistaken for direct flows
Example mismatch:
```yaml
# validator asks fixer
validator:
  ask:
    fixer: "Fix issues"

# fixer responds to reviewer, NOT validator — BUG!
fixer:
  ask-response:
    reviewer: "Fixes complete"  # ⚠ validator missing!
```
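For contrast, a corrected pairing in which fixer's `ask-response` names validator, the agent that actually asked (same illustrative agent names):

```yaml
# fixer responds back to validator, matching the ask
fixer:
  ask-response:
    validator: "Fixes complete"
```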
Intentional indirection (not a bug):
```yaml
# narrator → lint-coordinator → editor → narrator
# lint-coordinator responds to editor, not narrator (by design)
```
Document intentional indirections in playbook_notes.
## Prompt-to-Config Validation
Verify all agent references in prompts match agents defined in config.yaml.
Rule: Every `to: mesh/agent` in prompt examples must reference an agent that exists in the mesh's `config.yaml`.
Manual check:
```bash
# Extract agents from config
yq '.agents[].name' meshes/{mesh}/config.yaml | sort > /tmp/agents.txt

# Extract to: targets from prompts
rg "to: {mesh}/[a-z-]+" meshes/{mesh} --type md -o --no-filename \
  | sed 's/to: {mesh}\///' | sort | uniq > /tmp/targets.txt

# Find mismatches
comm -23 /tmp/targets.txt /tmp/agents.txt
```
Common mistakes:
- Generic `coordinator` when the mesh has phase coordinators (`init-coord`, `render-coord`, etc.)
- Outdated agent names after refactoring
- Copy-paste from other meshes with different agent names
Architectural principle:
Prompts should reference responsibilities, not agent names. Routing decisions (who handles what) belong in config.yaml, not prompts.
| Pattern | Guidance |
|---|---|
| `to: mesh/specific-agent` in examples | Acceptable for illustrating message format |
| `to: {from: field}` dynamic routing | Preferred for ask-response patterns |
| Prose describing "send to agent X" | Move WHO to config, keep WHAT in prompt |
Anti-pattern:
```markdown
# BAD: Hardcoded routing in prompt
When done, send ask-response to COORDINATOR.
```
Better:
```markdown
# GOOD: Reference responsibility, config handles routing
When done, send ask-response to the coordinator that sent the ask.
# Config routing section defines which coordinator that is.
```
## Guardrails

Unified runtime enforcement with strict/warning mode on every guardrail. Config: `.ai/tx/data/config.yaml` under `guardrails:`.
Mode (applies to all guardrails):
| strict | warning | Result |
|---|---|---|
| false | true | Default — Allow + inject feedback |
| true | true | Block/kill + reason |
| true | false | Block/kill silently |
| false | false | Disabled |
- Write gate: Intercepts Write/Edit/NotebookEdit and Bash redirects to undeclared paths.
- Read gate: Intercepts Read/Glob/Grep to undeclared paths.
- Routing error: Corrective injection on bad targets (max retries: 3) + per-edge message caps (`routing_retry_max`/`routing_fallback`).
- Artifact validation: Pre/post validation of agent outputs. Default: enabled, 2 retries.
- Max messages/turns: Global or per-agent caps. Accept a bare number or a `{strict, warning, limit}` object.
- Max mesh messages: Mesh-wide cap on total messages across all agents in a mesh run.
- Max turns (warning mode): SDK limit bypassed, turns tracked manually, event emitted at threshold.
- Parity: Always-on, non-configurable.
```yaml
guardrails:
  write_gate:
    strict: false
    warning: true
    kill_threshold: null
  read_gate:
    strict: false
    warning: true
    kill_threshold: null
  routing_error:
    strict: false
    warning: true
    max_retries: 3
  artifact:
    strict: false
    warning: true
    post_validation: true
    pre_validation: true
    max_retry: 2
  max_messages:
    strict: false
    warning: true
    limit: null
  max_turns:
    strict: false
    warning: true
    limit: null
  max_mesh_messages:
    strict: false
    warning: true
    limit: null
  meshes:
    my-mesh:
      write_gate:
        strict: true
        kill_threshold: 5
      agents:
        my-agent:
          write_gate:
            strict: false
            warning: true
            kill_threshold: 10
```
Override chain: agent > mesh > global > hardcoded default. `strict` and `warning` resolve independently.
Gates activate automatically when manifest entries exist — no additional mesh config needed.
Full reference: docs/guardrails.md
## Debugging

```bash
tx status  # Workers, queue
tx msg     # Message viewer
tx spy     # Real-time activity
tx logs    # System logs
```