# Audit
You coordinate one or more codebase audits. Ask scoping questions one at a time, then run all agents in parallel without further user interaction.
$ARGUMENTS is optional context — specific concerns, repo path, or which audits to run.
## Process

### Step 1: Select Audits
Ask the user which audits to run. This is always the first and only question in the first message.
```text
Which audits should I run?

A) All three (health + eval + docs)
B) Code evaluation — 12-pillar scoring across 3 lenses
C) Technical debt — audit across 4 vectors
D) Documentation — drift detection across 6 phases
```
If $ARGUMENTS already specifies which audits (e.g., "/audit all"), skip this question and proceed to Step 2.
Wait for the user's answer before continuing.
### Step 2: Ask Follow-Up Questions One at a Time
Based on which audits were selected, ask the relevant scoping questions one per message. Wait for each answer before asking the next.
Start with the universal question, then ask audit-specific questions.
Universal (always ask first):
- Known pain points — gives all auditors a starting hypothesis instead of scanning cold.
```text
Are there parts of the codebase you already know are problematic?
Things that keep breaking, areas you dread touching, modules that slow down every PR.

A) Yes (tell me which areas and what's wrong)
B) No — scan everything with fresh eyes
```
If eval selected (B or A):
The code evaluation runs 3 evaluator agents in parallel, each scoring 4 pillars (12 total). The scores calibrate to the role level you select.
- Role level — sets the scoring bar. A "Senior" evaluation expects production-hardened patterns; a "Junior" evaluation focuses on fundamentals.
```text
What role level should I evaluate this codebase against?

A) Junior Developer — fundamentals: readability, basic error handling, test presence
B) Mid-Level Developer — patterns: separation of concerns, consistent conventions, test coverage
C) Senior Developer — production: defensive coding, observability, performance awareness, type rigor
D) Staff+ / Principal — systems: architectural coherence, scalability, operational excellence
```
- Focus areas — narrows what the evaluators pay extra attention to. They still score all 12 pillars regardless.
```text
Any specific concerns the evaluators should weight more heavily?

A) Performance — hot paths, algorithmic complexity, resource management
B) Security — input validation, auth patterns, secrets handling
C) Testing — coverage quality, test architecture, edge cases
D) Architecture — separation of concerns, modularity, coupling
E) Multiple (tell me which)
F) None — balanced evaluation across all pillars
```
- Scope and exclusions — what to evaluate and what to skip.
```text
What should the evaluators look at?

A) Full repo, standard exclusions (vendor, generated, node_modules, __pycache__)
B) Full repo, no exclusions
C) Specific directories only (tell me which to include or exclude)
```
- Pillar overrides — by default, the pipeline remediates until all 12 pillars hit 9/10. Some pillars (like Creativity) may not be improvable through code changes. Override lets you set a lower threshold or exclude a pillar from the remediation gate entirely.
The 12 pillars are:
- Hire lens: Problem-Solution Fit, Architecture, Code Quality, Creativity
- Stress lens: Pragmatism, Defensiveness, Performance, Type Rigor
- Day 2 lens: Test Value, Reproducibility, Git Hygiene, Onboarding
```text
Any pillars to accept below the default 9/10 threshold?

A) None — require 9/10 on all 12 pillars
B) Specific overrides (tell me which pillars and target scores, e.g., "Creativity: 7, Git Hygiene: accept")
```
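To make the gate concrete, here is a minimal sketch of how overrides could interact with the default 9/10 remediation gate. This is illustrative only, not the pipeline's actual implementation; `pillars_below_target` and the shape of the score dict are hypothetical.

```python
DEFAULT_THRESHOLD = 9

# The 12 pillars, grouped by lens (Hire, Stress, Day 2).
PILLARS = [
    "Problem-Solution Fit", "Architecture", "Code Quality", "Creativity",
    "Pragmatism", "Defensiveness", "Performance", "Type Rigor",
    "Test Value", "Reproducibility", "Git Hygiene", "Onboarding",
]

def pillars_below_target(scores: dict[str, int], overrides: dict[str, int]) -> list[str]:
    """Return pillars still below their target; an override of 0 means 'accept as-is'."""
    return [p for p in PILLARS
            if scores.get(p, 0) < overrides.get(p, DEFAULT_THRESHOLD)]
```

So with `{"Creativity": 7}` as an override, a Creativity score of 7 no longer blocks remediation, while every other pillar is still held to 9/10.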
If health selected (C or A):
The health audit scans for technical debt across 4 vectors: architectural, structural, operational, and code hygiene. Findings are prioritized by severity (CRITICAL > HIGH > MEDIUM > LOW). The pipeline remediates until all CRITICAL and HIGH findings are resolved.
- Goal — determines which debt vectors the auditor emphasizes.
```text
What's the primary goal for this audit?

A) General health check — scan all 4 vectors equally
B) Production hardening — emphasize operational debt (error handling, timeouts, resource leaks, observability)
C) Onboarding prep — emphasize structural and hygiene debt (naming, dead code, documentation, test coverage)
D) Pre-release cleanup — focus on CRITICAL/HIGH items only, skip MEDIUM/LOW
```
- Deployment target — changes what "operational debt" means. A Lambda function has different concerns than a long-running container.
```text
What's the deployment target?

A) Serverless (Lambda, Cloud Functions) — cold starts, execution limits, stateless constraints
B) Containers (ECS, Kubernetes, Docker) — resource management, health checks, graceful shutdown
C) Static hosting / SPA — build pipeline, CDN, client-side concerns
D) Monolith / traditional server — process management, connection pooling, memory leaks
E) Multiple (tell me which)
F) Not deployed yet / unsure
```
- Scope and constraints — what to audit and what's off-limits, in one question.
```text
What should the health auditor cover, and is anything off-limits?

A) Full repo, no constraints
B) Full repo, but skip specific areas (tell me which — e.g., "don't touch the legacy auth module")
C) Specific directories only (tell me which)
```
- Existing tooling — helps the fortifier (hardening phase) know what guardrails already exist so it doesn't duplicate work.
```text
What development tooling is already in place?

A) Full setup — linters, CI pipeline, pre-commit hooks, type checking
B) Partial (tell me what you have — e.g., "ESLint but no CI")
C) None — no linting, CI, or hooks configured
```
If docs selected (D or A):
The doc audit runs 6 detection phases: discovery, comparison (drift/gaps/stale), code examples, link integrity, config/environment, and structure. It compares documentation claims against actual code behavior.
- Scope and constraints — what docs to audit and what's off-limits.
```text
What documentation should I audit, and is anything off-limits?

A) All docs, no constraints
B) All docs, but skip specific files (tell me which)
C) Specific directories only (tell me which)
D) README and API docs only
```
- Language stack — determines which auto-generation tools are available (typedoc for TS, sphinx for Python, swagger for REST APIs).
```text
What's the primary language stack?

A) JS/TS — typedoc, swagger-jsdoc available
B) Python — sphinx, mkdocstrings available
C) Both
```
- Prevention tooling — what automated checks to add so documentation drift becomes a CI failure instead of a periodic cleanup.
```text
What drift prevention tooling should I add after fixing the docs?

A) Markdown linting (markdownlint) + link checking (lychee) — catches formatting issues and broken links on every PR
B) Auto-generated API docs (typedoc/sphinx) — single source of truth lives in code, not prose
C) Both A and B
D) None — just fix the existing docs, no new tooling
```
### Step 3: Generate Plan Identifier

After all questions are answered, generate the directory name: `YYYY-MM-DD-audit-slug`
- Date: today's date
- Slug: short name for the repo (e.g., `audit-ragstack`, `audit-my-app`)
- Location: `docs/plans/YYYY-MM-DD-audit-slug/`

Create the directory.
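The naming and creation step can be sketched as follows (a minimal sketch; the slug `ragstack` is a hypothetical repo name):

```python
from datetime import date
from pathlib import Path

def make_plan_dir(slug: str, root: str = "docs/plans") -> Path:
    """Build and create the plan directory, e.g. docs/plans/2025-01-15-audit-ragstack/."""
    name = f"{date.today().isoformat()}-audit-{slug}"
    plan_dir = Path(root) / name
    plan_dir.mkdir(parents=True, exist_ok=True)
    return plan_dir

# Usage: make_plan_dir("ragstack") -> Path("docs/plans/YYYY-MM-DD-audit-ragstack")
```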
### Step 4: Read Role Prompts

Before spawning agents, read all required role prompt files. Only read prompts for selected audits.
- If health selected: read `skills/pipeline/health-auditor.md`
- If eval selected: read `skills/pipeline/eval-hire.md`, `skills/pipeline/eval-stress.md`, `skills/pipeline/eval-day2.md`
- If docs selected: read `skills/pipeline/doc-auditor.md`
### Step 5: Spawn All Agents in Parallel
All auditor/evaluator agents are read-only — they explore the codebase but don't modify it. Spawn all selected agents in a single parallel batch (up to 5 agents for "all"):
```text
+-------------------------------------------------------------+
|                    PARALLEL AGENT SPAWN                     |
+-------------------------------------------------------------+
|                                                             |
|  health auditor ─┐                                          |
|  eval hire ──────┤                                          |
|  eval stress ────┤   all agents run simultaneously          |
|  eval day2 ──────┤                                          |
|  doc auditor ────┘                                          |
|                  ↓                                          |
|  orchestrator collects all responses, writes intake docs    |
|                                                             |
+-------------------------------------------------------------+
```
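The batch itself can be sketched as below (a sketch under the assumption that each agent invocation is a blocking call; `run_agent` is a stand-in, not a real API, and the prompt strings are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(name: str, prompt: str) -> str:
    """Stand-in for a real agent invocation; returns the agent's raw output."""
    return f"[{name} output]"

# One prompt per selected agent, all dispatched in a single parallel batch.
prompts = {
    "health": "<role_prompt>...</role_prompt><task>...</task>",
    "eval_hire": "...",
    "eval_stress": "...",
    "eval_day2": "...",
    "docs": "...",
}

with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    futures = {name: pool.submit(run_agent, name, p) for name, p in prompts.items()}
    # The orchestrator blocks here until every agent has responded.
    outputs = {name: f.result() for name, f in futures.items()}
```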
Agent 1: Health Auditor (if health selected)
```xml
<role_prompt>
[Contents of health-auditor.md]
</role_prompt>

<task>
Audit the codebase in the current working directory.
Goal: [from Step 2]
Scope: [from Step 2]
Existing tooling: [from Step 2]
Constraints: [from Step 2]
</task>
```
Agent 2: Eval — The Pragmatist (if eval selected)
```xml
<role_prompt>
[Contents of eval-hire.md]
</role_prompt>

<task>
Evaluate the codebase in the current working directory.
Role level: [from Step 2]
Focus areas: [from Step 2]
Exclusions: [from Step 2]
</task>
```
Agent 3: Eval — The Oncall Engineer (if eval selected)
```xml
<role_prompt>
[Contents of eval-stress.md]
</role_prompt>

<task>
Evaluate the codebase in the current working directory.
Role level: [from Step 2]
Focus areas: [from Step 2]
Exclusions: [from Step 2]
</task>
```
Agent 4: Eval — The Team Lead (if eval selected)
```xml
<role_prompt>
[Contents of eval-day2.md]
</role_prompt>

<task>
Evaluate the codebase in the current working directory.
Role level: [from Step 2]
Focus areas: [from Step 2]
Exclusions: [from Step 2]
</task>
```
Agent 5: Doc Auditor (if docs selected)
```xml
<role_prompt>
[Contents of doc-auditor.md]
</role_prompt>

<task>
Audit documentation in the current working directory against codebase reality.
Doc scope: [from Step 2]
Constraints: [from Step 2]
</task>
```
### Step 6: Validate and Write Intake Docs

After all agents complete, verify each agent's output contains its completion signal:
- Health auditor: check for `AUDIT_COMPLETE`
- Eval hire: check for `EVAL_HIRE_COMPLETE`
- Eval stress: check for `EVAL_STRESS_COMPLETE`
- Eval day2: check for `EVAL_DAY2_COMPLETE`
- Doc auditor: check for `DOC_AUDIT_COMPLETE`
If any signal is missing, the agent may have been truncated. Report the incomplete agent to the user and do NOT write that intake doc with partial data. Other intake docs with valid signals can still be written.
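The signal check can be sketched as follows (a minimal sketch; the signal strings come from the list above, and agent output is assumed to be a plain string):

```python
# Map each agent to the completion signal its output must contain.
COMPLETION_SIGNALS = {
    "health": "AUDIT_COMPLETE",
    "eval_hire": "EVAL_HIRE_COMPLETE",
    "eval_stress": "EVAL_STRESS_COMPLETE",
    "eval_day2": "EVAL_DAY2_COMPLETE",
    "docs": "DOC_AUDIT_COMPLETE",
}

def validate_outputs(outputs: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split agents into (complete, truncated) based on their completion signal."""
    complete, truncated = [], []
    for agent, text in outputs.items():
        signal = COMPLETION_SIGNALS[agent]
        (complete if signal in text else truncated).append(agent)
    return complete, truncated
```

Only agents in the first list get intake docs; agents in the second are reported to the user.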
For agents with valid signals, write the intake docs:
- Health: write `docs/plans/YYYY-MM-DD-audit-slug/health-audit.md` with `type: repo-health` in frontmatter
- Eval: combine all 3 evaluator outputs into `docs/plans/YYYY-MM-DD-audit-slug/eval.md` with `type: repo-eval` and `pillar_overrides` in frontmatter
- Docs: write `docs/plans/YYYY-MM-DD-audit-slug/doc-audit.md` with `type: doc-health` in frontmatter
See the individual intake skill SKILL.md files (repo-health, repo-eval, doc-health) for the exact output templates.
### Step 7: Log to Manifest

Append an entry to `.claude/skill-runs.json` in the repo root. If the file does not exist, create it with an empty array first. Each entry records when a skill was run so that skill usage can be tracked across repos and OS wipes.
```json
{
  "skill": "audit",
  "date": "YYYY-MM-DD",
  "plan": "YYYY-MM-DD-audit-slug",
  "audits": ["health", "eval", "docs"]
}
```
- `audits`: list which audits were selected (a subset of `health`, `eval`, `docs`)
- Read the existing file, parse the JSON array, append the new entry, and write it back
- If the file is malformed, overwrite it with a fresh array containing only the new entry
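The append-with-fallback logic described above can be sketched as (a minimal sketch; the entry fields follow the JSON shape shown earlier):

```python
import json
from pathlib import Path

MANIFEST = Path(".claude/skill-runs.json")

def log_skill_run(entry: dict) -> None:
    """Append an entry to the manifest, recovering from a missing or malformed file."""
    runs = []
    if MANIFEST.exists():
        try:
            runs = json.loads(MANIFEST.read_text())
            if not isinstance(runs, list):
                runs = []  # malformed shape: start a fresh array
        except json.JSONDecodeError:
            runs = []      # unparseable: start a fresh array
    MANIFEST.parent.mkdir(parents=True, exist_ok=True)
    runs.append(entry)
    MANIFEST.write_text(json.dumps(runs, indent=2) + "\n")
```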
### Step 8: Handoff

```text
Audit complete: docs/plans/YYYY-MM-DD-audit-slug/

Intake docs produced:
- [health-audit.md — X critical, Y high, Z medium, W low]
- [eval.md — N/12 pillars at target]
- [doc-audit.md — X drift, Y gaps, Z stale, W broken links]

To remediate, run:
/pipeline YYYY-MM-DD-audit-slug

The pipeline will create one unified plan across all audit types.
```
## Rules
- DO ask the audit selection question first, alone
- DO ask follow-up questions one at a time, waiting for each answer
- DO NOT prompt the user again after all questions are answered — run all agents autonomously
- DO NOT start remediation — your only output is the intake docs
- DO NOT re-run evaluator or auditor agents after writing the intake docs — they run exactly once during this skill. Re-evaluation happens later in `/pipeline` after all remediation is complete.
- DO embed role prompt contents in agent prompts (agents cannot access skill directory files)
- DO produce all intake docs in the same plan directory
- DO report results after each audit completes