eval-audit-review

v1.0.0

About this Skill

SoulMap AI: a content-first reflective companion with a curated Markdown knowledge base, Python detectors, and tooling to validate and bundle agent-ready skills.

# Core Topics

tuanductran
Updated: 4/3/2026

Killer-Skills Review

Decision support comes first. Repository text comes second.

Reference-Only Page Review Score: 1/11

This page remains useful for operators, but Killer-Skills treats it as reference material instead of a primary organic landing page.

Review Score
1/11
Quality Score
29
Canonical Locale
en
Detected Body Locale
en


Why use this skill

SoulMap AI: a content-first reflective companion with a curated Markdown knowledge base, Python detectors, and tooling to validate and bundle agent-ready skills.

Best for

Suitable for operator workflows that need explicit guardrails before installation and execution.

Actionable use cases for eval-audit-review

! Safety and Limitations

Why this page is reference-only

  • Current locale does not satisfy the locale-governance contract.
  • The page lacks a strong recommendation layer.
  • The page lacks concrete use-case guidance.
  • The page lacks explicit limitations or caution signals.
  • The underlying skill quality score is below the review floor.

Source Boundary

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.

Labs Demo

Experience this Agent in a zero-setup browser environment powered by WebContainers. No installation required.

FAQ & Installation Steps

These questions and steps mirror the structured data on this page for better search understanding.

? Frequently Asked Questions

What is eval-audit-review?

SoulMap AI: a content-first reflective companion with a curated Markdown knowledge base, Python detectors, and tooling to validate and bundle agent-ready skills.

How do I install eval-audit-review?

Run the command: npx killer-skills add tuanductran/soulmap-ai/eval-audit-review. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

Which IDEs are compatible with eval-audit-review?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

How To Install

  1. Open your terminal

    Open the terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add tuanductran/soulmap-ai/eval-audit-review. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

    The skill is now active. Your AI agent can use eval-audit-review immediately in the current project.

! Reference-Only Mode

This page remains useful for installation and reference, but Killer-Skills no longer treats it as a primary indexable landing page. Read the review above before relying on the upstream repository instructions.

Imported Repository Instructions

The section below is supporting source material from the upstream repository. Use the Killer-Skills review above as the primary decision layer.

Supporting Evidence

eval-audit-review

Install eval-audit-review, an AI agent skill for AI agent workflows and automation. Works with Claude Code, Cursor, and Windsurf with one-command setup.

SKILL.md

Eval audit review

Use this skill when the task is to inspect or improve the trustworthiness of SoulMap's eval system rather than only adding one more case.

Do not use this skill for

  • adding a single routine eval case without auditing the surrounding eval system

Mission

Keep evals honest, source-backed, and useful against real failure modes instead of optimizing for easy green runs.

Sources to check first

  • evals/README.md
  • evals/datasets/
  • tests/contract/
  • tests/eval_regression/
  • src/soulmap/devtools/evals/
  • the source Markdown or Python files each eval claims to protect

What to look for

  • evals that pass because assertions are too loose
  • cases with no clear source backing in AGENTS.md, skills/, or templates/
  • wording checks that drift from runtime examples
  • evaluator logic that is brittle, fuzzy, or easy to satisfy accidentally
  • important failure modes that appear in code or docs but are not represented in datasets
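To make the loose-assertion point above concrete, here is a minimal Python sketch (all names are hypothetical, not taken from the SoulMap codebase) contrasting an easy-to-game check with a tighter one:

```python
# Hypothetical eval case: verify a detector's output mentions the expected
# failure mode, rather than merely being a non-empty string.

def loose_check(output: str) -> bool:
    # Too loose: passes for ANY non-empty output, so the eval is trivially
    # satisfied and protects nothing.
    return bool(output.strip())

def tight_check(output: str) -> bool:
    # Tighter: requires the specific marker phrase the source docs promise,
    # and rejects outputs too short to be a real finding.
    required = "unsupported claim"
    return required in output.lower() and len(output.split()) > 3

sample = "Detected an unsupported claim in section 2."
assert loose_check(sample) and tight_check(sample)
# A bare "ok" games the loose check but fails the tight one.
assert loose_check("ok") and not tight_check("ok")
```

The point is not the specific phrase check, but that a tightened assertion fails for outputs the loose version would have waved through.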

Workflow

  1. Identify the failure mode or product contract the eval is supposed to protect.
  2. Check whether the current dataset, harness, and source files all describe the same thing.
  3. Tighten assertions only where the behavior is actually important.
  4. Prefer a few sharp cases over many noisy ones.
  5. Add or update source_markers when confidence needs to be explicit.
  6. Run the matching eval commands, then the closest pytest contracts.
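Step 5's source_markers can be sketched as follows; the field names here are illustrative assumptions, not the repository's actual dataset schema:

```python
# Hypothetical eval dataset case that ties its assertion back to a real
# source file, so a reviewer can verify the expectation is not invented.
case = {
    "id": "mission-wording-001",
    "input": "State the mission of the eval system.",
    "expected_contains": ["honest", "source-backed"],
    # Explicit source backing points at the files the eval claims to protect.
    "source_markers": [
        {"path": "evals/README.md", "anchor": "Mission"},
    ],
}

def audit_case(case: dict) -> list[str]:
    """Return audit findings for a single dataset case."""
    findings = []
    if not case.get("source_markers"):
        findings.append("no source backing")
    if len(case.get("expected_contains", [])) == 0:
        findings.append("assertion too loose")
    return findings

assert audit_case(case) == []
assert audit_case({"id": "x"}) == ["no source backing", "assertion too loose"]
```

A small auditor like this makes blind spots mechanical to find: any case with no source markers or no concrete expectation surfaces immediately.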

Expected output

Findings

List the eval weaknesses first, especially loose assertions, stale source links, or blind spots.

Fixes

Summarize the dataset, harness, or contract changes that improved audit quality.

Validation

State which eval and pytest commands were run.

Definition of done

The audited eval surface should be:

  • harder to game accidentally
  • clearly tied back to real source files or runtime behavior
  • focused on meaningful failure modes
  • validated with the exact commands maintainers actually use

Related skills

Looking for an alternative to eval-audit-review or another community skill for your workflow? Explore these related open-source skills.

View all

openclaw-release-maintainer

openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

widget-generator

f

Generate customizable widget plugins for the prompts.chat flow system

flags

vercel

The React Framework

138.4k
0
Browser

pr-review

pytorch

Tensors and dynamic neural networks in Python with strong GPU acceleration

98.6k
0
Developer