gz-adr-eval — community AI agent skill for Claude Code, Cursor, Windsurf, and other IDEs

v1.0.0

About this Skill

Ideal for AI agents in human-AI collaboration workflows that need structured quality evaluation for ADRs and OBPI decompositions. Part of GovZero Kit (gzkit), a development covenant for human-AI collaboration: cognitive infrastructure for extended human-AI collaboration, a protocol that preserves human intent across agent context boundaries.

tvproductions
Updated: 3/26/2026

Agent Capability Analysis

The gz-adr-eval skill by tvproductions is an open-source community AI agent skill for Claude Code and other IDE workflows, helping agents execute tasks with better context, repeatability, and domain-specific guidance.

Ideal Agent Persona

Ideal for AI agents in human-AI collaboration workflows that need structured quality evaluation for ADRs and OBPI decompositions.

Core Value

Empowers agents to ensure trustworthy interactions by providing rubrics, challenge protocols, and a red-team prompt, enabling reliable human-AI collaboration through blocking QC steps and evaluation scorecards.

Capabilities Granted for gz-adr-eval

Evaluating newly authored ADRs
Retroactively assessing existing ADR packages
Generating evaluation scorecards for ADR documents

Prerequisites & Limits

  • Requires access to ADR documents
  • Limited to ADR and OBPI decomposition evaluation
SKILL.md

gz-adr-eval (v6.0.0)

Purpose

Structured quality evaluation for ADRs and their OBPI decompositions. This skill provides rubrics, challenge protocols, and a red-team prompt that form a blocking QC step between ADR authoring and human proposal/defense review.

It can be invoked for newly authored ADRs or for retroactive evaluation of an existing ADR package.


Trust Model

Layer 1 - Evidence Gathering. This tool reads ADR documents and produces evaluation scorecards. It does not modify ADR or brief content.

  • Reads: ADR document, OBPI briefs, evaluation framework template
  • Writes: EVALUATION_SCORECARD.md in the ADR directory
  • Does NOT modify: ADR content, brief content, registries, or ledgers

Invocation

```text
gz-adr-eval ADR-X.Y.Z              # evaluate a specific ADR
gz-adr-eval ADR-X.Y.Z --red-team   # include the 10-challenge red-team protocol
```

Arguments:

| Argument | Required | Description |
| --- | --- | --- |
| `adr_id` | Yes | ADR identifier (for example `ADR-0.13.0`) |
| `--red-team` | No | Include the 10-challenge adversarial review protocol |

When to Use

  • After drafting a new ADR and its OBPIs
  • Before moving a Draft ADR to Proposed / human defense review
  • When benchmarking the quality of an existing ADR package
  • When you want a structured red-team pass against scope, evidence, and decomposition

Procedure

Step 1: Locate the ADR and its OBPIs

  1. Resolve the ADR document under docs/design/adr/**/ADR-X.Y.Z-*/ADR-X.Y.Z-*.md
  2. List all OBPI briefs in obpis/ (preferred) or briefs/ (legacy)
  3. Read the evaluation framework from assets/ADR_EVALUATION_FRAMEWORK.md

Step 2: Score ADR Quality (Part 1 - 8 Dimensions)

Read Part 1 of the framework and score the ADR on each dimension (1-4 scale):

| # | Dimension | Weight |
| --- | --- | --- |
| 1 | Problem Clarity | 15% |
| 2 | Decision Justification | 15% |
| 3 | Feature Checklist Completeness | 15% |
| 4 | OBPI Decomposition Quality | 15% |
| 5 | Lane Assignment Correctness | 10% |
| 6 | Scope Discipline | 10% |
| 7 | Evidence Requirements | 10% |
| 8 | Architectural Alignment | 10% |

For each dimension, work through the checklist items in the framework and score based on how many checklist items pass with path-level evidence.
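
As a sketch, the weighted total for Part 1 can be computed like this. The dictionary layout and the `adr_weighted_total` name are illustrative, not part of the framework; only the dimensions and weights come from the table above.

```python
# Weights from Part 1 of the evaluation framework (sum to 1.0).
ADR_WEIGHTS = {
    "Problem Clarity": 0.15,
    "Decision Justification": 0.15,
    "Feature Checklist Completeness": 0.15,
    "OBPI Decomposition Quality": 0.15,
    "Lane Assignment Correctness": 0.10,
    "Scope Discipline": 0.10,
    "Evidence Requirements": 0.10,
    "Architectural Alignment": 0.10,
}


def adr_weighted_total(scores: dict) -> float:
    """Combine per-dimension scores (1-4 scale) into a weighted total."""
    missing = ADR_WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"Unscored dimensions: {sorted(missing)}")
    return sum(ADR_WEIGHTS[d] * scores[d] for d in ADR_WEIGHTS)
```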

Step 3: Score OBPI Quality (Part 2 - 5 Dimensions)

For each OBPI, score on 5 dimensions (1-4 scale):

| Dimension | Question |
| --- | --- |
| Independence | Can this OBPI be completed without waiting for others? |
| Testability | Can completion be verified with commands? |
| Value | What concrete capability would be lost if this OBPI were removed? |
| Size | Is this a 1-3 day work unit? |
| Clarity | Could a different agent implement this without ambiguity? |
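
These scores feed the OBPI threshold applied in Step 5 (average >= 3.0 per OBPI; any single dimension score of 1 forces revision). A minimal sketch, with a hypothetical helper name:

```python
OBPI_DIMENSIONS = ("Independence", "Testability", "Value", "Size", "Clarity")


def obpi_needs_revision(scores: dict) -> bool:
    """True if the OBPI fails the Step 5 threshold: average below 3.0
    on the 1-4 scale, or any single dimension scoring 1."""
    values = [scores[d] for d in OBPI_DIMENSIONS]
    avg = sum(values) / len(values)
    return avg < 3.0 or min(values) == 1
```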

Step 4: Run Red-Team Challenges (Optional - Part 3)

If --red-team is specified, or if the evaluator wants stronger adversarial review, work through all 10 structured challenges from the framework.

Every challenge must be engaged. N/A is not acceptable.

Step 5: Determine Verdict

Apply the framework thresholds:

| ADR Weighted Total | Verdict |
| --- | --- |
| >= 3.0 | GO - Ready for proposal/defense review |
| >= 2.5 and < 3.0 | CONDITIONAL GO - Address weaknesses, then re-evaluate |
| < 2.5 | NO GO - Structural revision required |

OBPI threshold: Average >= 3.0 per OBPI. Any OBPI scoring 1 on any dimension must be revised.

Red-team threshold: <= 2 failures = GO, 3-4 = CONDITIONAL GO, >= 5 = NO GO.
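
The threshold rules above can be sketched as plain functions (the function names are illustrative):

```python
def adr_verdict(weighted_total: float) -> str:
    """Map the ADR weighted total to a verdict per the framework thresholds."""
    if weighted_total >= 3.0:
        return "GO"
    if weighted_total >= 2.5:
        return "CONDITIONAL GO"
    return "NO GO"


def red_team_verdict(failures: int) -> str:
    """Map the count of failed red-team challenges to a verdict:
    <= 2 failures is GO, 3-4 CONDITIONAL GO, >= 5 NO GO."""
    if failures <= 2:
        return "GO"
    if failures <= 4:
        return "CONDITIONAL GO"
    return "NO GO"
```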

Step 6: Record Scorecard

Write EVALUATION_SCORECARD.md in the ADR directory using the summary template from the framework. Include:

  • All ADR dimension scores with weighted totals
  • All OBPI dimension scores with averages
  • Red-team challenge results when run
  • Overall verdict (GO / CONDITIONAL GO / NO GO)
  • Action items for any deficiencies

Step 7: Gate Decision

  • GO: proceed to human proposal/defense review
  • CONDITIONAL GO: revise the ADR or OBPIs, then re-run evaluation
  • NO GO: return to authoring; do not proceed to proposal/defense

External Red-Team Review

For adversarial review by a separate model:

  1. Read the red-team model prompt from Part 4 of the framework
  2. Copy the prompt and append the target ADR document
  3. Send it to a separate model
  4. Review the challenge report and address findings before proposal/defense

Assets

  • Evaluation Framework: assets/ADR_EVALUATION_FRAMEWORK.md
    • Part 1: ADR quality rubric (8 weighted dimensions)
    • Part 2: OBPI quality rubric (5 dimensions)
    • Part 3: Red-team challenge protocol (10 challenges)
    • Part 4: Red-team model prompt
    • Part 5: Scoring summary template

Failure Modes

| Failure | Cause | Resolution |
| --- | --- | --- |
| Framework not found | assets/ADR_EVALUATION_FRAMEWORK.md missing | Repair skill directory structure |
| ADR not found | No ADR package matches the requested ID | Verify the ADR exists and uses canonical naming |
| No OBPIs found | obpis/ and briefs/ are empty | ADR must have co-created OBPIs before evaluation |
| Scorecard not written | Path or permission error | Verify the ADR directory is writable |
| NO GO verdict | ADR has structural deficiencies | Revise the ADR and re-run evaluation before proposal/defense |

Acceptance Rules

  • All 8 ADR dimensions are scored with evidence-based rationale
  • All OBPIs are scored on all 5 dimensions
  • EVALUATION_SCORECARD.md is written to the ADR directory
  • Verdict follows the threshold rules with no manual override
  • NO GO blocks progression to human proposal/defense

Related Skills

| Skill | Relationship |
| --- | --- |
| gz-adr-create | Authoring workflow that should invoke evaluation before proposal |
| gz-adr-manager | Compatibility alias that inherits gz-adr-create behavior |
| gz-adr-check | Evidence/coverage checks; a different concern |
| gz-adr-audit | Post-completion audit; a downstream phase |
| gz-adr-closeout-ceremony | Closeout occurs after implementation, not authoring |

References

  • Evaluation framework: assets/ADR_EVALUATION_FRAMEWORK.md
  • ADR lifecycle: docs/governance/GovZero/adr-lifecycle.md
  • GovZero charter: docs/governance/GovZero/charter.md
  • Parity origin: ../airlineops/.github/skills/gz-adr-eval/SKILL.md

FAQ & Installation Steps

These questions and steps mirror the structured data on this page for better search understanding.

Frequently Asked Questions

What is gz-adr-eval?

gz-adr-eval provides structured quality evaluation for ADRs and OBPI decompositions, aimed at AI agents in human-AI collaboration workflows. It is part of GovZero Kit (gzkit), a development covenant for human-AI collaboration: cognitive infrastructure that preserves human intent across agent context boundaries.

How do I install gz-adr-eval?

Run the command: npx killer-skills add tvproductions/gzkit/gz-adr-eval. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for gz-adr-eval?

Key use cases include: Evaluating newly authored ADRs, Retroactively assessing existing ADR packages, Generating evaluation scorecards for ADR documents.

Which IDEs are compatible with gz-adr-eval?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for gz-adr-eval?

Requires access to ADR documents. Limited to ADR and OBPI decomposition evaluation.

How To Install

  1. Open your terminal: open the terminal or command line in your project directory.

  2. Run the install command: `npx killer-skills add tvproductions/gzkit/gz-adr-eval`. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill: the skill is now active, and your AI agent can use gz-adr-eval immediately in the current project.
