eval-harness
[Featured] Eval Harness is an evaluation framework for AI code development
Browse and install thousands of AI Agent skills from the Killer-Skills directory. Supports Claude Code, Windsurf, Cursor, and more.
Localized skill summary: Core principle: Verify before implementing. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
Localized skill summary: Core principle: Evidence before claims, always. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
Localized skill summary: Core principle: Systematic directory selection + safety verification = reliable isolation. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
Localized skill summary: Core principle: Verify tests → Present options → Execute choice → Clean up. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
Localized skill summary: Applies the "deterministic collection + LLM judgment" principle: scripts collect facts exhaustively, then an LLM cross-reads the full context and produces verdicts. It covers ai-agents, anthropic, and claude workflows. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
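
The "deterministic collection + LLM judgment" principle in the last summary can be illustrated with a minimal Python sketch. This is not the skill's actual implementation: the names `Fact`, `collect_facts`, `build_context`, `evaluate`, and `stub_judge` are hypothetical, and the judge is a plain function standing in for a real LLM call, which is assumed to be supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Fact:
    source: str  # name of the collector that produced the fact
    detail: str  # the observation itself


def collect_facts(collectors: Dict[str, Callable[[], List[str]]]) -> List[Fact]:
    """Run every collector in sorted name order so the result is reproducible."""
    facts: List[Fact] = []
    for name in sorted(collectors):
        for detail in collectors[name]():
            facts.append(Fact(source=name, detail=detail))
    return facts


def build_context(facts: List[Fact]) -> str:
    """Flatten all facts into one text block for the judge to read as a whole."""
    return "\n".join(f"[{f.source}] {f.detail}" for f in facts)


def evaluate(
    collectors: Dict[str, Callable[[], List[str]]],
    judge: Callable[[str], str],
) -> str:
    """Deterministic collection first, then a single judgment over the full context."""
    return judge(build_context(collect_facts(collectors)))


def stub_judge(context: str) -> str:
    # Hypothetical placeholder: a real setup would send `context` to an LLM here.
    return "FAIL" if "failed" in context else "PASS"
```

Keeping the judge as an injected callable means the collection step can be tested without any model access, and a real LLM client can be swapped in later without touching the collectors.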