eval-harness
[Featured] Eval Harness is an evaluation framework for Claude Code sessions that measures the reliability and performance of AI agents.
Localized summary: Patient safety evaluation harness for healthcare application deployments. It covers ai-agents, anthropic, claude workflows. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
Localized summary: GAN-inspired Generator-Evaluator agent harness for building high-quality applications autonomously. It covers ai-agents, anthropic, claude workflows. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
Localized summary: Transform Claude Code into a fully autonomous agent system with persistent memory, scheduled operations, computer use, and task queuing. It covers ai-agents, anthropic, claude workflows.
Localized summary: Design and optimize AI agent action spaces, tool definitions, and observation formatting for higher completion rates. It covers ai-agents, anthropic, claude workflows.
Localized summary: Audit the active repo, MCP servers, plugins, connectors, env surfaces, and harness setup, then recommend the highest-value ECC-native skills, hooks, agents, and operator workflows. It covers ai-agents, anthropic, claude workflows. This AI agent skill supports Claude Code
It covers declarative, frontend, javascript workflows. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.
A React test agent is a tool that runs tests for the React core.