event-engine-baseline-testing: a honeclaw community skill for Claude Code

v1.0.0

About this Skill

A skill for AI agents doing event-engine push quality work in the honeclaw repository. It defines the test matrix, the reusable news-classifier baseline, and the commit hygiene rules to follow when changing poller, router, digest, classifier, or event-engine config code.

Features

Apply to event-engine push quality work, such as changes under `crates/hone-event-engine/src/pollers/*`
Test event-engine logic using `cargo test -p hone-event-engine --lib`
Verify config schema/default changes using `cargo test -p hone-core --lib`
Check Rust formatting using `cargo fmt --all -- --check`
Prefix shell commands with `rtk`, except native `git push`

B-M-Capital-Research
Updated: 4/24/2026


Core Value

event-engine-baseline-testing helps agents run event engine baseline testing, so developers can check and improve event engine quality. This AI agent skill supports Claude Code, Cursor, and Windsurf workflows.

Capabilities Granted for event-engine-baseline-testing

Applying event engine baseline testing to event-engine push quality work, especially changes touching crates/hone-event-engine/src/pollers/

Prerequisites & Limits

  • Do not commit if main is behind, or if staged/unrelated files would make an isolated commit unsafe.
  • expected_llm_title_only_after_engine is the expected result for the saved title-only rerun script, not for the original live run.
  • The live model drift check runs against the saved title-only samples and is manual validation only.

FAQ & Installation Steps

Frequently Asked Questions

How do I install event-engine-baseline-testing?

Run the command: npx killer-skills add B-M-Capital-Research/honeclaw/event-engine-baseline-testing. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for event-engine-baseline-testing?

Key use cases include applying event engine baseline testing to event-engine push quality work, especially changes touching crates/hone-event-engine/src/pollers/.

Which IDEs are compatible with event-engine-baseline-testing?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for event-engine-baseline-testing?

Do not commit if main is behind, or if staged/unrelated files would make an isolated commit unsafe. The expected_llm_title_only_after_engine field is the expected result for the saved title-only rerun script. The live model drift check runs against the saved title-only samples.

How To Install

  1. Open your terminal

    Open the terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add B-M-Capital-Research/honeclaw/event-engine-baseline-testing. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

    The skill is now active. Your AI agent can use event-engine-baseline-testing immediately in the current project.

Upstream Repository Material

The section below is imported from the upstream repository and should be treated as secondary evidence. Use the Killer-Skills review above as the primary layer for fit, risk, and installation decisions.


Event Engine Baseline Testing

Use this skill for event-engine push quality work, especially changes touching:

  • crates/hone-event-engine/src/pollers/*
  • crates/hone-event-engine/src/router.rs
  • crates/hone-event-engine/src/digest.rs
  • crates/hone-event-engine/src/news_classifier.rs
  • crates/hone-core/src/config/event_engine.rs
  • crates/hone-web-api/src/lib.rs
  • tests/fixtures/event_engine/news_classifier_baseline_*.json

Always obey the repository rule: prefix shell commands with rtk except native git push.

Start Checklist

Before changing code or fixtures, write a short todo that includes:

  1. Goal: which behavior is being changed or verified.
  2. Files: implementation files plus baseline/test files.
  3. Verification: exact commands, including whether a live LLM run is needed.
  4. Documentation: update the active plan/memory when the work affects the event-engine quality track; otherwise state why no doc sync is needed.

Check the worktree first:

bash
rtk git status --short --branch

Do not commit if main is behind, or if staged/unrelated files would make an isolated commit unsafe.
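The worktree check can be sketched in code. This is a minimal illustration, assuming the first line of the `git status --short --branch` output is the usual `## branch...upstream [behind N]` header and that each following line uses the two-column short format. The function name `is_safe_to_commit` and the `allowed` path set are hypothetical and not part of the repository.

```python
import re


def is_safe_to_commit(status_text, allowed):
    """Return (ok, reasons) for `git status --short --branch` output.

    `allowed` is the set of paths that belong to the intended commit
    (a hypothetical input for this sketch).
    """
    lines = status_text.splitlines()
    reasons = []
    header = lines[0] if lines and lines[0].startswith("## ") else ""
    # The branch header reports "[behind N]" or "[ahead M, behind N]".
    if re.search(r"\bbehind \d+", header):
        reasons.append("branch is behind its upstream")
    for line in lines:
        if line.startswith("## ") or len(line) < 4:
            continue
        index_state, path = line[0], line[3:]
        # A first column other than blank, "?" or "!" means the path is staged.
        if index_state not in (" ", "?", "!") and path not in allowed:
            reasons.append(f"unrelated staged file: {path}")
    return (not reasons, reasons)
```

The sketch only reports reasons; deciding whether to pull, unstage, or stop is left to the person or agent running the checklist.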

Test Matrix

Run the smallest useful subset first, then broaden.

For event-engine logic:

bash
rtk cargo test -p hone-event-engine --lib

For config schema/default changes:

bash
rtk cargo test -p hone-core --lib

For web-api assembly changes, such as sink/classifier model wiring:

bash
rtk cargo check -p hone-web-api

Always finish Rust work with:

bash
rtk cargo fmt --all -- --check

If formatting fails, run:

bash
rtk cargo fmt --all
rtk cargo fmt --all -- --check
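As a rough illustration of "smallest useful subset first", the matrix can be read as a mapping from changed paths to commands. The sketch below assumes that matching on crate path prefixes is a fair stand-in for "event-engine logic", "config schema/default changes", and "web-api assembly changes"; the helper name `verification_commands` is hypothetical.

```python
def verification_commands(changed_paths):
    """Map changed paths to the checks listed in the test matrix."""
    commands = []
    if any(p.startswith("crates/hone-event-engine/") for p in changed_paths):
        commands.append("rtk cargo test -p hone-event-engine --lib")
    if any(p.startswith("crates/hone-core/") for p in changed_paths):
        commands.append("rtk cargo test -p hone-core --lib")
    if any(p.startswith("crates/hone-web-api/") for p in changed_paths):
        commands.append("rtk cargo check -p hone-web-api")
    if any(p.endswith(".rs") for p in changed_paths):
        # Rust work always finishes with a formatting check.
        commands.append("rtk cargo fmt --all -- --check")
    return commands
```

A fixture-only change returns an empty list here, which is a limit of the sketch: baseline fixture edits are covered by the manual script and the CI-safe drift test rather than by this mapping.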

Existing News Baseline

The current reusable baseline is:

  • tests/fixtures/event_engine/news_classifier_baseline_2026-04-23.json
  • tests/regression/manual/test_event_engine_news_classifier_baseline.sh

The fixture stores real FMP titles/sites/symbols and expected post-tuning route decisions. It intentionally does not store full FMP article bodies. Some entries therefore have both:

  • expected_llm_after_engine: original live run result with FMP text in prompt.
  • expected_llm_title_only_after_engine: expected result for the saved title-only rerun script.
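To make the relationship between the two fields concrete, here is a minimal sketch of how a title-only rerun might choose its expectation. It assumes each fixture entry is a JSON object carrying these two keys directly, and that falling back to the original live-run expectation is acceptable when no title-only value is recorded; the real script may behave differently.

```python
def expected_title_only_answer(entry):
    """Pick the expectation to compare against in a title-only rerun."""
    # Prefer the title-only expectation when the entry records one.
    if "expected_llm_title_only_after_engine" in entry:
        return entry["expected_llm_title_only_after_engine"]
    # Assumption: otherwise the original live-run expectation still applies.
    return entry.get("expected_llm_after_engine")
```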

The CI-safe source/kind drift test is in:

  • crates/hone-event-engine/src/pollers/news.rs
  • test: live_news_classifier_baseline_source_policy_is_stable

Run it directly when touching source classification or transcript splitting:

bash
rtk cargo test -p hone-event-engine pollers::news::tests::live_news_classifier_baseline_source_policy_is_stable --lib

Rerun Baseline

Offline check, no network/API cost:

bash
rtk bash tests/regression/manual/test_event_engine_news_classifier_baseline.sh

Live model drift check against the saved title-only samples:

bash
rtk env RUN_EVENT_ENGINE_LLM_BASELINE=1 bash tests/regression/manual/test_event_engine_news_classifier_baseline.sh

To explicitly check the current recommended model:

bash
rtk env RUN_EVENT_ENGINE_LLM_BASELINE=1 EVENT_ENGINE_NEWS_CLASSIFIER_MODEL=amazon/nova-lite-v1 bash tests/regression/manual/test_event_engine_news_classifier_baseline.sh

To compare another model:

bash
rtk env RUN_EVENT_ENGINE_LLM_BASELINE=1 EVENT_ENGINE_NEWS_CLASSIFIER_MODEL=x-ai/grok-4.1-fast bash tests/regression/manual/test_event_engine_news_classifier_baseline.sh

To collect a non-blocking drift report:

bash
rtk env RUN_EVENT_ENGINE_LLM_BASELINE=1 ALLOW_EVENT_ENGINE_LLM_BASELINE_DRIFT=1 bash tests/regression/manual/test_event_engine_news_classifier_baseline.sh
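The five invocations above differ only in three environment variables. The sketch below summarizes how they appear to combine, inferred from the command descriptions alone. How the script actually parses these variables is not shown here, so treat the `"1"` comparison and the helper name `baseline_mode` as assumptions.

```python
def baseline_mode(env):
    """Describe how the manual script is expected to behave for `env`."""
    live = env.get("RUN_EVENT_ENGINE_LLM_BASELINE") == "1"
    allow_drift = env.get("ALLOW_EVENT_ENGINE_LLM_BASELINE_DRIFT") == "1"
    return {
        # Without the RUN flag the check stays offline: no network, no API cost.
        "live": live,
        # None means "whatever model the script defaults to".
        "model": env.get("EVENT_ENGINE_NEWS_CLASSIFIER_MODEL") if live else None,
        # Drift fails a live run unless it is explicitly allowed.
        "drift_blocks": live and not allow_drift,
    }
```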

Interpretation:

  • Any drift on noisy/title-only samples is a prompt/model review signal, not automatically a bug.
  • Drift that promotes opinion/list/preview/social noise should usually be fixed in source classification, router gating, or prompt rules.
  • Drift that demotes a concrete hard event should usually be fixed in prompt rules or deterministic fallback.
  • Do not update the expected baseline just to make a new model pass. First decide whether the new answer is better.
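The interpretation rules can be sketched as a small triage helper. This assumes classifier answers are recorded as the strings "yes" and "no", which the fixture's yes/no summary counts suggest but do not confirm; the function name `triage_drift` is hypothetical.

```python
def triage_drift(expected, actual):
    """Suggest where to look first when a live answer differs from baseline."""
    if expected == actual:
        return "no drift"
    if expected == "no" and actual == "yes":
        # Noise was promoted above its baseline decision.
        return "review source classification, router gating, or prompt rules"
    if expected == "yes" and actual == "no":
        # A concrete hard event was demoted.
        return "review prompt rules or deterministic fallback"
    return "manual review"
```

The helper deliberately never suggests editing the expected baseline: that decision comes only after judging whether the new answer is actually better.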

Daily Push Calibration Export

Daily Telegram calibration should start from the actual delivery evidence already stored in data/events.sqlite3. Use the read-only exporter to produce an ignored local JSON/Markdown snapshot for one actor and one local day:

bash
rtk python3 scripts/diagnose_event_engine_daily_pushes.py --date 2026-04-23 --actor telegram::::8039067465

The default output directory is data/exports/event-engine-calibration/, which is ignored by git. The JSON report includes blank calibration_label and calibration_note fields. The Markdown report is for quick human review.

Suggested labels:

  • useful
  • noise
  • should_immediate
  • should_digest
  • should_filter
  • baseline_candidate

When the user marks a stable reusable case, copy only the durable public fields into tests/fixtures/event_engine/news_classifier_baseline_2026-04-23.json or a newer fixture. Do not commit daily exports, private runtime DB files, full copyrighted article bodies, or one-off labels that only explain a single noisy day.
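Once a reviewer has filled in calibration_label, picking out baseline candidates can be sketched as below. The shape of the exported JSON is not documented here, so this assumes the report items are available as a list of objects that each carry a calibration_label string; `review_labels` is a hypothetical helper name.

```python
SUGGESTED_LABELS = {
    "useful",
    "noise",
    "should_immediate",
    "should_digest",
    "should_filter",
    "baseline_candidate",
}


def review_labels(report_items):
    """Return (baseline candidates, labels outside the suggested set)."""
    picked, unknown = [], []
    for item in report_items:
        label = item.get("calibration_label", "")
        if not label:
            continue  # Still blank: not reviewed yet.
        if label not in SUGGESTED_LABELS:
            unknown.append(label)
        elif label == "baseline_candidate":
            picked.append(item)
    return picked, unknown
```

Unknown labels are reported rather than rejected, since the label list is only a suggestion.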

Add New Baseline Samples

Add a new baseline when live logs or user feedback exposes a reusable decision case, such as:

  • a noisy source that should stay Low/digest;
  • a concrete hard event that should reach Medium digest or High immediate;
  • a transcript/earnings item that should be independently controllable;
  • a route timing issue that should not regress after future prompt/model changes.

Workflow:

  1. Pull or extract the real sample, but do not store API keys, private actor data beyond the stable actor key, or full copyrighted article bodies.
  2. Save only durable fields needed for regression:
    • symbol
    • site
    • title
    • expected source class
    • expected event kind
    • old/new model answers when known
    • expected route after engine
  3. If a result depends on article body that is not stored, add expected_llm_title_only_after_engine for the manual script.
  4. Update summary counts in the fixture (items, LLM item count, yes/no counts).
  5. Extend or adjust the CI-safe unit test if the fixture schema changes.
  6. Run:
bash
rtk python3 -m json.tool tests/fixtures/event_engine/news_classifier_baseline_2026-04-23.json >/dev/null
rtk bash tests/regression/manual/test_event_engine_news_classifier_baseline.sh
rtk cargo test -p hone-event-engine pollers::news::tests::live_news_classifier_baseline_source_policy_is_stable --lib
  7. If the user explicitly asked for live model validation, also run the live script and report cost/latency/drift.
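Steps 2 and 4 of the workflow can be sketched as follows. The key names in `DURABLE_FIELDS` and the summary keys are assumptions modeled on the field list above, not the fixture's confirmed schema, and the "yes"/"no" answer values are likewise assumed.

```python
# Hypothetical key names for the durable fields listed in step 2.
DURABLE_FIELDS = (
    "symbol",
    "site",
    "title",
    "expected_source_class",
    "expected_event_kind",
    "expected_route_after_engine",
    "expected_llm_after_engine",
    "expected_llm_title_only_after_engine",
)


def durable_entry(raw):
    """Keep only durable fields; article bodies, keys, and the like are dropped."""
    return {k: raw[k] for k in DURABLE_FIELDS if k in raw}


def summary_counts(entries):
    """Recompute item, LLM item, and yes/no counts from the entries."""
    answers = [
        e["expected_llm_after_engine"]
        for e in entries
        if "expected_llm_after_engine" in e
    ]
    return {
        "items": len(entries),
        "llm_items": len(answers),
        "yes": answers.count("yes"),
        "no": answers.count("no"),
    }
```

Recomputing the counts from the entries, rather than editing them by hand, keeps the fixture summary from drifting when samples are added.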

Preserve Semantics

Keep these invariants unless the user explicitly changes product policy:

  • Source classification and deterministic router behavior must be CI-safe and mockable.
  • Live LLM calls are manual validation only, never default CI gates.
  • earnings_call_transcript is a standalone controllable kind, not generic news_critical.
  • Legal ads, shareholder alerts, listicles, valuation commentary, and earnings previews should not become immediate alerts by LLM enthusiasm.
  • High/immediate routing must be explainable through kind, severity, actor preference, cap/cooldown, and sink result.

Closeout

Before final response:

  • Re-run the relevant test subset and report exact pass/fail results.
  • Update the active event-engine plan or automation memory if this work is part of the ongoing push-quality track.
  • Mention whether live LLM baseline was run, skipped, or only run offline.
