playwright-testing — community playwright-testing, phaserjs-oakwoods, chongdashu, community, ai agent skill, ide skills, agent automation, AI agent skills, Claude Code, Cursor, Windsurf

v1.0.0
GitHub

About this Skill

Ideal for Frontend Testing Agents seeking to maximize signal and minimize test failures through observable and deterministic testing Plan, implement, and debug frontend tests: unit/integration/E2E/visual/a11y. Use for Playwright/Cypress/Vitest/Jest/RTL, flaky test triage, CI stabilization, and canvas/WebGL games (Phaser) needing de

chongdashu chongdashu
[0]
[0]
Updated: 3/12/2026

Quality Score

Top 5%
45
Excellent
Based on code quality & docs
Installation
SYS Universal Install (Auto-Detect)
> npx killer-skills add chongdashu/phaserjs-oakwoods/playwright-testing
Supports 19+ Platforms
Cursor
Windsurf
VS Code
Trae
Claude
OpenClaw
+12 more

Agent Capability Analysis

The playwright-testing skill by chongdashu is an open-source community AI agent skill for Claude Code and other IDE workflows, helping agents execute tasks with better context, repeatability, and domain-specific guidance.

Ideal Agent Persona

Ideal for Frontend Testing Agents seeking to maximize signal and minimize test failures through observable and deterministic testing

Core Value

Empowers agents to eliminate nondeterminism and make apps observable, ensuring reliable confidence in frontend testing through protocols like playwright-testing, enabling safe refactors and actionable failure reports

Capabilities Granted for playwright-testing

Automating frontend tests to minimize failures
Debugging nondeterministic test issues
Optimizing test layers for maximum signal

! Prerequisites & Limits

  • Requires playwright-testing framework integration
  • Frontend testing scope only
Project
SKILL.md
7.8 KB
.cursorrules
1.2 KB
package.json
240 B
Ready
UTF-8

# Tags

[No tags]
SKILL.md
Readonly

Frontend Testing

Unlock reliable confidence fast: enable safe refactors by choosing the right test layer, making the app observable, and eliminating nondeterminism so failures are actionable.

Philosophy: Confidence Per Minute

Frontend tests fail for two reasons: the product is broken, or the test is lying. Your job is to maximize signal and minimize “test is lying”.

Before writing a test, ask:

  • What user risk am I covering (money, progression, auth, data loss, “can’t start” crashes)?
  • What’s the narrowest layer that catches this bug class (pure logic vs UI vs full browser)?
  • What nondeterminism exists (time, RNG, async loading, network, animations, fonts, GPU)?
  • What “ready” signal can I wait on besides setTimeout?
  • What should a failure print/screenshot so it’s diagnosable in CI?

Core principles:

  1. Test the contract, not the implementation: assert stable user-meaningful outcomes and public seams.
  2. Prefer determinism over retries: make time/RNG/network controllable; remove flake at the source.
  3. Observe like a debugger: console errors, network failures, screenshots, and state dumps on failure.
  4. One critical flow first: a reliable smoke test beats 50 flaky tests.

Workflow Decision Tree

Pick the test type by the cheapest layer that provides the needed confidence:

  • Unit tests (fastest): pure functions, reducers, validators, math, pathfinding, deterministic simulation steps.
  • Component/integration tests (medium): UI behavior with mocked IO (React Testing Library / Vue Testing Library / Testing Library DOM).
  • E2E tests (slowest, highest confidence): critical user flows across routing, storage, real bundling/runtime.
  • Visual regression (specialized): layout/pixel regressions; for canvas/WebGL, only after locking determinism.
  • A11y checks: great for DOM UIs; limited value for pure canvas unless you expose accessible DOM overlays.

Quick Start (Any Project)

  1. Define 1 smoke flow: “page loads → user can start → one key action works”.
  2. Choose runner:
    • Prefer Playwright for browser E2E + screenshots.
    • Prefer Testing Library for DOM component behavior.
    • Prefer unit tests for logic you can run without a browser.
  3. Add a “ready” signal in the app (DOM marker, window flag, or game event) and wait on that.
  4. Fail loudly: treat console errors and failed requests as test failures.
  5. Stabilize: seed RNG, freeze time, fix viewport/DPR, disable animations, and remove network variability.

Playwright Patterns (Especially Useful For Games)

Use Playwright when you need “real browser” confidence:

  • Drive input via mouse/keyboard/touch; treat the canvas like the user does.
  • Add a test seam: expose a small, stable test API on window (read-only state + a few commands).
  • Prefer waitForFunction-style readiness over sleep; gate on “scene ready” / “assets loaded” / “first frame rendered”.
  • For screenshots: lock viewport, device scale factor, fonts, and animation timing.
  • For 9-slice / canvas UI regressions: add a dedicated UI harness scene/page and assert via targeted screenshots (see references/phaser-canvas-testing.md).

If using the Playwright MCP tools (browser automation inside Codex), follow the same mindset:

  • Use browser_console_messages and browser_network_requests to catch silent failures.
  • Use browser_evaluate to assert window.__TEST__ state and to set up deterministic mode.
  • Use browser_take_screenshot for visual assertions after determinism is enforced.

Reconnaissance-Then-Action (Borrowed From Real Debugging)

When a UI is dynamic, don’t guess selectors—recon first, then act:

Quick decision guide:

Task → Is it static HTML (no JS runtime needed)?
  ├─ Yes → read the HTML to find stable selectors/content, then automate
  └─ No  → treat as dynamic: run the app, wait for readiness, then inspect rendered state
  1. Navigate and wait for readiness:
    • For many webapps: wait for a meaningful “loaded” element (preferred).
    • networkidle can help for SPAs, but avoid it if the app uses websockets/polling.
  2. Capture evidence (what the user actually sees):
    • screenshot (full page for DOM; targeted for canvas)
    • console errors + failed requests
  3. Discover selectors from the rendered state:
    • prefer role/text/label selectors over brittle CSS
  4. Execute actions using discovered selectors and re-check state.

Common pitfall: ❌ Inspect/interact before the app is ready. ✅ Wait on an explicit ready signal (DOM marker or window.__TEST__.ready), not a sleep.

Server Lifecycle Helper (Playwright E2E)

When the dev server isn’t already running, use the bundled helper as a black box:

  • Run python scripts/with_server.py --help first.
  • Start one (or multiple) servers, wait for their ports, then run your test command.

Example:

bash
1python scripts/with_server.py --server "npm run dev" --port 5173 -- npm test

Flake Reduction Checklist

  • Replace sleeps with explicit readiness conditions.
  • Control time (Date.now, timers), RNG, and animation loops.
  • Make network deterministic (mock, record/replay, or run against a seeded local backend).
  • Eliminate “first-run” differences (asset caches, fonts) or warm them explicitly.
  • Lock environment: viewport, DPR, locale/timezone, and rendering settings.

Anti-Patterns to Avoid

Testing the wrong layer: E2E tests for pure logic. Better: unit tests for logic; reserve E2E for integration contracts.

Testing implementation details: asserting DOM structure/classnames or internal engine objects. Better: assert user-meaningful outputs (text, navigation, score/HP changes) or a small stable test seam.

Sleep-driven tests: wait 2s then click. Better: wait on explicit readiness (DOM marker, event, window flag).

Uncontrolled randomness: RNG/time-based behaviors in assertions. Better: seed RNG, freeze time, and assert stable invariants.

Pixel snapshots without determinism (especially canvas/WebGL). Better: add deterministic mode first; then screenshot selectively.

Snapshot explosion: hundreds of snapshots that no one can interpret. Better: keep snapshots targeted (critical screens); prefer specific assertions for behavior.

Retries as a strategy: “just bump retries in CI”. Better: fix readiness and determinism; use retries only as temporary guardrails.

Variation Guidance (Prevent One-Size-Fits-All)

Vary the approach based on:

  • UI type: DOM app vs canvas/WebGL game vs hybrid.
  • Risk: core revenue/progression flows get E2E first; edge UI polish gets component tests.
  • CI constraints: headless-only, limited GPU, slow CPUs, no audio devices.
  • Test seam availability: if you can add a stable window.__TEST__ API, assert state; if not, stick to black-box input/output.

Remember

You can make almost any frontend (including canvas/WebGL games) testable by adding a tiny, stable seam for readiness + state. This skill is meant to empower creative, high-signal testing rather than cargo-cult checklists. Aim for tests that are boring to maintain: deterministic, explicit about readiness, and rich in failure evidence. One reliable smoke test is the foundation; everything else compounds from there.

Bundled Resources

Read these only when needed:

  • references/playwright-mcp-cheatsheet.md: patterns for using Playwright MCP tools for assertions, waiting, and diagnostics.
  • references/phaser-canvas-testing.md: deterministic mode + hooks for Phaser/canvas/WebGL games.
  • references/flake-reduction.md: deeper flake triage and stabilization tactics.

Use these scripts as black boxes (run --help first; don’t read source unless you must):

  • scripts/with_server.py: start/wait/stop one or more dev servers around a test command.
  • scripts/imgdiff.py: lightweight screenshot diff helper (requires pip install pillow).

FAQ & Installation Steps

These questions and steps mirror the structured data on this page for better search understanding.

? Frequently Asked Questions

What is playwright-testing?

Ideal for Frontend Testing Agents seeking to maximize signal and minimize test failures through observable and deterministic testing Plan, implement, and debug frontend tests: unit/integration/E2E/visual/a11y. Use for Playwright/Cypress/Vitest/Jest/RTL, flaky test triage, CI stabilization, and canvas/WebGL games (Phaser) needing de

How do I install playwright-testing?

Run the command: npx killer-skills add chongdashu/phaserjs-oakwoods/playwright-testing. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for playwright-testing?

Key use cases include: Automating frontend tests to minimize failures, Debugging nondeterministic test issues, Optimizing test layers for maximum signal.

Which IDEs are compatible with playwright-testing?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for playwright-testing?

Requires playwright-testing framework integration. Frontend testing scope only.

How To Install

  1. 1. Open your terminal

    Open the terminal or command line in your project directory.

  2. 2. Run the install command

    Run: npx killer-skills add chongdashu/phaserjs-oakwoods/playwright-testing. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. 3. Start using the skill

    The skill is now active. Your AI agent can use playwright-testing immediately in the current project.

Related Skills

Looking for an alternative to playwright-testing or another community skill for your workflow? Explore these related open-source skills.

View All

widget-generator

Logo of f
f

Generate customizable widget plugins for the prompts.chat feed system

149.6k
0
Design

linear

Logo of lobehub
lobehub

Linear issue management. MUST USE when: (1) user mentions LOBE-xxx issue IDs (e.g. LOBE-4540), (2) user says linear, linear issue, link linear, (3) creating PRs that reference Linear issues. Provides

73.4k
0
Communication

testing

Logo of lobehub
lobehub

Testing guide using Vitest. Use when writing tests (.test.ts, .test.tsx), fixing failing tests, improving test coverage, or debugging test issues. Triggers on test creation, test debugging, mock setup

73.3k
0
Communication

zustand

Logo of lobehub
lobehub

Zustand state management guide. Use when working with store code (src/store/**), implementing actions, managing state, or creating slices. Triggers on Zustand store development, state management questions, or action implementation.

72.8k
0
Communication