Killer-Skills

agent-device (Community)

v1.0.0

About this Skill

A CLI that lets AI agents control iOS and Android devices, aimed at mobile automation agents.

callstackincubator
Updated: 3/4/2026

Installation
> npx killer-skills add callstackincubator/agent-device

Agent Capability Analysis

The agent-device MCP Server by callstackincubator is an open-source community integration for Claude and other AI agents that lets them drive iOS and Android devices from the command line.

Ideal Agent Persona

Perfect for Mobile Automation Agents needing CLI control over iOS and Android devices.

Core Value

Lets agents automate mobile interactions from the CLI, using snapshot refs for exploration and selectors for deterministic replay.

Capabilities Granted for agent-device MCP Server

Automating mobile app testing
Debugging crash flows on iOS and Android
Replaying maintenance flows for QA bug hunts

Prerequisites & Limits

  • Requires CLI access
  • Limited to iOS and Android devices
  • Needs snapshot refs or selectors for operation

SKILL.md

Mobile Automation with agent-device

For exploration, use snapshot refs. For deterministic replay, use selectors. For structured exploratory QA bug hunts and reporting, use ../dogfood/SKILL.md.

Start Here (Read This First)

Use this skill as a router, not a full manual.

  1. Pick one mode:
    • Normal interaction flow
    • Debug/crash flow
    • Replay maintenance flow
  2. Run one canonical flow below.
  3. Open references only if blocked.

Decision Map

  • No target context yet: devices -> pick target -> open.
  • Normal UI task: open -> snapshot -i -> press/fill -> diff snapshot -i -> close
  • Debug/crash: open <app> -> logs clear --restart -> reproduce -> network dump -> logs path -> targeted grep
  • Replay drift: replay -u <path> -> verify updated selectors
  • Remote multi-tenant run: allocate lease -> run commands with tenant isolation flags -> heartbeat/release lease
  • Device-scope isolation run: set iOS simulator set / Android allowlist -> run selectors within scope only

Canonical Flows

1) Normal Interaction Flow

bash
agent-device open Settings --platform ios
agent-device snapshot -i
agent-device press @e3
agent-device diff snapshot -i
agent-device fill @e5 "test"
agent-device close

2) Debug/Crash Flow

bash
agent-device open MyApp --platform ios
agent-device logs clear --restart
# Reproduce the issue in the app here, then collect evidence.
agent-device network dump 25
agent-device logs path

Logging is off by default. Enable only for debugging windows. logs clear --restart requires an active app session (open <app> first).
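The decision map ends the debug flow with a targeted grep over the session log. The sketch below covers only that last step. It assumes that `agent-device logs path` prints the log file path on stdout, which this document does not confirm; the helper name `grep_session_log` and its default pattern are hypothetical.

```shell
# Hypothetical helper: show the most recent log lines matching a pattern.
# Usage: grep_session_log <log-file> [extended-regex] [max-lines]
grep_session_log() {
  log_file=$1
  pattern=${2:-'error|exception|fatal'}
  max_lines=${3:-20}
  if [ ! -r "$log_file" ]; then
    echo "log file not readable: $log_file" >&2
    return 1
  fi
  # -i: case-insensitive, -n: prefix line numbers, -E: extended regex
  grep -inE "$pattern" "$log_file" | tail -n "$max_lines"
}

# Assumed wiring (only valid if `logs path` prints a bare path):
#   grep_session_log "$(agent-device logs path)" 'crash|fatal'
```

Capping the output with `tail` keeps the result small enough to hand back to an agent without flooding its context.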

3) Replay Maintenance Flow

bash
agent-device replay -u ./session.ad

4) Remote Tenant Lease Flow (HTTP JSON-RPC)

bash
# Allocate lease
curl -sS http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc \
  -H "content-type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"jsonrpc":"2.0","id":"alloc-1","method":"agent_device.lease.allocate","params":{"runId":"run-123","tenantId":"acme","ttlMs":60000}}'

# Use lease in tenant-isolated command execution
agent-device --daemon-transport http \
  --tenant acme \
  --session-isolation tenant \
  --run-id run-123 \
  --lease-id <lease-id> \
  session list --json

# Heartbeat and release
curl -sS http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc \
  -H "content-type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"jsonrpc":"2.0","id":"hb-1","method":"agent_device.lease.heartbeat","params":{"leaseId":"<lease-id>","ttlMs":60000}}'
curl -sS http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc \
  -H "content-type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"jsonrpc":"2.0","id":"rel-1","method":"agent_device.lease.release","params":{"leaseId":"<lease-id>"}}'
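The three lease calls differ only in request id, method suffix, and params, so the request body can be built by one small function. This is a sketch, not part of agent-device: the helper names `lease_rpc_body` and `lease_rpc_post` and the `AGENT_DEVICE_TOKEN` variable are hypothetical, while the endpoint, headers, and method names are the ones shown in the flow above.

```shell
# Hypothetical helper: print a JSON-RPC 2.0 request body for a lease method.
# Usage: lease_rpc_body <request-id> <allocate|heartbeat|release> <params-json>
# Note: values are inserted verbatim; no JSON escaping is performed.
lease_rpc_body() {
  printf '{"jsonrpc":"2.0","id":"%s","method":"agent_device.lease.%s","params":%s}\n' \
    "$1" "$2" "$3"
}

# Hypothetical helper: POST a body to the daemon RPC endpoint.
# AGENT_DEVICE_TOKEN is an assumed variable name for the bearer token.
lease_rpc_post() {
  curl -sS "http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc" \
    -H "content-type: application/json" \
    -H "Authorization: Bearer ${AGENT_DEVICE_TOKEN}" \
    -d "$1"
}

# Example (requires a running daemon):
#   lease_rpc_post "$(lease_rpc_body rel-1 release '{"leaseId":"<lease-id>"}')"
```

Keeping body construction separate from the HTTP call makes the payload easy to inspect or log before anything is sent.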

Command Skeleton (Minimal)

Session and navigation

bash
agent-device devices
agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
agent-device ensure-simulator --device "iPhone 16" --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device ensure-simulator --device "iPhone 16" --runtime com.apple.CoreSimulator.SimRuntime.iOS-18-4 --ios-simulator-device-set /tmp/tenant-a/simulators --boot
agent-device open [app|url] [url]
agent-device open [app] --relaunch
agent-device close [app]
agent-device install <app> <path-to-binary>
agent-device reinstall <app> <path-to-binary>
agent-device session list

Use boot only as fallback when open cannot find/connect to a ready target. For Android emulators by AVD name, use boot --platform android --device <avd-name>. For Android emulators without GUI, add --headless. Use --target mobile|tv with --platform (required) to pick phone/tablet vs TV targets (AndroidTV/tvOS).

Isolation scoping quick reference:

  • --ios-simulator-device-set <path> scopes iOS simulator discovery + command execution to one simulator set.
  • --android-device-allowlist <serials> scopes Android discovery/selection to comma/space separated serials.
  • Scope is applied before selectors (--device, --udid, --serial); out-of-scope selectors fail with DEVICE_NOT_FOUND.
  • With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
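Because out-of-scope selectors fail with DEVICE_NOT_FOUND, a caller may want to check a serial against the allowlist before passing `--serial`. The function below is a minimal preflight sketch of the documented comma/space-separated matching rule. It is not the tool's own implementation, and the name `serial_in_allowlist` is hypothetical.

```shell
# Hypothetical helper: succeed if <serial> appears in a comma- or
# space-separated allowlist, mirroring the documented scoping rule.
# Usage: serial_in_allowlist <serial> <allowlist>
serial_in_allowlist() {
  serial=$1
  allowlist=$2
  # Normalize commas to spaces, then compare whole entries one by one.
  for entry in $(printf '%s' "$allowlist" | tr ',' ' '); do
    if [ "$entry" = "$serial" ]; then
      return 0
    fi
  done
  return 1
}

# Example:
#   serial_in_allowlist emulator-5554 "$AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST" \
#     || echo "serial is outside the allowlist" >&2
```

Entries are compared as whole strings, so a prefix such as `emulator` does not match `emulator-5554`.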

Simulator provisioning quick reference:

  • Use ensure-simulator to create or reuse a named iOS simulator inside a device set before starting a session.
  • --device <name> is required (e.g. "iPhone 16 Pro"). --runtime <id> pins the runtime; omit to use the newest compatible one.
  • --boot boots it immediately. Returns udid, device, runtime, ios_simulator_device_set, created, booted.
  • Idempotent: safe to call repeatedly; reuses an existing matching simulator by default.

TV quick reference:

  • AndroidTV: open/apps use TV launcher discovery automatically.
  • TV target selection works on emulators/simulators and connected physical devices (AndroidTV + AppleTV).
  • tvOS: runner-driven interactions and snapshots are supported (snapshot, wait, press, fill, get, scroll, back, home, app-switcher, record and related selector flows).
  • tvOS back/home/app-switcher map to Siri Remote actions (menu, home, double-home) in the runner.
  • tvOS follows iOS simulator-only command semantics for helpers like pinch, settings, and push.

Snapshot and targeting

bash
agent-device snapshot -i
agent-device diff snapshot -i
agent-device find "Sign In" click
agent-device press @e1
agent-device fill @e2 "text"
agent-device is visible 'id="anchor"'

press is the canonical tap command; click is an alias.

Utilities

bash
agent-device appstate
agent-device clipboard read
agent-device clipboard write "token"
agent-device keyboard status
agent-device keyboard dismiss
agent-device perf --json
agent-device network dump [limit] [summary|headers|body|all]
agent-device push <bundle|package> <payload.json|inline-json>
agent-device trigger-app-event screenshot_taken '{"source":"qa"}'
agent-device get text @e1
agent-device screenshot out.png
agent-device settings permission grant notifications
agent-device settings permission reset camera
agent-device trace start
agent-device trace stop ./trace.log

Batch (when sequence is already known)

bash
agent-device batch --steps-file /tmp/batch-steps.json --json

Performance Check

  • Use agent-device perf --json (or metrics --json) after open.
  • For detailed metric semantics, caveats, and interpretation guidance, see references/perf-metrics.md.

Guardrails (High Value Only)

  • Re-snapshot after UI mutations (navigation/modal/list changes).
  • Prefer snapshot -i; scope/depth only when needed.
  • Use refs for discovery, selectors for replay/assertions.
  • find "<query>" click --json returns { ref, locator, query, x, y } — all derived from the matched snapshot node. Do not rely on these fields from raw press/click responses for observability; use find instead.
  • Use fill for clear-then-type semantics; use type for focused append typing.
  • Use install for in-place app upgrades (keep app data when platform permits), and reinstall for deterministic fresh-state runs.
  • App binary format support for install/reinstall: Android .apk/.aab, iOS .app/.ipa.
  • Android .aab requires bundletool in PATH, or AGENT_DEVICE_BUNDLETOOL_JAR=<path-to-bundletool-all.jar> with java in PATH.
  • Android .aab optional: set AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE=<mode> to control bundletool build-apks --mode (default: universal).
  • iOS .ipa: extract/install from Payload/*.app; when multiple app bundles are present, <app> is used as a bundle id/name hint.
  • iOS appstate is session-scoped; Android appstate is live foreground state. iOS responses include device_udid and ios_simulator_device_set for isolation verification.
  • iOS open responses include device_udid and ios_simulator_device_set to confirm which simulator handled the session.
  • Clipboard helpers: clipboard read / clipboard write <text> are supported on Android and iOS simulators; iOS physical devices are not supported yet.
  • Android keyboard helpers: keyboard status|get|dismiss report keyboard visibility/type and dismiss via keyevent when visible.
  • network dump is best-effort and parses HTTP(s) entries from the session app log file.
  • Biometric settings: iOS simulator supports settings faceid|touchid <match|nonmatch|enroll|unenroll>; Android supports settings fingerprint <match|nonmatch> where runtime tooling is available.
  • For AndroidTV/tvOS selection, always pair --target with --platform (ios, android, or apple alias); target-only selection is invalid.
  • push simulates notification delivery:
    • iOS simulator uses APNs-style payload JSON.
    • Android uses broadcast action + typed extras (string/boolean/number).
  • trigger-app-event requires app-defined deep-link hooks and URL template configuration (AGENT_DEVICE_APP_EVENT_URL_TEMPLATE or platform-specific variants).
  • trigger-app-event requires an active session or explicit selectors (--platform, --device, --udid, --serial); on iOS physical devices, custom-scheme triggers require active app context.
  • Canonical trigger behavior and caveats are documented in website/docs/docs/commands.md under App event triggers.
  • Permission settings are app-scoped and require an active session app: settings permission <grant|deny|reset> <camera|microphone|photos|contacts|notifications> [full|limited]
  • iOS simulator permission alerts: use alert wait, then alert accept/dismiss; accept/dismiss retry internally for up to 2 s, so you do not need manual sleeps. See references/permissions.md.
  • full|limited mode applies only to iOS photos; other targets reject mode.
  • On Android, non-ASCII fill/type may require an ADB keyboard IME on some system images; only install IME APKs from trusted sources and verify checksum/signature.
  • If using --save-script, prefer explicit path syntax (--save-script=flow.ad or ./flow.ad).
  • For tenant-isolated remote runs, always pass --tenant, --session-isolation tenant, --run-id, and --lease-id together.
  • Use short lease TTLs and heartbeat only while work is active; release leases immediately after run completion/failure.
  • Env equivalents for scoped runs: AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET (compat IOS_SIMULATOR_DEVICE_SET) and AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST (compat ANDROID_DEVICE_ALLOWLIST).
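The rule that `--tenant`, `--session-isolation tenant`, `--run-id`, and `--lease-id` always travel together can be enforced by a small wrapper. The sketch below refuses to emit a partial set; the function name `tenant_flags` is hypothetical and only the flags themselves come from this document.

```shell
# Hypothetical helper: print the four tenant-isolation flags, or fail if any
# value is missing, so a run never starts with a partial set.
# Usage: tenant_flags <tenant> <run-id> <lease-id>
tenant_flags() {
  tenant=${1:-}
  run_id=${2:-}
  lease_id=${3:-}
  if [ -z "$tenant" ] || [ -z "$run_id" ] || [ -z "$lease_id" ]; then
    echo "tenant, run id and lease id are all required" >&2
    return 1
  fi
  printf '%s\n' \
    "--tenant $tenant --session-isolation tenant --run-id $run_id --lease-id $lease_id"
}

# Example (values must not contain spaces, because the result is word-split):
#   flags=$(tenant_flags acme run-123 "$LEASE_ID") || exit 1
#   agent-device --daemon-transport http $flags session list --json
```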

Security and Trust Notes

  • Prefer a preinstalled agent-device binary over on-demand package execution.
  • If install is required, pin an exact version (for example: npx --yes agent-device@<exact-version> --help).
  • Signing/provisioning environment variables are optional, sensitive, and only for iOS physical-device setup.
  • Logs/artifacts are written under ~/.agent-device; replay scripts write to explicit paths you provide.
  • For remote daemon mode, prefer AGENT_DEVICE_DAEMON_SERVER_MODE=http|dual with AGENT_DEVICE_HTTP_AUTH_HOOK and tenant-scoped lease admission.
  • Keep logging off unless debugging and use least-privilege/isolated environments for autonomous runs.
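One way to honor the exact-version rule is to reject anything that looks like a range or a dist-tag before it reaches `npx`. The check below is a sketch under stated assumptions: the helper name `is_exact_version` and the `AGENT_DEVICE_VERSION` variable are hypothetical, and the pattern accepts only `MAJOR.MINOR.PATCH` with an optional prerelease suffix.

```shell
# Hypothetical helper: accept only a fully pinned version such as 1.2.3 or
# 1.2.3-beta.1; reject ranges and tags (^1.2.3, ~1.2, 1.x, latest).
# Usage: is_exact_version <version>
is_exact_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$'
}

# Example (AGENT_DEVICE_VERSION is an assumed variable you set yourself):
#   is_exact_version "$AGENT_DEVICE_VERSION" \
#     && npx --yes "agent-device@${AGENT_DEVICE_VERSION}" --help
```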

Common Mistakes

  • Mixing debug flow into normal runs (keep logs off unless debugging).
  • Continuing to use stale refs after screen transitions.
  • Using URL opens with Android --activity (unsupported combination).
  • Treating boot as default first step instead of fallback.
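The stale-ref mistake is easiest to avoid when every press is followed by a fresh diff. The wrapper below is a minimal sketch of that habit using only commands shown earlier; the function name `press_and_refresh` and the `AD_BIN` override are hypothetical.

```shell
# AD_BIN is a hypothetical override so the helper can be dry-run with `echo`.
AD_BIN=${AD_BIN:-agent-device}

# Hypothetical helper: press a ref, then immediately diff the snapshot so the
# next step works from fresh refs instead of stale ones.
# Usage: press_and_refresh <ref>
press_and_refresh() {
  "$AD_BIN" press "$1" || return 1
  "$AD_BIN" diff snapshot -i
}

# Example:
#   press_and_refresh @e3
```

Setting `AD_BIN=echo` prints the commands instead of running them, which is a cheap way to review a scripted sequence before touching a device.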
