What is agent-device?

agent-device is a CLI that lets AI agents control iOS and Android devices. It suits mobile automation agents that need command-line control over those devices.

How do I install agent-device?

Run the command: npx killer-skills add callstackincubator/agent-device. It works with Cursor, Windsurf, VS Code, Claude Code, and 15+ other IDEs.

What are the use cases for agent-device?

Key use cases include automating mobile app testing, debugging crash flows on iOS and Android, and replaying maintenance flows for QA bug hunts.

Which IDEs are compatible with agent-device?

This skill is compatible with Cursor, Windsurf, VS Code, Claude Code, GitHub Copilot, JetBrains, Cline, Roo Code, and many more. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for agent-device?

Requires CLI access. Limited to iOS and Android devices. Needs snapshot refs or selectors for operation.

Mobile Automation with agent-device

Name: agent-device
Author: callstackincubator

For exploration, use snapshot refs. For deterministic replay, use selectors. For structured exploratory QA bug hunts and reporting, use ../dogfood/SKILL.md.

Start Here (Read This First)

Use this skill as a router, not a full manual.

1. Pick one mode:
   - Normal interaction flow
   - Debug/crash flow
   - Replay maintenance flow
2. Run one canonical flow below.
3. Open references only if blocked.

Decision Map

- No target context yet: devices -> pick target -> open
- Normal UI task: open -> snapshot -i -> press/fill -> diff snapshot -i -> close
- Debug/crash: open <app> -> logs clear --restart -> reproduce -> network dump -> logs path -> targeted grep
- Replay drift: replay -u <path> -> verify updated selectors
- Remote multi-tenant run: allocate lease -> run commands with tenant isolation flags -> heartbeat/release lease
- Device-scope isolation run: set iOS simulator set / Android allowlist -> run selectors within scope only
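
The routing idea above can be sketched as a small lookup that prints the command sequence for a chosen mode. This is only an illustration: the function name `flow_for_mode` and the mode labels `normal`, `debug`, and `replay` are invented for this sketch and are not part of agent-device.

```bash
# Hypothetical helper: print the decision-map sequence for a mode.
# The mode labels are chosen for this sketch, not defined by agent-device.
flow_for_mode() {
  case "$1" in
    normal) echo "open -> snapshot -i -> press/fill -> diff snapshot -i -> close" ;;
    debug)  echo "open <app> -> logs clear --restart -> reproduce -> network dump -> logs path -> targeted grep" ;;
    replay) echo "replay -u <path> -> verify updated selectors" ;;
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

Calling `flow_for_mode debug` prints the debug/crash sequence; any unrecognized mode fails with a non-zero status, so a calling script can stop instead of guessing a flow.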

Canonical Flows

1) Normal Interaction Flow

```bash
agent-device open Settings --platform ios
agent-device snapshot -i
agent-device press @e3
agent-device diff snapshot -i
agent-device fill @e5 "test"
agent-device close
```

2) Debug/Crash Flow

```bash
agent-device open MyApp --platform ios
agent-device logs clear --restart
# Reproduce the issue in the app here, then collect evidence
agent-device network dump 25
agent-device logs path
```

Logging is off by default. Enable only for debugging windows. logs clear --restart requires an active app session (open <app> first).
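
The decision map ends the debug flow with a targeted grep, which the commands above do not show. A minimal sketch of that last step follows, assuming the session log is a plain text file. The helper name `grep_session_log` and the error pattern list are assumptions for illustration; adjust the pattern to the app's actual log format.

```bash
# Hypothetical helper: show the last N log lines that look like errors.
# The pattern list is an assumption; tune it to your app's log format.
grep_session_log() {
  log_file=$1
  max_lines=${2:-20}
  [ -r "$log_file" ] || { echo "cannot read log: $log_file" >&2; return 1; }
  # -i: case-insensitive, -n: prefix each match with its line number
  grep -inE 'error|exception|fatal|crash' "$log_file" | tail -n "$max_lines"
}
```

If `agent-device logs path` prints only the log file path on stdout (an assumption, not confirmed here), the helper could be called as `grep_session_log "$(agent-device logs path)" 50`.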

3) Replay Maintenance Flow

```bash
agent-device replay -u ./session.ad
```

4) Remote Tenant Lease Flow (HTTP JSON-RPC)

```bash
# Allocate lease
curl -sS http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc \
  -H "content-type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"jsonrpc":"2.0","id":"alloc-1","method":"agent_device.lease.allocate","params":{"runId":"run-123","tenantId":"acme","ttlMs":60000}}'

# Use lease in tenant-isolated command execution
agent-device --daemon-transport http \
  --tenant acme \
  --session-isolation tenant \
  --run-id run-123 \
  --lease-id <lease-id> \
  session list --json

# Heartbeat and release
curl -sS http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc \
  -H "content-type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"jsonrpc":"2.0","id":"hb-1","method":"agent_device.lease.heartbeat","params":{"leaseId":"<lease-id>","ttlMs":60000}}'
curl -sS http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc \
  -H "content-type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"jsonrpc":"2.0","id":"rel-1","method":"agent_device.lease.release","params":{"leaseId":"<lease-id>"}}'
```
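
The three curl calls above differ only in request id, method suffix, and params, so the repetition can be factored into two small functions. This is a sketch under stated assumptions: `lease_request`, `lease_rpc`, and the `AGENT_DEVICE_TOKEN` variable are names invented here, while the endpoint, headers, and method names are copied from the flow above.

```bash
# Hypothetical helper: build a JSON-RPC 2.0 request body for a lease method.
# Arguments: request id, method suffix (allocate|heartbeat|release), params as a JSON object.
# No escaping is performed, so pass only trusted, already-valid JSON as params.
lease_request() {
  printf '{"jsonrpc":"2.0","id":"%s","method":"agent_device.lease.%s","params":%s}' "$1" "$2" "$3"
}

# Hypothetical wrapper: POST a request body to the daemon's /rpc endpoint.
# AGENT_DEVICE_TOKEN is a variable name invented for this sketch.
lease_rpc() {
  curl -sS "http://127.0.0.1:${AGENT_DEVICE_DAEMON_HTTP_PORT}/rpc" \
    -H "content-type: application/json" \
    -H "Authorization: Bearer ${AGENT_DEVICE_TOKEN}" \
    -d "$1"
}
```

A release call would then read `lease_rpc "$(lease_request rel-1 release '{"leaseId":"<lease-id>"}')"`, with `<lease-id>` replaced by the value returned from the allocate call.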

Command Skeleton (Minimal)

Session and navigation

```bash
agent-device devices
agent-device devices --platform ios --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device devices --platform android --android-device-allowlist emulator-5554,device-1234
agent-device ensure-simulator --device "iPhone 16" --ios-simulator-device-set /tmp/tenant-a/simulators
agent-device ensure-simulator --device "iPhone 16" --runtime com.apple.CoreSimulator.SimRuntime.iOS-18-4 --ios-simulator-device-set /tmp/tenant-a/simulators --boot
agent-device open [app|url] [url]
agent-device open [app] --relaunch
agent-device close [app]
agent-device install <app> <path-to-binary>
agent-device reinstall <app> <path-to-binary>
agent-device session list
```

Use boot only as fallback when open cannot find/connect to a ready target. For Android emulators by AVD name, use boot --platform android --device <avd-name>. For Android emulators without GUI, add --headless. Use --target mobile|tv with --platform (required) to pick phone/tablet vs TV targets (AndroidTV/tvOS).

Isolation scoping quick reference:

- --ios-simulator-device-set <path> scopes iOS simulator discovery + command execution to one simulator set.
- --android-device-allowlist <serials> scopes Android discovery/selection to comma/space separated serials.
- Scope is applied before selectors (--device, --udid, --serial); out-of-scope selectors fail with DEVICE_NOT_FOUND.
- With iOS simulator-set scope enabled, iOS physical devices are not enumerated.
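
To make the "scope before selectors" rule concrete, here is an illustration of how an Android serial might be checked against an allowlist before use. It mimics the documented behavior only; it is not the tool's implementation, and the function name `serial_in_scope` is invented for this sketch. It assumes a non-empty allowlist.

```bash
# Illustration only: mimic the documented scoping rule for Android serials.
# Returns 0 when the serial is inside the allowlist; otherwise prints
# DEVICE_NOT_FOUND to stderr and returns 1.
# The allowlist may be comma separated, space separated, or a mix.
serial_in_scope() {
  serial=$1
  allowlist=$2
  # Normalize commas to spaces, then rely on word splitting to iterate entries.
  for entry in $(printf '%s' "$allowlist" | tr ',' ' '); do
    [ "$entry" = "$serial" ] && return 0
  done
  echo "DEVICE_NOT_FOUND" >&2
  return 1
}
```

A pre-flight check such as `serial_in_scope emulator-5554 "$AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST" || exit 1` lets a script fail early, before any agent-device command runs against an out-of-scope target.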

Simulator provisioning quick reference:

- Use ensure-simulator to create or reuse a named iOS simulator inside a device set before starting a session.
- --device <name> is required (e.g. "iPhone 16 Pro"). --runtime <id> pins the runtime; omit to use the newest compatible one.
- --boot boots it immediately. Returns udid, device, runtime, ios_simulator_device_set, created, booted.
- Idempotent: safe to call repeatedly; reuses an existing matching simulator by default.

TV quick reference:

- AndroidTV: open/apps use TV launcher discovery automatically.
- TV target selection works on emulators/simulators and connected physical devices (AndroidTV + AppleTV).
- tvOS: runner-driven interactions and snapshots are supported (snapshot, wait, press, fill, get, scroll, back, home, app-switcher, record and related selector flows).
- tvOS back/home/app-switcher map to Siri Remote actions (menu, home, double-home) in the runner.
- tvOS follows iOS simulator-only command semantics for helpers like pinch, settings, and push.

Snapshot and targeting

```bash
agent-device snapshot -i
agent-device diff snapshot -i
agent-device find "Sign In" click
agent-device press @e1
agent-device fill @e2 "text"
agent-device is visible 'id="anchor"'
```

press is the canonical tap command; click is an alias.

Utilities

```bash
agent-device appstate
agent-device clipboard read
agent-device clipboard write "token"
agent-device keyboard status
agent-device keyboard dismiss
agent-device perf --json
agent-device network dump [limit] [summary|headers|body|all]
agent-device push <bundle|package> <payload.json|inline-json>
agent-device trigger-app-event screenshot_taken '{"source":"qa"}'
agent-device get text @e1
agent-device screenshot out.png
agent-device settings permission grant notifications
agent-device settings permission reset camera
agent-device trace start
agent-device trace stop ./trace.log
```

Batch (when sequence is already known)

```bash
agent-device batch --steps-file /tmp/batch-steps.json --json
```

Performance Check

Use agent-device perf --json (or metrics --json) after open.
For detailed metric semantics, caveats, and interpretation guidance, see references/perf-metrics.md.

Guardrails (High Value Only)

Re-snapshot after UI mutations (navigation/modal/list changes).
Prefer snapshot -i; scope/depth only when needed.
Use refs for discovery, selectors for replay/assertions.
find "<query>" click --json returns { ref, locator, query, x, y } — all derived from the matched snapshot node. Do not rely on these fields from raw press/click responses for observability; use find instead.
Use fill for clear-then-type semantics; use type for focused append typing.
Use install for in-place app upgrades (keep app data when platform permits), and reinstall for deterministic fresh-state runs.
App binary format support for install/reinstall: Android .apk/.aab, iOS .app/.ipa.
Android .aab requires bundletool in PATH, or AGENT_DEVICE_BUNDLETOOL_JAR=<path-to-bundletool-all.jar> with java in PATH.
Android .aab optional: set AGENT_DEVICE_ANDROID_BUNDLETOOL_MODE=<mode> to control bundletool build-apks --mode (default: universal).
iOS .ipa: extract/install from Payload/*.app; when multiple app bundles are present, <app> is used as a bundle id/name hint.
iOS appstate is session-scoped; Android appstate is live foreground state. iOS responses include device_udid and ios_simulator_device_set for isolation verification.
iOS open responses include device_udid and ios_simulator_device_set to confirm which simulator handled the session.
Clipboard helpers: clipboard read / clipboard write <text> are supported on Android and iOS simulators; iOS physical devices are not supported yet.
Android keyboard helpers: keyboard status|get|dismiss report keyboard visibility/type and dismiss via keyevent when visible.
network dump is best-effort and parses HTTP(s) entries from the session app log file.
Biometric settings: iOS simulator supports settings faceid|touchid <match|nonmatch|enroll|unenroll>; Android supports settings fingerprint <match|nonmatch> where runtime tooling is available.
For AndroidTV/tvOS selection, always pair --target with --platform (ios, android, or apple alias); target-only selection is invalid.
push simulates notification delivery:
- iOS simulator uses APNs-style payload JSON.
- Android uses broadcast action + typed extras (string/boolean/number).
trigger-app-event requires app-defined deep-link hooks and URL template configuration (AGENT_DEVICE_APP_EVENT_URL_TEMPLATE or platform-specific variants).
trigger-app-event requires an active session or explicit selectors (--platform, --device, --udid, --serial); on iOS physical devices, custom-scheme triggers require active app context.
Canonical trigger behavior and caveats are documented in website/docs/docs/commands.md under App event triggers.
Permission settings are app-scoped and require an active session app: settings permission <grant|deny|reset> <camera|microphone|photos|contacts|notifications> [full|limited]
iOS simulator permission alerts: use alert wait then alert accept/dismiss — accept/dismiss retry internally for up to 2 s so you do not need manual sleeps. See references/permissions.md.
full|limited mode applies only to iOS photos; other targets reject mode.
On Android, non-ASCII fill/type may require an ADB keyboard IME on some system images; only install IME APKs from trusted sources and verify checksum/signature.
If using --save-script, prefer explicit path syntax (--save-script=flow.ad or ./flow.ad).
For tenant-isolated remote runs, always pass --tenant, --session-isolation tenant, --run-id, and --lease-id together.
Use short lease TTLs and heartbeat only while work is active; release leases immediately after run completion/failure.
Env equivalents for scoped runs: AGENT_DEVICE_IOS_SIMULATOR_DEVICE_SET (compat IOS_SIMULATOR_DEVICE_SET) and AGENT_DEVICE_ANDROID_DEVICE_ALLOWLIST (compat ANDROID_DEVICE_ALLOWLIST).
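
The guardrail about passing --tenant, --session-isolation tenant, --run-id, and --lease-id together can be enforced with a small guard that refuses to emit a partial flag set. This is a sketch: the function name `tenant_flags` is hypothetical, and the values must not contain spaces because the output is meant to be word-split by the shell.

```bash
# Hypothetical helper: print the four tenant-isolation flags together,
# or fail if any of the three values is missing.
tenant_flags() {
  tenant=${1:-}
  run_id=${2:-}
  lease_id=${3:-}
  if [ -z "$tenant" ] || [ -z "$run_id" ] || [ -z "$lease_id" ]; then
    echo "tenant, run id, and lease id are all required" >&2
    return 1
  fi
  printf '%s\n' "--tenant $tenant --session-isolation tenant --run-id $run_id --lease-id $lease_id"
}
```

One possible use is `flags=$(tenant_flags acme run-123 "$LEASE_ID") || exit 1` followed by `agent-device --daemon-transport http $flags session list --json`, leaving `$flags` unquoted on purpose so it splits into separate arguments. `LEASE_ID` is a placeholder variable for the value returned by the lease allocation call.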

Security and Trust Notes

Prefer a preinstalled agent-device binary over on-demand package execution.
If install is required, pin an exact version (for example: npx --yes agent-device@<exact-version> --help).
Signing/provisioning environment variables are optional, sensitive, and only for iOS physical-device setup.
Logs/artifacts are written under ~/.agent-device; replay scripts write to explicit paths you provide.
For remote daemon mode, prefer AGENT_DEVICE_DAEMON_SERVER_MODE=http|dual with AGENT_DEVICE_HTTP_AUTH_HOOK and tenant-scoped lease admission.
Keep logging off unless debugging and use least-privilege/isolated environments for autonomous runs.
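
The advice to pin an exact version can be backed by a check that rejects ranges and tags before an npx command is built. The sketch below is an assumption-laden example: `is_exact_version` and `pinned_npx_command` are invented names, and the accepted format (x.y.z with an optional prerelease suffix) is a simplification of semantic versioning.

```bash
# Hypothetical helper: accept only an exact x.y.z version (optionally with a
# prerelease suffix) and reject ranges or tags such as ^1.2.0, 1.x, or latest.
is_exact_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$'
}

# Print the pinned npx command instead of running it, so it can be reviewed first.
pinned_npx_command() {
  is_exact_version "$1" || { echo "not an exact version: $1" >&2; return 1; }
  printf 'npx --yes agent-device@%s --help\n' "$1"
}
```

Printing the command rather than executing it keeps the helper free of side effects; an autonomous run can log the exact line and execute it only after the version has passed the check.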

Common Mistakes

Mixing debug flow into normal runs (keep logs off unless debugging).
Continuing to use stale refs after screen transitions.
Using URL opens with Android --activity (unsupported combination).
Treating boot as default first step instead of fallback.
