phoneagent: community skill for Claude Code

v1.0.0

About this skill

Suitable scenario: Ideal for AI agents that need to drive iOS or Android UI through PhoneAgent's JSON-RPC bridge. Localized description: An AI agent that can get things done across iPhone apps.

Capabilities

Use this workflow to drive iOS or Android UI through PhoneAgent's JSON-RPC bridge.
All shell commands below assume you are in the repo root:
cd "$(git rev-parse --show-toplevel)"
Start the RPC bridge
Choose a platform bridge (both listen on 127.0.0.1:45678 by default).

# Core Topics

rounak rounak
[744]
[89]
Updated: 4/7/2026

Killer-Skills Review

Decision support comes first. Repository text comes second.

Reference-Only Page Review Score: 10/11

This page remains useful for teams, but Killer-Skills treats it as reference material instead of a primary organic landing page.

  • Original recommendation layer
  • Concrete use-case guidance
  • Explicit limitations and caution
  • Quality floor passed for review

Review Score: 10/11
Quality Score: 54
Canonical Locale: en
Detected Body Locale: en

Why use this skill

Recommendation: phoneagent helps agents drive iOS or Android UI through PhoneAgent's JSON-RPC bridge. An AI agent that can get things done across iPhone apps.

Best suited for

Suitable scenario: Ideal for AI agents that need to drive iOS or Android UI through PhoneAgent's JSON-RPC bridge.

Actionable use cases for phoneagent

Use case: Driving iOS or Android UI through PhoneAgent's JSON-RPC bridge.
Use case: Running the skill's shell commands from the repo root, reached with cd "$(git rev-parse --show-toplevel)".

Safety and limitations

  • Limitation: start_rpc_bridge_local.sh starts a localhost-only forwarder.
  • Limitation: Both bridges are localhost-only.
  • Limitation: If you need to forward manually, first get a device UDID via xcrun devicectl list devices, then run: python3 ./.agents/skills/phoneagent/scripts/forward_rpc_localhost.py --udid <UDID>

Why this page is reference-only

  • Current locale does not satisfy the locale-governance contract.

Source Boundary

The section below is imported from the upstream repository and should be treated as secondary evidence. Use the Killer-Skills review above as the primary layer for fit, risk, and installation decisions.

After The Review

Decide The Next Action Before You Keep Reading Repository Material

Killer-Skills should not stop at opening repository instructions. It should help you decide whether to install this skill, when to cross-check against trusted collections, and when to move into workflow rollout.

FAQ & Installation Steps

Frequently Asked Questions

What is phoneagent?

Suitable scenario: Ideal for AI agents that need to drive iOS or Android UI through PhoneAgent's JSON-RPC bridge. Localized description: An AI agent that can get things done across iPhone apps.

How do I install phoneagent?

Run the command: npx killer-skills add rounak/PhoneAgent/phoneagent. It works with Cursor, Windsurf, VS Code, Claude Code, and 19+ other IDEs.

What are the use cases for phoneagent?

Key use cases include: driving iOS or Android UI through PhoneAgent's JSON-RPC bridge, with all shell commands run from the repo root (cd "$(git rev-parse --show-toplevel)").

Which IDEs are compatible with phoneagent?

This skill is compatible with Cursor, Windsurf, VS Code, Trae, Claude Code, OpenClaw, Aider, Codex, OpenCode, Goose, Cline, Roo Code, Kiro, Augment Code, Continue, GitHub Copilot, Sourcegraph Cody, and Amazon Q Developer. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for phoneagent?

Limitation: start_rpc_bridge_local.sh starts a localhost-only forwarder. Limitation: Both bridges are localhost-only. Limitation: If you need to forward manually, first get a device UDID via xcrun devicectl list devices, then run: python3 ./.agents/skills/phoneagent/scripts/forward_rpc_localhost.py --udid <UDID>

How To Install

  1. Open your terminal

    Open the terminal or command line in your project directory.

  2. Run the install command

    Run: npx killer-skills add rounak/PhoneAgent/phoneagent. The CLI will automatically detect your IDE or AI agent and configure the skill.

  3. Start using the skill

    The skill is now active. Your AI agent can use phoneagent immediately in the current project.

Reference-Only Mode

This page remains useful for installation and reference, but Killer-Skills no longer treats it as a primary indexable landing page. Read the review above before relying on the upstream repository instructions.

Upstream Repository Material

The section below is imported from the upstream repository and should be treated as secondary evidence. Use the Killer-Skills review above as the primary layer for fit, risk, and installation decisions.

Upstream Source

phoneagent

Install phoneagent, an AI agent skill for AI agent workflows and automation. Review the use cases, limitations, and setup path before rollout.

SKILL.md

PhoneAgent

Use this workflow to drive iOS or Android UI through PhoneAgent's JSON-RPC bridge.

All shell commands below assume you are in the repo root:

bash
cd "$(git rev-parse --show-toplevel)"

Start the RPC bridge

  1. Choose a platform bridge (both listen on 127.0.0.1:45678 by default).
bash
# iOS (XCTest-hosted bridge)
./.agents/skills/phoneagent/scripts/start_rpc_bridge_local.sh

# Android (adb bridge; emulator or physical device)
./.agents/skills/phoneagent/scripts/start_android_rpc_bridge_local.sh

Notes:

  • start_rpc_bridge_local.sh is interactive and will show a numbered list of iOS devices/simulators. Enter the number for the destination you want.
  • start_rpc_bridge_local.sh starts a localhost-only forwarder.
  • On Xcode "Connect via network", it uses the CoreDevice tunnel automatically (no extra deps).
  • For USB fallback forwarding, install pymobiledevice3 into a local venv: python3 -m venv .venv && ./.venv/bin/python -m pip install -U pip && ./.venv/bin/python -m pip install pymobiledevice3
  • start_android_rpc_bridge_local.sh uses adb; if multiple devices are connected it prompts for the serial.
  2. Keep the bridge process running.
  3. Wait for PHONEAGENT_RPC_READY ... in logs before sending RPC calls.
  4. Confirm socket readiness before first RPC:
bash
./.agents/skills/phoneagent/scripts/rpc.py get-tree >/dev/null && echo rpc-ready
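As a complement to the rpc.py check above, a script can wait for the listening socket before it sends anything. The following is a minimal sketch, not part of PhoneAgent: `wait_for_port` and its `timeout` and `interval` parameters are hypothetical names, and a successful TCP connect only shows that something is listening on the port, not that the bridge has logged PHONEAGENT_RPC_READY.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 45678,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connect to host:port succeeds, False on timeout.

    Hypothetical helper: it probes the socket only and knows nothing
    about the bridge's own readiness log line.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Connect and close immediately; no RPC payload is sent.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would typically run `wait_for_port()` first and then still perform the get-tree check, since only a real RPC round trip confirms the bridge is serving requests.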

Resolve host and port

  1. Always use 127.0.0.1:45678 as the RPC endpoint (or rpc.py --port <port> if customized).

Notes:

  • Both bridges are localhost-only.
  • iOS physical-device flow uses a localhost forwarder.
  • If you need to forward manually, first get a device UDID via xcrun devicectl list devices, then run: python3 ./.agents/skills/phoneagent/scripts/forward_rpc_localhost.py --udid <UDID> (binds 127.0.0.1:45678)

Send RPC calls

Use the helper CLI:

bash
# iOS bundle identifier
./.agents/skills/phoneagent/scripts/rpc.py open-app com.apple.Preferences

# Android package name
./.agents/skills/phoneagent/scripts/rpc.py open-app com.android.settings
./.agents/skills/phoneagent/scripts/rpc.py get-tree | head

# Use coordinates copied from the tree (XCUI frame string).
./.agents/skills/phoneagent/scripts/rpc.py enter-text \
  --coordinate '{{33.0, 861.0}, {364.0, 38.0}}' \
  --text 'Display'

./.agents/skills/phoneagent/scripts/rpc.py tap-element \
  --coordinate '{{37.7, 969.7}, {199.7, 29.0}}'

Core operating loop

  1. Call get_tree.
  2. Identify the best target element in the tree (label/identifier) and copy its frame coordinate string.
  3. Prefer coordinate-based actions (tap_element / enter_text).
  4. Use the returned tree from the action response to verify the UI changed as expected.
  5. Repeat until complete.
  6. When the task is complete, always capture a screenshot for the user:
    • Prefer get_context and write result.screenshot_base64 to a PNG (or use ./.agents/skills/phoneagent/scripts/rpc.py get-screen-image, which writes PNG files to /tmp/phoneagent-artifacts).
    • Include the PNG path in your final message so the user can open it.
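The screenshot step in item 6 can be sketched as below. This is an illustration rather than PhoneAgent code: `save_screenshot` is a hypothetical helper, and it assumes `result` is the already-parsed result object of a get_context or get_screen_image response containing the documented screenshot_base64 field.

```python
import base64
from pathlib import Path


def save_screenshot(result: dict, path: str) -> Path:
    """Decode result["screenshot_base64"] and write the bytes to `path`.

    Hypothetical helper; `result` is assumed to be the parsed "result"
    object of a get_context or get_screen_image response.
    """
    png_bytes = base64.b64decode(result["screenshot_base64"])
    out = Path(path)
    # Create the target directory if it does not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(png_bytes)
    return out
```

The returned path is what would go into the final message to the user.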

Use swipe to reveal off-screen content, then use the returned tree (or call get_tree if needed). Use one request at a time per server. Do not fire concurrent batches. Split long keyboard input into chunks; do not send giant enter_text payloads in one call.
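One way to follow the advice about long keyboard input is to split the text before issuing enter_text calls. The sketch below is an assumption-laden illustration: `chunk_text` is a hypothetical helper and the default size of 40 characters is arbitrary, since the source gives no numeric limit.

```python
def chunk_text(text: str, size: int = 40) -> list[str]:
    """Split `text` into consecutive pieces of at most `size` characters.

    Hypothetical helper; the default size is an arbitrary illustration,
    not a limit documented by PhoneAgent.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]
```

Each chunk would then be sent as its own enter_text call, one at a time. Because enter_text is documented to type a newline (Return) after the text, check the returned tree after each chunk to confirm the field still holds the expected content.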

RPC method reference

All RPC requests are newline-delimited JSON objects with this shape:

json
{"id":1,"method":"<method>","params":{...}}

All success responses look like:

json
{"id":1,"result":{...}}
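The request and response shapes above can be exercised without the helper CLI. The following is a sketch under stated assumptions, not the implementation used by rpc.py: the function names are hypothetical, the transport is assumed to be a plain TCP socket on the documented endpoint, and one connection is opened per request. Only the success shape is documented, so any other response is raised as an error.

```python
import json
import socket
from typing import Optional


def encode_request(request_id: int, method: str,
                   params: Optional[dict] = None) -> bytes:
    """Serialize one request as a single JSON line ending in a newline."""
    message = {"id": request_id, "method": method, "params": params or {}}
    # json.dumps escapes embedded newlines, so the message stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_response(line: bytes) -> dict:
    """Parse one response line and return its "result" object."""
    message = json.loads(line.decode("utf-8"))
    if "result" not in message:
        # Only the success shape is documented; surface anything else as-is.
        raise RuntimeError(f"unexpected response: {message!r}")
    return message["result"]


def rpc_call(method: str, params: Optional[dict] = None,
             host: str = "127.0.0.1", port: int = 45678,
             request_id: int = 1, timeout: float = 60.0) -> dict:
    """Send one request and block until one response line arrives.

    Assumption: the bridge accepts a raw TCP connection and answers each
    request line with exactly one response line.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_request(request_id, method, params))
        with sock.makefile("rb") as reader:
            return decode_response(reader.readline())
```

For example, `rpc_call("get_tree")` would return the parsed result object if the assumptions hold. Blocking on each response before sending the next request matches the one-request-at-a-time rule.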

get_tree

  • Does: Returns the accessibility tree of the currently focused app.
  • Params: none.
  • Returns: {"tree": "<string>"}

Example:

json
{"id":1,"method":"get_tree","params":{}}

get_screen_image

  • Does: Captures the current screen as a base64-encoded PNG plus image dimensions (when available).
  • Params: none.
  • Returns: {"screenshot_base64":"<base64>","metadata":{"width":<number>,"height":<number>}}

Example:

json
{"id":2,"method":"get_screen_image","params":{}}

get_context

  • Does: Convenience method that returns both the current accessibility tree and the current screen image.
  • Params: none.
  • Returns: {"tree":"<string>","screenshot_base64":"<base64>","metadata":{"width":<number>,"height":<number>}}

Example:

json
{"id":3,"method":"get_context","params":{}}

open_app

  • Does: Brings the specified app to the foreground (and makes it the focused app for subsequent calls).
  • Params: bundle_identifier (string, required).
    • iOS: pass bundle identifier (example com.apple.Preferences).
    • Android: pass package name (example com.android.settings).
  • Returns: {"bundle_identifier":"<string>", "tree":"<string>"} (Android also includes package_name).

Example:

json
{"id":4,"method":"open_app","params":{"bundle_identifier":"com.apple.Preferences"}}

tap

  • Does: Taps an absolute point in the current app.
  • Params: x (number, required), y (number, required). Coordinates are in absolute screen points as reported by the tree.
  • Returns: {"tree":"<string>"}

Example:

json
{"id":5,"method":"tap","params":{"x":120,"y":300}}

tap_element

  • Does: Taps the center of an element using its XCUI frame string from the accessibility tree.
  • Params:
    • coordinate (string, required). Must look like {{x, y}, {w, h}} (copied from the tree).
    • count (integer, optional; default 1). Use 2 for double-tap.
    • longPress (boolean, optional; default false). When true, performs a long-press gesture.
  • Returns: {"coordinate":"<string>", "count":<number>, "longPress":<bool>, "tree":"<string>"}

Example:

json
{"id":6,"method":"tap_element","params":{"coordinate":"{{20.0, 165.0}, {390.0, 90.0}}","count":1,"longPress":false}}
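tap_element computes the element center on the bridge side, but the same arithmetic is useful when an absolute point is needed for tap. The sketch below parses a frame string of the documented {{x, y}, {w, h}} form; `frame_center` is a hypothetical helper, and it assumes plain decimal numbers with no exponent notation.

```python
import re

# One decimal number, optionally negative (assumption: no exponent notation).
_NUM = r"(-?\d+(?:\.\d+)?)"
_FRAME_RE = re.compile(
    r"^\{\{\s*" + _NUM + r"\s*,\s*" + _NUM + r"\s*\}\s*,\s*"
    r"\{\s*" + _NUM + r"\s*,\s*" + _NUM + r"\s*\}\}$"
)


def frame_center(coordinate: str) -> tuple[float, float]:
    """Return the center point of a frame string like {{x, y}, {w, h}}."""
    match = _FRAME_RE.match(coordinate.strip())
    if match is None:
        raise ValueError(f"not a frame string: {coordinate!r}")
    x, y, w, h = (float(group) for group in match.groups())
    # The center is the origin shifted by half the width and half the height.
    return (x + w / 2, y + h / 2)
```

For the frame in the example above, `frame_center("{{20.0, 165.0}, {390.0, 90.0}}")` gives `(215.0, 210.0)`.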

enter_text

  • Does: Taps the center of the target element (to focus it), waits briefly for the keyboard, then types the provided text followed by a newline (Return).
  • Params:
    • coordinate (string, required). Must look like {{x, y}, {w, h}} (copied from the tree).
    • text (string, required).
  • Returns: {"coordinate":"<string>", "tree":"<string>"}

Example:

json
{"id":7,"method":"enter_text","params":{"coordinate":"{{33.0, 861.0}, {364.0, 38.0}}","text":"hello"}}

scroll

  • Does: Scrolls by dragging from a starting point by the provided deltas.
  • Params: x (number, required), y (number, required), distanceX (number, required), distanceY (number, required).
  • Returns: {"tree":"<string>"}

Example:

json
{"id":8,"method":"scroll","params":{"x":215,"y":760,"distanceX":0,"distanceY":-460}}
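When the desired drag is known as a start point and an end point, the scroll parameters can be derived by subtraction. This is a sketch that assumes distanceX and distanceY are signed offsets added to the starting point, which is how the description above reads but is not otherwise confirmed. `scroll_params` is a hypothetical helper.

```python
def scroll_params(start: tuple[float, float],
                  end: tuple[float, float]) -> dict:
    """Build scroll params for a drag from `start` to `end`.

    Assumption: distanceX/distanceY are signed offsets from the start point.
    """
    x, y = start
    end_x, end_y = end
    return {"x": x, "y": y, "distanceX": end_x - x, "distanceY": end_y - y}
```

Under that assumption, `scroll_params((215, 760), (215, 300))` reproduces the params of the example request above.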

swipe

  • Does: Swipes in a direction starting from a given point (implemented as a bounded drag gesture).
  • Params: x (number, required), y (number, required), direction (string, required; one of up, down, left, right).
  • Returns: {"tree":"<string>"}

Example:

json
{"id":9,"method":"swipe","params":{"x":215,"y":760,"direction":"up"}}
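Since direction accepts only four values, a caller may want to reject anything else before a request reaches the bridge. The helper below is a hypothetical sketch of that client-side check; how the bridge itself responds to an invalid direction is not documented here.

```python
VALID_DIRECTIONS = ("up", "down", "left", "right")


def swipe_request(request_id: int, x: float, y: float, direction: str) -> dict:
    """Build a swipe request object, rejecting unknown directions early."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"direction must be one of {VALID_DIRECTIONS}")
    return {
        "id": request_id,
        "method": "swipe",
        "params": {"x": x, "y": y, "direction": direction},
    }
```

`swipe_request(9, 215, 760, "up")` builds the same object as the example request above.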

stop

  • Does: Stops the RPC server test (ends the xcodebuild test session).
  • Params: none.
  • Returns: {}

Example:

json
{"id":10,"method":"stop","params":{}}

iOS app bundle IDs

  • Settings: com.apple.Preferences
  • Camera: com.apple.camera
  • Photos: com.apple.mobileslideshow
  • Messages: com.apple.MobileSMS
  • Home Screen: com.apple.springboard

Android package names

  • Settings: com.android.settings
  • Camera (AOSP): com.android.camera2
  • Photos (Google): com.google.android.apps.photos
  • Messages (Google): com.google.android.apps.messaging
  • Home Screen: launcher package varies by emulator/device

Recovery playbook

  1. If RPC hangs after open_app, restart the test-hosted server and retry with a known-good bundle id.
  2. If taps fail due to stale UI, call get_tree again and recalculate the target.
  3. If iOS bridge becomes unresponsive, stop/restart xcodebuild test and resume from latest verified app state.
  4. If Android bridge becomes unresponsive, restart adb (adb kill-server && adb start-server), relaunch the bridge, and retry.

End session

  1. Send stop only when the task is complete.
  2. If stop is not sent, terminate the xcodebuild session manually.
