Killer-Skills

gpu-cli — remote NVIDIA GPU command execution

v0.14.0

About this Skill

Perfect for AI Agents needing remote NVIDIA GPU acceleration for machine learning workflows. gpu-cli is a public-facing CLI tool for running local commands on remote NVIDIA GPUs. It works by prefixing any command with `gpu`, which triggers automatic pod provisioning, code synchronization, log streaming, and output retrieval for machine learning and development workflows.

Features

Runs local commands on remote NVIDIA GPUs via command prefixing (e.g., `gpu run uv run python train.py`)
Automatically provisions a remote pod for execution
Synchronizes local code to the remote environment
Streams logs from the remote pod in real-time
Synchronizes output files back to the local machine
Provides diagnostic commands like `gpu doctor --json` to check setup health

# Core Topics

Updated: 3/7/2026

Quality Score

50 — Excellent (Top 5%), based on code quality & docs
Installation
Universal install (auto-detects Cursor, Windsurf, and VS Code):

> npx killer-skills add gpu-cli/gpu/gpu-cli

Agent Capability Analysis

The gpu-cli MCP Server is an open-source community integration for Claude and other AI agents, enabling seamless task automation and capability expansion.

Ideal Agent Persona

Perfect for AI Agents needing remote NVIDIA GPU acceleration for machine learning workflows.

Core Value

Empowers agents to run local commands on remote NVIDIA GPUs, stream logs, and sync outputs back, with tooling such as `uv` for dependency management and automatic pod provisioning, all by prefixing commands with `gpu`.

Capabilities Granted for gpu-cli MCP Server

Automating machine learning model training on remote GPUs
Debugging GPU-accelerated workflows with `gpu doctor` and `gpu status`
Optimizing costs by monitoring running pods and available GPUs with `gpu inventory`

! Prerequisites & Limits

  • Requires NVIDIA GPU access
  • Depends on `uv` for workflow management
  • Needs proper setup and authentication for remote GPU provisioning
Project files

  • SKILL.md (7.7 KB)
  • .cursorrules (1.2 KB)
  • package.json (240 B)

SKILL.md

GPU CLI

GPU CLI runs local commands on remote NVIDIA GPUs by prefixing with `gpu`. It provisions a pod, syncs your code, streams logs, and syncs outputs back: `uv run python train.py` becomes `gpu run uv run python train.py`.

Quick diagnostics

```bash
gpu doctor --json     # Check if setup is healthy (daemon, auth, provider keys)
gpu status --json     # See running pods and costs
gpu inventory --json  # See available GPUs and pricing
```
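When scripting against these diagnostics, the `--json` output can be parsed without extra tooling. A minimal sketch, assuming a payload shape for illustration (the sample below and its `healthy` field are hypothetical; inspect the real `gpu doctor --json` output for the actual schema):

```shell
# Hypothetical sample payload standing in for `gpu doctor --json` output:
sample='{"healthy": true, "daemon": "running"}'
# Extract the "healthy" field with sed (no jq dependency):
healthy=$(printf '%s' "$sample" | sed -n 's/.*"healthy": *\([a-z]*\).*/\1/p')
echo "$healthy"
```

In a real pipeline you would substitute the command itself for the sample string, e.g. `gpu doctor --json | sed -n …`.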

Command families

Getting started

| Command | Purpose |
| --- | --- |
| `gpu login` | Browser-based authentication |
| `gpu logout [-y]` | Remove session |
| `gpu init [--gpu-type T] [--force]` | Initialize project config |
| `gpu upgrade` | Open subscription upgrade page |

Running code

| Command | Purpose |
| --- | --- |
| `gpu run <command>` | Execute on remote GPU (main command) |
| `gpu run -d <command>` | Run detached (background) |
| `gpu run -a <job_id>` | Reattach to running job |
| `gpu run --cancel <job_id>` | Cancel a running job |
| `gpu status [--json]` | Show project status, pods, costs |
| `gpu logs [-j JOB] [-f] [--tail N] [--json]` | View job output |
| `gpu attach <job_id>` | Reattach to job output stream |
| `gpu stop [POD_ID] [-y]` | Stop active pod |

Key gpu run flags: --gpu-type, --gpu-count <1-8>, --min-vram, --rebuild, -o/--output, --no-output, --sync, -p/--publish <PORT>, -e <KEY=VALUE>, -i/--interactive.
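The flags above compose freely on one command line. An illustrative invocation (GPU type, env var, and output path are placeholder values; the command is echoed rather than executed so the sketch is self-contained):

```shell
# Placeholder values throughout; substitute your own GPU type and paths.
echo 'gpu run --gpu-type "RTX 4090" --gpu-count 2 --min-vram 24 -e EPOCHS=10 -o outputs/ uv run python train.py'
```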

GPU inventory

| Command | Purpose |
| --- | --- |
| `gpu inventory [--available] [--min-vram N] [--max-price P] [--json]` | List GPUs with pricing |

Volumes

| Command | Purpose |
| --- | --- |
| `gpu volume list [--detailed] [--json]` | List network volumes |
| `gpu volume create [--name N] [--size GB] [--datacenter DC]` | Create volume |
| `gpu volume delete <VOL> [--force]` | Delete volume |
| `gpu volume extend <VOL> --size <GB>` | Increase size |
| `gpu volume set-global <VOL>` | Set default volume |
| `gpu volume status [--volume V] [--json]` | Volume usage |
| `gpu volume migrate <VOL> --to <DC>` | Migrate to datacenter |
| `gpu volume sync <SRC> <DEST>` | Sync between volumes |

Vault (encrypted storage)

| Command | Purpose |
| --- | --- |
| `gpu vault list [--json]` | List encrypted output files |
| `gpu vault export <PATH> <DEST>` | Export decrypted file |
| `gpu vault stats [--json]` | Storage usage stats |

Configuration

| Command | Purpose |
| --- | --- |
| `gpu config show [--json]` | Show merged config |
| `gpu config validate` | Validate against schema |
| `gpu config schema` | Print JSON schema |
| `gpu config set <KEY> <VALUE>` | Set global config option |
| `gpu config get <KEY>` | Get global config value |

Authentication

| Command | Purpose |
| --- | --- |
| `gpu auth login [--profile P]` | Authenticate with cloud provider |
| `gpu auth logout` | Remove credentials |
| `gpu auth status` | Show auth status |
| `gpu auth add <HUB>` | Add hub credentials (hf, civitai) |
| `gpu auth remove <HUB>` | Remove hub credentials |
| `gpu auth hubs` | List configured hubs |

Organizations

| Command | Purpose |
| --- | --- |
| `gpu org list` | List organizations |
| `gpu org create <NAME>` | Create organization |
| `gpu org switch [SLUG]` | Set active org context |
| `gpu org invite <EMAIL>` | Invite member |
| `gpu org service-account create --name N` | Create service token |
| `gpu org service-account list` | List service accounts |
| `gpu org service-account revoke <ID>` | Revoke token |

LLM inference

| Command | Purpose |
| --- | --- |
| `gpu llm run [--ollama\|--vllm] [--model M] [-y]` | Launch LLM inference |
| `gpu llm info [MODEL] [--url URL] [--json]` | Show model info |

ComfyUI workflows

| Command | Purpose |
| --- | --- |
| `gpu comfyui list [--json]` | Browse available workflows |
| `gpu comfyui info <WORKFLOW> [--json]` | Show workflow details |
| `gpu comfyui validate <WORKFLOW> [--json]` | Pre-flight checks |
| `gpu comfyui run <WORKFLOW>` | Run workflow on GPU |
| `gpu comfyui generate "<PROMPT>"` | Text-to-image generation |
| `gpu comfyui stop [WORKFLOW] [--all]` | Stop ComfyUI pod |

Notebooks

| Command | Purpose |
| --- | --- |
| `gpu notebook [FILE] [--run] [--new NAME]` | Run Marimo notebook on GPU |

Alias: `gpu nb`

Serverless endpoints

| Command | Purpose |
| --- | --- |
| `gpu serverless deploy [--template T] [--json]` | Deploy endpoint |
| `gpu serverless status [ENDPOINT] [--json]` | Endpoint status |
| `gpu serverless logs [ENDPOINT]` | View request logs |
| `gpu serverless list [--json]` | List all endpoints |
| `gpu serverless delete [ENDPOINT]` | Delete endpoint |
| `gpu serverless warm [--cpu\|--gpu]` | Pre-warm endpoint |

Templates

| Command | Purpose |
| --- | --- |
| `gpu template list [--json]` | Browse official templates |
| `gpu template clear-cache` | Clear cached templates |

Daemon control

| Command | Purpose |
| --- | --- |
| `gpu daemon status [--json]` | Show daemon health |
| `gpu daemon start` | Start daemon |
| `gpu daemon stop` | Stop daemon |
| `gpu daemon restart` | Restart daemon |
| `gpu daemon logs [-f] [-n N]` | View daemon logs |

Tools and utilities

| Command | Purpose |
| --- | --- |
| `gpu dashboard` | Interactive TUI for pods and jobs |
| `gpu doctor [--json]` | Diagnostic checks |
| `gpu agent-docs` | Print agent reference to stdout |
| `gpu update [--check]` | Update CLI |
| `gpu changelog [VERSION]` | View release notes |
| `gpu issue ["desc"]` | Report issue |
| `gpu desktop` | Desktop app management |
| `gpu support` | Open community Discord |

Common workflows

  1. Setup: gpu login then gpu init
  2. Run job: gpu run python train.py --epochs 10
  3. With specific GPU: gpu run --gpu-type "RTX 4090" python train.py
  4. Detached job: gpu run -d python long_training.py then gpu status --json
  5. Check status: gpu status --json
  6. View logs: gpu logs --json
  7. Stop pods: gpu stop -y
  8. LLM inference: gpu llm run --ollama --model llama3 -y
  9. ComfyUI: gpu comfyui run flux_schnell
  10. Diagnose issues: gpu doctor --json

gpu run is pod-reuse oriented: after a command completes, the next gpu run reuses the existing pod until you run gpu stop or the cooldown period ends.

JSON output

Most commands support --json for machine-readable output. Structured data goes to stdout; human-oriented status and progress messages go to stderr.

Commands with --json: status, logs, doctor, inventory, config show, daemon status, volume list, volume status, vault list, vault stats, comfyui list, comfyui info, comfyui validate, serverless deploy, serverless status, serverless list, template list, llm info.
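The stdout/stderr contract above is what makes `--json` pipeline-safe. A self-contained sketch of the split (the function and its messages are illustrative stand-ins for a gpu-cli command):

```shell
# Stand-in for a gpu-cli command that follows the documented contract:
emit_status() {
  echo "provisioning pod..." >&2   # human-oriented progress (stderr)
  echo '{"status": "running"}'     # machine-readable payload (stdout)
}
# Command substitution captures only stdout; stderr is discarded here:
payload=$(emit_status 2>/dev/null)
echo "$payload"
```

Because progress lines go to stderr, `gpu status --json | your-parser` sees only the JSON document.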

Exit codes

| Code | Meaning | Recovery |
| --- | --- | --- |
| 0 | Success | Proceed |
| 1 | General error | Read stderr |
| 2 | Usage error | Fix command syntax |
| 10 | Auth required | `gpu auth login` |
| 11 | Quota exceeded | `gpu upgrade` or wait |
| 12 | Not found | Check resource ID |
| 13 | Daemon unavailable | `gpu daemon start`, retry |
| 14 | Timeout | Retry |
| 15 | Cancelled | Re-run if needed |
| 130 | Interrupted | Re-run if needed |
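For unattended agents, the exit-code table can drive automated recovery. A minimal dispatch sketch mirroring the table (the recovery messages are illustrative; a real wrapper would invoke the suggested command instead of printing it):

```shell
# Map a gpu-cli exit code to a recovery action, per the table above.
recover() {
  case "$1" in
    0)      echo "proceed" ;;
    2)      echo "fix command syntax" ;;
    10)     echo "run: gpu auth login" ;;
    11)     echo "run: gpu upgrade (or wait)" ;;
    13)     echo "run: gpu daemon start, then retry" ;;
    14|130) echo "retry" ;;
    *)      echo "inspect stderr" ;;
  esac
}
recover 13
```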

Configuration

  • Project config: gpu.toml, gpu.jsonc, or pyproject.toml [tool.gpu]
  • Global config: ~/.gpu-cli/config.toml (via gpu config set/get)
  • Sync model: .gitignore controls upload; outputs patterns control download
  • Secrets and credentials must stay in the OS keychain, never in plaintext project files
  • CI env vars: GPU_RUNPOD_API_KEY, GPU_SSH_PRIVATE_KEY, GPU_SSH_PUBLIC_KEY
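In CI, the documented environment variables replace the keychain. A sketch of the setup step (the values below are placeholders; in a real pipeline they come from the CI secret store, never from files committed to the repository):

```shell
# Placeholder values; source real ones from your CI secret store.
export GPU_RUNPOD_API_KEY="placeholder-api-key"
export GPU_SSH_PRIVATE_KEY="placeholder-private-key"
export GPU_SSH_PUBLIC_KEY="placeholder-public-key"
echo "credentials exported"
```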

References

  • Project generation and task setup: references/create.md
  • Debugging and common failures: references/debug.md
  • Config schema and field examples: references/config.md
  • Cost and GPU selection guidance: references/optimize.md
  • Persistent storage and volumes: references/volumes.md
