What is kubernetes-operations?

kubernetes-operations is an AI agent skill for Kubernetes debugging and operations. It routes observability questions (traces, metrics, logs, profiles) through Grafana and pod-level questions through kubectl, and is aimed at cloud-native agents that need troubleshooting and monitoring capabilities.

How do I install kubernetes-operations?

Run the command `npx killer-skills add hippocampus-dev/hippocampus/reference/queries.md`. It works with Cursor, Windsurf, VS Code, Claude Code, and 15+ other IDEs.

What are the use cases for kubernetes-operations?

Key use cases include debugging pod startup failures with kubectl, analyzing traces and metrics with Grafana, and investigating in-container process and network issues.

Which IDEs are compatible with kubernetes-operations?

This skill is compatible with Cursor, Windsurf, VS Code, Claude Code, GitHub Copilot, JetBrains, Cline, Roo Code, and many more. Use the Killer-Skills CLI for universal one-command installation.

Are there any limitations for kubernetes-operations?

Requires access to Grafana at https://grafana.minikube.127.0.0.1.nip.io. Needs kubectl for command-line interactions. ArgoCD selfHeal must be disabled before manual changes.

kubernetes-operations: Expert Debugging Setup Guide

Name: kubernetes-operations
Rating: 2.1 (1 review)
Author: hippocampus-dev

Access Grafana at https://grafana.minikube.127.0.0.1.nip.io
Disable ArgoCD selfHeal before manual changes, re-enable after

Debugging

Investigating	Tool	Entry Point
Traces, metrics, logs, profiles	Grafana	`https://grafana.minikube.127.0.0.1.nip.io`
Pod startup failures, events	kubectl	`kubectl get events -n <namespace>`
In-container process/network	kubectl debug	Ephemeral container

Observability Investigation (via Grafana)

Use Grafana for all observability signals. Do NOT query backends (Tempo, Mimir, Loki) directly.

1. Get query parameters - Check `cluster/manifests/<app>/` for namespace, labels, and OTEL_SERVICE_NAME (see Queries for parameter locations)
2. Open Grafana - Access https://grafana.minikube.127.0.0.1.nip.io in a browser
3. Select datasource and query - Use the appropriate signal:

Signal	Backend	Datasource	Query Language	Use Case
Traces	Tempo	Tempo	TraceQL	Request flow, latency, access destinations
Metrics	Mimir (Prometheus)	Mimir	PromQL	Resource usage, HTTP rates, alerting
Logs	Loki	Loki	LogQL	Error investigation, audit
Profiles	Pyroscope	Pyroscope	Flamegraph UI	CPU/memory hotspots
Probes	Blackbox Exporter	Mimir	PromQL	Endpoint reachability
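
The first step above (collecting query parameters) can be sketched as a recursive grep over an app's manifest directory. This is a minimal sketch, not the skill's own tooling: the function name `find_query_params` is hypothetical, and the `app.kubernetes.io/name` label key is an assumption about how the manifests are labeled.

```bash
# Hypothetical helper: list lines that carry likely query parameters.
# Usage: find_query_params cluster/manifests/<app>/
find_query_params() {
  local dir="$1"
  # namespace:               -> Kubernetes namespace for label filters
  # OTEL_SERVICE_NAME        -> service name used by traces
  # app.kubernetes.io/name   -> assumed label key; adjust to the real manifests
  grep -rnE 'namespace:|OTEL_SERVICE_NAME|app\.kubernetes\.io/name' "$dir"
}
```

Each match is printed as `file:line:content`, which is usually enough to read off the namespace and service name before opening Grafana.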

Symptom	Signal	Query Approach
Errors in logs	Loki → Tempo	Extract traceid from logs, trace in Tempo
Latency/5xx	Tempo	Search traces with `status = error`
Unknown outbound dependencies	Tempo	Search traces, inspect spans for outbound calls
Resource saturation	Mimir	Query CPU/memory metrics
High CPU/memory	Pyroscope	Check flamegraphs
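
As a starting point for the symptom table above, the sketch below prints one starter query per signal to paste into Grafana Explore. Everything cluster-specific here is an assumption: the function name `starter_queries` is hypothetical, the Loki `namespace` label depends on how logs are labeled at ingestion, and the PromQL line assumes cAdvisor container metrics are available in Mimir.

```bash
# Hypothetical helper: print starter queries for Grafana Explore.
# Usage: starter_queries <namespace> <otel-service-name>
starter_queries() {
  local ns="$1"
  local svc="$2"
  cat <<EOF
# Loki (LogQL): error lines in the namespace ("namespace" label is an assumption)
{namespace="${ns}"} |= "error"

# Tempo (TraceQL): failed spans for the service
{ resource.service.name = "${svc}" && status = error }

# Mimir (PromQL): CPU cores used per pod (assumes cAdvisor metrics are scraped)
sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="${ns}"}[5m]))
EOF
}
```

For example, `starter_queries demo api` prints the three queries with `demo` and `api` filled in. Treat them as templates and adjust label names to match the values found in `cluster/manifests/<app>/`.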

Pod Direct Investigation (via kubectl)

Use kubectl when Grafana cannot answer the question (e.g., pod not starting, container-level inspection).

Symptom	Action
Pod not starting	`kubectl get events -n <namespace>`
CrashLoopBackOff	`kubectl logs <pod> -n <namespace> --previous`
Network connectivity	`kubectl debug` with ephemeral container
Process inspection	`kubectl debug` with ephemeral container
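
The first two rows of the table can be combined into one read-only triage pass. This is a sketch under assumptions: the function name `triage_pod` is hypothetical, and the event filter and `--tail` value are illustrative choices rather than part of the skill.

```bash
# Hypothetical helper: read-only triage for a pod that will not start or keeps crashing.
# Usage: triage_pod <namespace> <pod>
triage_pod() {
  local ns="$1"
  local pod="$2"
  # Events that reference this pod, sorted by last-seen time
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
  # Logs from the previous container instance; fails harmlessly if there is none
  kubectl logs "$pod" -n "$ns" --previous --tail=100 || true
}
```

Both commands only read cluster state, so this does not count as a manual change.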

Ephemeral Container

```bash
kubectl debug <pod-name> -n <namespace> \
  --profile=restricted \
  --image=ghcr.io/hippocampus-dev/hippocampus/ephemeral-container:main \
  --target=<container-name> \
  -- <command>
```

Note: Do not pass the `-it` flag when executing commands; it causes output streaming issues.
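
Without `-it`, the command's output is not attached to the terminal, so one way to retrieve it is to give the ephemeral container a known name and read its logs afterwards. The wrapper below is a sketch of that idea: `debug_exec` is a hypothetical name, and which binaries (for example `ps`) exist in the ephemeral-container image is an assumption.

```bash
# Hypothetical wrapper around the kubectl debug invocation above.
# Usage: debug_exec <namespace> <pod> <target-container> <debug-container-name> <command...>
debug_exec() {
  local ns="$1"
  local pod="$2"
  local target="$3"
  local name="$4"
  shift 4
  # --container names the ephemeral container so its logs can be fetched later
  kubectl debug "$pod" -n "$ns" \
    --profile=restricted \
    --image=ghcr.io/hippocampus-dev/hippocampus/ephemeral-container:main \
    --target="$target" \
    --container="$name" \
    -- "$@"
}
```

For example, run `debug_exec demo web-0 app dbg-ps ps aux`, then read the result with `kubectl logs web-0 -n demo -c dbg-ps` once the container has started. Ephemeral container names must be unique within a pod, so pick a new name for each run.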

Manual Changes

The manual-changes procedure (disabling ArgoCD selfHeal first) is required when directly modifying live cluster resources (e.g., kubectl apply, kubectl patch, kubectl delete) outside of the GitOps workflow.

Operation	Requires Manual Changes
`kubectl apply/patch/delete`	Yes
Debugging (read-only: logs, events, debug)	No
Grafana queries	No
Editing manifests in repo	No (ArgoCD syncs automatically)

ArgoCD selfHeal Control

ArgoCD reverts manual changes unless selfHeal is disabled first. Always re-enable after.

```bash
# Disable selfHeal (the field sits under syncPolicy.automated in the Application spec)
kubectl patch application <app-name> -n argocd --type=merge \
  -p '{"spec":{"syncPolicy":{"automated":{"selfHeal":false}}}}'

# Re-enable selfHeal (after work is complete)
kubectl patch application <app-name> -n argocd --type=merge \
  -p '{"spec":{"syncPolicy":{"automated":{"selfHeal":true}}}}'
```
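
Because forgetting to re-enable selfHeal leaves the application drifting silently, it can help to print the live sync policy before and after manual work. This is a minimal sketch; the function name `show_sync_policy` is hypothetical, and it assumes the Application lives in the `argocd` namespace as in the commands above.

```bash
# Hypothetical helper: print the current syncPolicy of an ArgoCD Application.
# Usage: show_sync_policy <app-name>
show_sync_policy() {
  local app="$1"
  # Read-only: prints the syncPolicy object so the selfHeal value can be checked
  kubectl get application "$app" -n argocd -o jsonpath='{.spec.syncPolicy}'
}
```

Run it once after disabling selfHeal to confirm the patch took effect, and again after re-enabling to confirm the application is back under automatic reconciliation.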

Observability Stack Manifests

Component	Path
Grafana	`cluster/manifests/grafana/`
Tempo	`cluster/manifests/tempo/`
Mimir	`cluster/manifests/mimir/`
Loki	`cluster/manifests/loki/`
Pyroscope	`cluster/manifests/pyroscope/`
Prometheus	`cluster/manifests/prometheus/`
Fluentd	`cluster/manifests/fluentd/`
OpenTelemetry	`cluster/manifests/otel-agent/`, `cluster/manifests/otel-collector/`

Reference

If writing observability queries: See Queries
