KS
Killer-Skills

langfuse — Categories.community

v1.0.2
GitHub

About this Skill

Ideal for AI Agents requiring advanced observability and debugging capabilities, such as those built with Claude Code or AutoGPT, to leverage Langfuse trace data A Model Context Protocol (MCP) server for Langfuse, enabling AI agents to query Langfuse trace data for enhanced debugging and observability

avivsinai avivsinai
[0]
[0]
Updated: 3/5/2026

Quality Score

Top 5%
50
Excellent
Based on code quality & docs
Installation
SYS Universal Install (Auto-Detect)
Cursor IDE Windsurf IDE VS Code IDE
> npx killer-skills add avivsinai/langfuse-mcp

Agent Capability Analysis

The langfuse MCP Server by avivsinai is an open-source Categories.community integration for Claude and other AI agents, enabling seamless task automation and capability expansion.

Ideal Agent Persona

Ideal for AI Agents requiring advanced observability and debugging capabilities, such as those built with Claude Code or AutoGPT, to leverage Langfuse trace data

Core Value

Empowers agents to query Langfuse trace data for enhanced debugging and observability, utilizing the Model Context Protocol (MCP) server and enabling features like exception tracking and performance analysis through APIs and datasets

Capabilities Granted for langfuse MCP Server

Debugging AI system failures
Analyzing performance bottlenecks
Tracking exceptions in Langfuse traces
Evaluating dataset quality

! Prerequisites & Limits

  • Requires Langfuse API credentials
  • Dependent on MCP server installation
  • Self-hosted instances need custom LANGFUSE_HOST setup
Project
SKILL.md
4.9 KB
.cursorrules
1.2 KB
package.json
240 B
Ready
UTF-8

# Tags

[No tags]
SKILL.md
Readonly

Langfuse Skill

Debug your AI systems through Langfuse observability.

Triggers: langfuse, traces, debug AI, find exceptions, set up langfuse, what went wrong, why is it slow, datasets, evaluation sets

Setup

Step 1: Get credentials from https://cloud.langfuse.com → Settings → API Keys

If self-hosted, use your instance URL for LANGFUSE_HOST and create keys there.

Step 2: Install MCP (pick one):

bash
1# Claude Code (project-scoped, shared via .mcp.json) 2claude mcp add \ 3 --scope project \ 4 --env LANGFUSE_PUBLIC_KEY=pk-... \ 5 --env LANGFUSE_SECRET_KEY=sk-... \ 6 --env LANGFUSE_HOST=https://cloud.langfuse.com \ 7 langfuse -- uvx --python 3.11 langfuse-mcp 8 9# Codex CLI (user-scoped, stored in ~/.codex/config.toml) 10codex mcp add langfuse \ 11 --env LANGFUSE_PUBLIC_KEY=pk-... \ 12 --env LANGFUSE_SECRET_KEY=sk-... \ 13 --env LANGFUSE_HOST=https://cloud.langfuse.com \ 14 -- uvx --python 3.11 langfuse-mcp

Step 3: Restart CLI, verify with /mcp (Claude) or codex mcp list (Codex)

Step 4: Test: fetch_traces(age=60)

Read-Only Mode

For safer observability without risk of modifying prompts or datasets, enable read-only mode:

bash
1# CLI flag 2langfuse-mcp --read-only 3 4# Or environment variable 5LANGFUSE_MCP_READ_ONLY=true

This disables write tools: create_text_prompt, create_chat_prompt, update_prompt_labels, create_dataset, create_dataset_item, delete_dataset_item.

For manual .mcp.json setup or troubleshooting, see references/setup.md.


Playbooks

"Where are the errors?"

find_exceptions(age=1440, group_by="file")

→ Shows error counts by file. Pick the worst offender.

find_exceptions_in_file(filepath="src/ai/chat.py", age=1440)

→ Lists specific exceptions. Grab a trace_id.

get_exception_details(trace_id="...")

→ Full stacktrace and context.


"What happened in this interaction?"

fetch_traces(age=60, user_id="...")

→ Find the trace. Note the trace_id.

If you don't know the user_id, start with:

fetch_traces(age=60)
fetch_trace(trace_id="...", include_observations=true)

→ See all LLM calls in the trace.

fetch_observation(observation_id="...")

→ Inspect a specific generation's input/output.


"Why is it slow?"

fetch_observations(age=60, type="GENERATION")

→ Find recent LLM calls. Look for high latency.

fetch_observation(observation_id="...")

→ Check token counts, model, timing.


"What's this user experiencing?"

get_user_sessions(user_id="...", age=1440)

→ List their sessions.

get_session_details(session_id="...")

→ See all traces in the session.


"Manage datasets"

list_datasets()

→ See all datasets.

get_dataset(name="evaluation-set-v1")

→ Get dataset details.

list_dataset_items(dataset_name="evaluation-set-v1", page=1, limit=10)

→ Browse items in the dataset.

create_dataset(name="qa-test-cases", description="QA evaluation set")

→ Create a new dataset.

create_dataset_item(
  dataset_name="qa-test-cases",
  input={"question": "What is 2+2?"},
  expected_output={"answer": "4"}
)

→ Add test cases.

create_dataset_item(
  dataset_name="qa-test-cases",
  item_id="item_123",
  input={"question": "What is 3+3?"},
  expected_output={"answer": "6"}
)

→ Upsert: updates existing item by id or creates if missing.


"Manage prompts"

list_prompts()

→ See all prompts with labels.

get_prompt(name="...", label="production")

→ Fetch current production version.

create_text_prompt(name="...", prompt="...", labels=["staging"])

→ Create new version in staging.

update_prompt_labels(name="...", version=N, labels=["production"])

→ Promote to production. (Rollback = re-apply label to older version)


Quick Reference

TaskTool
List tracesfetch_traces(age=N)
Get trace detailsfetch_trace(trace_id="...", include_observations=true)
List LLM callsfetch_observations(age=N, type="GENERATION")
Get observationfetch_observation(observation_id="...")
Error countget_error_count(age=N)
Find exceptionsfind_exceptions(age=N, group_by="file")
List sessionsfetch_sessions(age=N)
User sessionsget_user_sessions(user_id="...", age=N)
List promptslist_prompts()
Get promptget_prompt(name="...", label="production")
List datasetslist_datasets()
Get datasetget_dataset(name="...")
List dataset itemslist_dataset_items(dataset_name="...", limit=N)
Create/update dataset itemcreate_dataset_item(dataset_name="...", item_id="...")

age = minutes to look back (max 10080 = 7 days)


References

  • references/tool-reference.md — Full parameter docs, filter semantics, response schemas
  • references/setup.md — Manual setup, troubleshooting, advanced configuration

Related Skills

Looking for an alternative to langfuse or building a Categories.community AI Agent? Explore these related open-source MCP Servers.

View All

widget-generator

Logo of f
f

widget-generator is an open-source AI agent skill for creating widget plugins that are injected into prompt feeds on prompts.chat. It supports two rendering modes: standard prompt widgets using default PromptCard styling and custom render widgets built as full React components.

149.6k
0
Design

chat-sdk

Logo of lobehub
lobehub

chat-sdk is a unified TypeScript SDK for building chat bots across multiple platforms, providing a single interface for deploying bot logic.

73.0k
0
Communication

zustand

Logo of lobehub
lobehub

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level — enabling multi-agent collaboration, effortless agent team design, and introducing agents as the unit of work interaction.

72.8k
0
Communication

data-fetching

Logo of lobehub
lobehub

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking agent harness to the next level — enabling multi-agent collaboration, effortless agent team design, and introducing agents as the unit of work interaction.

72.8k
0
Communication