# DSPy Expert

## Operating principles
- Default to simple DSPy programs: one clear `Signature`, one/few modules, tight eval loop.
- Prefer measurable iteration: dataset → metric → compile/optimize → evaluate → inspect failures → repeat.
- Assume local OpenAI-compatible endpoints by default (Ollama/vLLM/LM Studio). If not available, adapt.
- If an API detail is uncertain, check the installed DSPy version and confirm via docs/examples before coding.
## Quick workflow (copy/paste checklist)
1) Frame the task
- What is the input? What is the output?
- What are 10–200 representative examples (or how do we generate them)?
- What metric defines “good”? (exact match, F1, rubric judge, retrieval hit-rate, latency, cost)
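For example, a token-level F1 metric for short answers fits the `metric(example, pred)` shape in plain Python (a sketch; the function name is illustrative):

```python
from collections import Counter

def token_f1(gold: str, pred: str) -> float:
    """Token-level F1 between a gold answer and a prediction."""
    gold_toks = gold.lower().split()
    pred_toks = pred.lower().split()
    if not gold_toks or not pred_toks:
        # Both empty counts as a match; one empty counts as a miss.
        return float(gold_toks == pred_toks)
    overlap = sum((Counter(gold_toks) & Counter(pred_toks)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

Exact match, rubric judges, and hit-rate metrics slot into the same `-> float` shape, which keeps the eval harness uniform.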
2) Create a minimal baseline
- Define a `Signature` with the smallest useful fields.
- Implement the simplest module that can work (often `Predict`, `ChainOfThought`, or a tiny custom `Module`).
- Add deterministic pre/post-processing outside DSPy when helpful (parsing, normalization, schema validation).
3) Build an evaluation harness
- Create `train`/`dev`/`test` splits (even if small).
- Implement `metric(example, pred) -> float/bool`.
- Run the baseline; save failure cases (inputs + model outputs + expected).
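The harness itself can be plain Python; the example records and `metric` shape below are illustrative:

```python
import random

def split(examples, seed=0, dev_frac=0.2, test_frac=0.2):
    """Shuffle once with a fixed seed, carve off test and dev; the rest is train."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = max(1, int(n * test_frac))
    n_dev = max(1, int(n * dev_frac))
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test

def evaluate(program, examples, metric):
    """Run program over examples; return mean score and the saved failure cases."""
    failures, total = [], 0.0
    for ex in examples:
        pred = program(ex["input"])
        score = metric(ex, pred)
        total += score
        if score < 1.0:
            failures.append({"input": ex["input"], "pred": pred, "expected": ex["expected"]})
    return total / max(1, len(examples)), failures
```

Persisting `failures` (e.g., to JSONL) after every run is what makes step 5's error classification cheap.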
4) Compile/optimize
- Choose one optimizer/teleprompter and a small search budget first.
- Compile on `train`, select by `dev`, report final on `test`.
- Keep prompt/program changes attributable (one change at a time; log configs and seeds when possible).
5) Debug systematically
- Classify errors: schema/formatting, missing context, wrong reasoning, hallucination, retrieval, tool failures.
- Add constraints: structured outputs, validation + retry, better instructions, or tighter signatures.
- Only scale complexity (multi-stage, RAG, tools) after the baseline is measurable.
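The "validation + retry" constraint can be a small wrapper around any LM call (a sketch; `generate` stands in for whatever callable produces raw text):

```python
import json

def generate_with_retry(generate, prompt, validate, max_tries=3):
    """Call `generate`, parse and validate the output, retry with feedback on failure."""
    last_error = None
    for _ in range(max_tries):
        attempt_prompt = prompt if last_error is None else (
            f"{prompt}\n(Previous output was invalid: {last_error})"
        )
        raw = generate(attempt_prompt)
        try:
            parsed = json.loads(raw)
            validate(parsed)  # raises ValueError on schema violations
            return parsed
        except (json.JSONDecodeError, ValueError) as err:
            last_error = str(err)
    raise ValueError(f"No valid output after {max_tries} tries: {last_error}")
```

Feeding the previous error back into the prompt is cheap and often fixes schema/formatting failures without touching the program itself.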
## Local LLM defaults (OpenAI-compatible)
Use a local OpenAI-compatible base URL when available. Prefer configuring via environment variables or a single “LM factory” in code.
Minimal pattern (adjust to your DSPy version):
```python
import os
import dspy

# Example OpenAI-compatible local endpoint (adjust as needed)
os.environ.setdefault("OPENAI_API_BASE", "http://localhost:11434/v1")
os.environ.setdefault("OPENAI_API_KEY", "ollama")  # placeholder for local gateways

# Model name depends on your gateway (e.g., "llama3.1", "qwen2.5", etc.)
lm = dspy.LM(model=os.environ.get("DSPY_MODEL", "qwen3:latest"))
dspy.settings.configure(lm=lm)
```
If the repo already has a working local-LLM helper, reuse it instead of re-inventing configuration.
## DSPy patterns (keep it simple)

### Classification / extraction
- Use a `Signature` with explicit output fields (and constraints like allowed labels).
- Add lightweight normalization (strip, lowercase, JSON parsing) and validate outputs.
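The normalization step can be a few lines of plain Python (a sketch; the label set and fallback are illustrative):

```python
ALLOWED = {"positive", "negative", "neutral"}

def normalize_label(raw: str, allowed=ALLOWED, default="neutral"):
    """Strip/lowercase a raw model output and map it onto an allowed label."""
    label = raw.strip().strip('."\'').lower()
    if label in allowed:
        return label
    # Fall back to substring matching (e.g., "The sentiment is positive.")
    for candidate in allowed:
        if candidate in label:
            return candidate
    return default
```

Whether to use a default or raise on unmatched labels depends on the metric: a hard failure is better when you want those cases surfaced in the failure log.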
### RAG (retrieval-augmented generation)
- Start with: retrieve top-k → single generate step referencing retrieved passages.
- Evaluate retrieval separately (recall@k) vs generation quality.
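Recall@k for the retrieval stage is simple to compute on its own (a sketch; doc IDs are illustrative):

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant docs that appear in the top-k retrieved list."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)
```

If recall@k is low, fix retrieval (chunking, embeddings, k) before touching the generation prompt; a generator cannot cite passages it never sees.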
### Tool use
- Keep tool schema strict (inputs/outputs), validate tool results, and handle retries/timeouts.
- Prefer separating: “decide tool call” → “execute” → “final answer”.
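The decide → execute → answer separation can be sketched as plain functions around a strict schema (tool names, fields, and the retry policy are illustrative):

```python
TOOLS = {
    "search": {"required": {"query"}},
    "calculator": {"required": {"expression"}},
}

def validate_call(call: dict) -> dict:
    """Reject unknown tools and missing arguments before executing anything."""
    spec = TOOLS.get(call.get("tool"))
    if spec is None:
        raise ValueError(f"unknown tool: {call.get('tool')!r}")
    missing = spec["required"] - set(call.get("args", {}))
    if missing:
        raise ValueError(f"missing args: {sorted(missing)}")
    return call

def execute(call: dict, registry: dict, retries: int = 2):
    """Run a validated call, retrying only on transient (timeout) failures."""
    call = validate_call(call)
    last = None
    for _ in range(retries + 1):
        try:
            return registry[call["tool"]](**call["args"])
        except TimeoutError as err:  # transient; retry
            last = err
    raise last
```

Because validation happens before execution, a malformed tool decision surfaces as a clean `ValueError` the "final answer" stage can recover from, rather than a half-executed side effect.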
## When asked to “learn DSPy and build X”
Follow this order:
- Inspect the repo’s current DSPy usage (existing modules, eval scripts, LM config).
- Identify the installed DSPy version (from `pyproject.toml`, a lockfile, or import behavior).
- Build the smallest working baseline and an eval harness.
- Only then introduce compilation/optimization and extra components (retrieval, tools, multi-step).