OAK
Prefer MCP tools (oak_search, oak_context, oak_remember, oak_resolve_memory) over CLI when available. Fall back to CLI or direct SQL for queries not covered by MCP tools. If .oak/ci/activities.db does not exist, inform the user to run oak-dev team start first.
Quick Start
# Search knowledge — memories, plans, and sessions (default)
oak_search(query="authentication refactor decision")
# Narrow by category
oak_search(query="form validation logic", search_type="code")
oak_search(query="backup directory decision", search_type="memory")
oak_search(query="migration plan", search_type="plans")
# Impact analysis — get context for a specific file
oak_context(task="impact of changes to executor", files=["src/features/agent_runtime/executor.py"])
# Store an observation
oak_remember(observation="The backup dir uses a 3-tier priority", memory_type="decision")
# Mark a resolved gotcha
oak_resolve_memory(id="<uuid from oak_search>")
CLI fallback
bash
1# Semantic search
2oak-dev ci search "retry with exponential backoff" --type code
3
4# Browse memories by type
5oak-dev ci memories --type decision
6
7# Get impact context for a specific file
8oak-dev ci context "impact of changes" -f src/services/auth.py
Direct SQL (for aggregations, history, custom queries)
bash
1# Recent sessions
2sqlite3 -readonly -header -column .oak/ci/activities.db \
3 "SELECT id, agent, title, status, datetime(created_at_epoch, 'unixepoch', 'localtime') as started FROM sessions ORDER BY created_at_epoch DESC LIMIT 5;"
4
5# Query anything
6sqlite3 -readonly -header -column .oak/ci/activities.db "SELECT count(*) FROM sessions;"
Commands Reference
CLI commands
| Command | Purpose |
|---|
oak-dev ci search "query" --type code | Semantic vector search for code |
oak-dev ci search "query" --type memory | Semantic search for memories |
oak-dev ci search "query" -n 20 | Broader search with more results |
oak-dev ci context "task" -f <file> | Get context for current work |
oak-dev ci remember "observation" | Store a memory (NOT via SQL) |
oak-dev ci memories --type gotcha | Browse memories by type |
oak-dev ci memories --status active | Browse memories by lifecycle status |
oak-dev ci resolve <id> | Mark observation as resolved |
oak-dev ci resolve --session <id> | Bulk-resolve all observations from a session |
oak-dev ci sessions | List session summaries |
oak-dev team status | Check daemon status |
| MCP Tool | CLI Equivalent | Purpose |
|---|
oak_search | oak-dev ci search "query" | Semantic vector search |
oak_remember | oak-dev ci remember "observation" | Store a memory |
oak_context | oak-dev ci context "task" | Get task-relevant context |
oak_resolve_memory | oak-dev ci resolve <uuid> | Mark observation resolved/superseded (UUID from oak_search) |
Direct SQL
bash
1sqlite3 -readonly -header -column .oak/ci/activities.db "YOUR QUERY HERE"
When to Use What
| Need | Tool | Example |
|---|
| Find similar implementations | oak-dev ci search --type code | "retry with exponential backoff" |
| Understand component relationships | oak-dev ci context | "how auth middleware relates to session handling" |
| Assess refactoring risk | oak-dev ci search --type code -n 20 | "PaymentProcessor error handling" |
| Find past decisions/gotchas | oak-dev ci search --type memory | "gotchas with auth changes" |
| Recall previous discussions | sqlite3 -readonly | SELECT title, summary FROM sessions WHERE ... |
| Find what was done before | oak-dev ci memories / sqlite3 | "what did we decide about caching?" |
| Query session history | sqlite3 -readonly | SELECT * FROM sessions ORDER BY ... |
| Aggregate usage stats | sqlite3 -readonly | SELECT agent_name, sum(cost_usd) FROM agent_runs ... |
| Resolve stale observations | oak-dev ci resolve | After completing work that addresses a gotcha |
| Find unresolved planning items | sqlite3 -readonly | SELECT ... WHERE status='active' AND session_origin_type='planning' |
| Run automated analysis | oak-dev ci agent run | oak-dev ci agent run usage-report |
Why Semantic Search Over Grep
| Grep | Semantic Search |
|---|
| Finds "UserService" literally | Finds code about user management regardless of naming |
| Misses synonyms (auth vs authentication) | Understands concepts are related |
| Can't find "conceptually similar" code | Groups code by purpose, not text |
| No relevance ranking | Returns most relevant first |
Core Tables Overview
<!-- BEGIN GENERATED CORE TABLES -->
| Table | Purpose | Key Columns |
|---|
memory_observations | Extracted memories/learnings | observation, memory_type, status, context, tags, importance, session_origin_type |
sessions | Coding sessions (launch to exit) | id, agent, status, summary, title, title_manually_edited, started_at, created_at_epoch |
prompt_batches | User prompts within sessions | session_id, user_prompt, classification, response_summary |
activities | Raw tool executions | session_id, tool_name, file_path, success, error_message |
agent_runs | CI agent executions | agent_name, task, status, result, cost_usd, turns_used |
session_link_events | Session linking analytics | session_id, event_type, old_parent_id, new_parent_id |
session_relationships | Semantic session relationships | session_a_id, session_b_id, relationship_type, similarity_score |
agent_schedules | Cron scheduling state | task_name, cron_expression, enabled, additional_prompt, last_run_at, next_run_at |
resolution_events | Cross-machine resolution propagation | observation_id, action, source_machine_id, applied, content_hash |
governance_audit_events | Audit trail for governance actions | session_id, agent, tool_name, action, rule_id, enforcement_mode, created_at |
team_outbox | Outbound sync queue for team relay | event_type, payload, source_machine_id, content_hash, status, created_at |
team_pull_cursor | Inbound sync cursor per relay | server_url, cursor_value, updated_at |
team_sync_state | Team relay sync metadata | key, value, updated_at |
team_reconcile_state | Per-machine reconciliation tracking | machine_id, last_reconcile_at, last_hash_count, last_missing_count |
<!-- END GENERATED CORE TABLES -->
Memory Types
The memory_type column in memory_observations uses these values:
gotcha — Non-obvious behavior or quirk
bug_fix — Solution to a bug with root cause
decision — Architectural/design decision with rationale
discovery — General insight about the codebase
trade_off — Trade-off that was made and why
session_summary — LLM-generated session summary
Observation Status
The status column tracks lifecycle state:
active — Current and relevant (default for all new observations)
resolved — Issue was addressed in a later session
superseded — Replaced by a newer, more accurate observation
Resolving Observations
When oak_search or oak_context surfaces a gotcha, bug_fix, or discovery that you then address during your session, resolve it so future sessions don't see stale guidance:
- Note the observation UUID from the
oak_search results (e.g., "id": "8430042a-1b01-4c86-8026-6ede46cd93d9").
- After completing the fix or addressing the issue, call:
- MCP:
oak_resolve_memory(id="8430042a-1b01-4c86-8026-6ede46cd93d9")
- CLI:
oak-dev ci resolve 8430042a-1b01-4c86-8026-6ede46cd93d9
- For superseded observations (replaced by a better one), use
status="superseded".
When to resolve:
- You fixed a bug that was tracked as a
bug_fix observation
- You addressed a
gotcha (e.g., refactored the problematic code)
- A
discovery about a problem is no longer accurate after your changes
- A
decision was reversed or replaced by a new decision
When NOT to resolve:
- The observation is still accurate even after your changes
- You only partially addressed the issue
- The observation is a permanent architectural insight (e.g., "service X uses eventual consistency")
Session Origin Types
The session_origin_type column classifies how the session that created the observation operated:
planning — Planning-phase session (high read:edit ratio, few modifications)
investigation — Exploration/debugging session (many reads, minimal edits)
implementation — Active coding session (significant file modifications)
mixed — Combined activity patterns
Planning/investigation observations are automatically capped at importance 5.
Essential Queries
Recent Sessions
sql
1SELECT id, agent, title, status,
2 datetime(created_at_epoch, 'unixepoch', 'localtime') as started,
3 prompt_count, tool_count
4FROM sessions
5ORDER BY created_at_epoch DESC
6LIMIT 10;
What Files Were Touched in a Session
sql
1SELECT DISTINCT file_path, tool_name, count(*) as times
2FROM activities
3WHERE session_id = 'SESSION_ID' AND file_path IS NOT NULL
4GROUP BY file_path, tool_name
5ORDER BY times DESC;
Recent Memories
sql
1SELECT memory_type, substr(observation, 1, 150) as observation,
2 context,
3 datetime(created_at_epoch, 'unixepoch', 'localtime') as created
4FROM memory_observations
5ORDER BY created_at_epoch DESC
6LIMIT 20;
Agent Run History
sql
1SELECT agent_name, task, status, turns_used,
2 printf('$%.4f', cost_usd) as cost,
3 datetime(created_at_epoch, 'unixepoch', 'localtime') as created
4FROM agent_runs
5ORDER BY created_at_epoch DESC
6LIMIT 10;
Full-Text Search on Memories
sql
1SELECT m.memory_type, m.observation, m.context
2FROM memory_observations m
3JOIN memories_fts fts ON m.rowid = fts.rowid
4WHERE memories_fts MATCH 'authentication'
5ORDER BY rank
6LIMIT 10;
Scheduled Tasks
sql
1SELECT task_name, enabled, cron_expression, description,
2 datetime(last_run_at_epoch, 'unixepoch', 'localtime') as last_run,
3 datetime(next_run_at_epoch, 'unixepoch', 'localtime') as next_run
4FROM agent_schedules
5ORDER BY next_run_at_epoch;
Observation Lifecycle Status
sql
1SELECT status, count(*) as count
2FROM memory_observations
3GROUP BY status;
Active Observations from Planning Sessions
sql
1SELECT substr(observation, 1, 120) as observation, memory_type,
2 context, session_origin_type,
3 datetime(created_at_epoch, 'unixepoch', 'localtime') as created
4FROM memory_observations
5WHERE status = 'active' AND session_origin_type = 'planning'
6ORDER BY created_at_epoch DESC
7LIMIT 20;
Resolution Provenance (what resolved what)
sql
1SELECT m.id, substr(m.observation, 1, 100) as observation,
2 m.resolved_by_session_id, s.title as resolving_session,
3 m.resolved_at
4FROM memory_observations m
5LEFT JOIN sessions s ON m.resolved_by_session_id = s.id
6WHERE m.status = 'resolved'
7ORDER BY m.resolved_at DESC
8LIMIT 10;
Important Notes
- Always use
-readonly flag with sqlite3 to prevent accidental writes
- The database uses WAL mode — safe to read while the daemon is writing
- Epoch timestamps are Unix seconds — use
datetime(col, 'unixepoch', 'localtime') to format
- FTS5 tables (
activities_fts, memories_fts) use MATCH syntax, not LIKE
- JSON columns (
tool_input, files_affected, files_created) can be queried with json_extract()
- Database location:
.oak/ci/activities.db
Automated Analysis
For automated analysis that runs queries and produces reports:
bash
1oak-dev ci agent run usage-report # Cost and token usage trends
2oak-dev ci agent run productivity-report # Session quality and error rates
3oak-dev ci agent run codebase-activity-report # File hotspots and tool patterns
4oak-dev ci agent run prompt-analysis # Prompt quality and recommendations
Reports are written to oak/insights/ (git-tracked, team-shareable).
Deep Dives
Consult these reference documents when the task requires deeper detail:
references/finding-related-code.md — Consult when searching for semantically related code across files or discovering component relationships
references/impact-analysis.md — Consult before refactoring to assess blast radius and identify affected consumers
references/querying-databases.md — Consult when writing custom SQL queries beyond the essential queries above
references/schema.md — Consult for complete CREATE TABLE statements, indexes, and FTS5 table definitions (auto-generated)
references/queries.md — Consult for advanced joins, aggregations, window functions, and debugging queries
references/analysis-playbooks.md — Consult when running structured multi-query workflows for usage, productivity, or activity analysis