# Agentifind: Codebase Intelligence Setup
This skill sets up codebase intelligence by:
- Running agentifind CLI to extract code structure
- Detecting dynamic patterns that static analysis can't fully trace
- Synthesizing a navigation guide with staleness metadata
## Procedure

### Step 1: Check for existing guide (staleness detection)
If .claude/CODEBASE.md already exists, check if it's stale:
- Read the metadata header from CODEBASE.md:

  ```
  Source-Hash: {sha256 of codebase.json when guide was generated}
  Commit: {git commit when generated}
  Stats: {file count, function count, class count}
  ```

- Compare against current state:
  - Run `sha256sum .claude/codebase.json` (or equivalent)
  - Run `git rev-parse HEAD`
  - Read current stats from codebase.json
- If metadata matches: the guide is fresh. Ask the user if they want to regenerate anyway.
- If metadata differs or is missing: the guide is stale. Proceed with regeneration.
- If no CODEBASE.md exists: proceed with generation.
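The comparison above can be sketched in shell. This is a minimal illustration, not the skill's implementation: the header file and hash values below are stand-ins, and in practice the recorded hash would be grepped out of `.claude/CODEBASE.md` while the current one comes from `sha256sum .claude/codebase.json`.

```shell
# Sketch: compare the recorded Source-Hash against the current one.
guide=$(mktemp)
printf 'Source-Hash: %s\n' "deadbeef" > "$guide"   # stand-in for the CODEBASE.md header

# Parse the recorded hash from the metadata header line.
recorded=$(grep '^Source-Hash:' "$guide" | awk '{print $2}')
current="deadbeef"   # in practice: sha256sum .claude/codebase.json | cut -d' ' -f1

if [ "$recorded" = "$current" ]; then
  status="fresh"
else
  status="stale"
fi
echo "$status"
```

The same pattern extends to the `Commit:` and `Stats:` lines; any single mismatch is enough to treat the guide as stale.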
### Step 2: Detect repo type and install LSP (if needed)
Check if this is a Terraform/IaC repository:
```bash
# Check for .tf files
find . -name "*.tf" -type f | head -1
```
If Terraform files are found:
Check if terraform-ls is installed. If not, install it for better parsing accuracy:
```bash
# Check if terraform-ls exists
which terraform-ls || echo "NOT_INSTALLED"
```
If NOT_INSTALLED, install terraform-ls:
```bash
# macOS (Homebrew)
brew install hashicorp/tap/terraform-ls

# Or via Go (cross-platform)
go install github.com/hashicorp/terraform-ls@latest
```
**Why terraform-ls matters:**
- Proper HCL parsing (not regex)
- Accurate module resolution
- Cross-file reference tracking
- Provider schema awareness
If installation fails, agentifind will fall back to regex parsing (still functional but less accurate).
### Step 3: Run agentifind sync
Execute the CLI to extract code structure:
```bash
npx agentifind@latest sync
```
**Extraction method:**
- LSP first (if available): uses language servers for accurate cross-file resolution
  - Python: `pyright-langserver` (install: `npm i -g pyright`)
  - TypeScript: `tsserver` (bundled with TypeScript)
  - Terraform: `terraform-ls` (install: `brew install hashicorp/tap/terraform-ls`)
  - Note: LSP extraction can take 5-15 minutes on large codebases (building the reference graph)
- Regex/Tree-sitter fallback: fast parsing when LSP is unavailable (~30 seconds)
This creates .claude/codebase.json with:
- Module imports/exports
- Function and class definitions
- Call graph relationships (more accurate with LSP)
- Import dependencies
Options:
- `--skip-validate`: skip linting/type checks (faster)
- `--verbose`: show extraction method and progress
- `--if-stale`: only sync if source files changed
### Step 4: Update .gitignore
Add the generated files to .gitignore (if not already present):
```
# Agentifind generated files
.claude/codebase.json
.claude/CODEBASE.md
.claude/.agentifind-checksum
```
These files are:
- Regeneratable from source code
- Large (codebase.json can be several MB)
- Machine-specific (paths may differ)
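One way to add these entries so that re-running the step never creates duplicates is a `grep -qxF` guard before each append. A minimal sketch (the temp directory stands in for the repo root):

```shell
# Run in a throwaway directory for illustration; use the repo root in practice.
cd "$(mktemp -d)"
touch .gitignore

# Append each generated file only if it is not already listed verbatim.
for entry in ".claude/codebase.json" ".claude/CODEBASE.md" ".claude/.agentifind-checksum"; do
  grep -qxF "$entry" .gitignore || echo "$entry" >> .gitignore
done

# A second pass is a no-op: every entry is already present.
for entry in ".claude/codebase.json" ".claude/CODEBASE.md" ".claude/.agentifind-checksum"; do
  grep -qxF "$entry" .gitignore || echo "$entry" >> .gitignore
done

lines=$(wc -l < .gitignore)
echo "entries: $lines"
```

`-x` matches whole lines and `-F` disables regex interpretation, so paths containing dots are matched literally.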
### Step 5: Read and analyze extracted data
Read .claude/codebase.json and analyze:
- `stats`: file/function/class counts
- `modules`: per-file structure (imports, exports, classes, functions)
- `call_graph`: what functions call what
- `import_graph`: module dependencies
- `analysis_gaps`: gaps in the call graph (see Step 6)
- `validation`: lint/type issues (if present)
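As a sketch of what this step reads, here is a jq-free way to pull counts out of the `stats` object. The sample JSON below is fabricated to match the shape described above; with `jq` available, `jq .stats.files` would be cleaner.

```shell
# Fabricated sample matching the "stats" shape described above.
json=$(mktemp)
cat > "$json" <<'EOF'
{"stats": {"files": 120, "functions": 840, "classes": 95}}
EOF

# Extract numeric fields with grep -o: first isolate the key/value pair,
# then keep only the trailing digits.
files=$(grep -o '"files": *[0-9]*' "$json" | grep -o '[0-9]*$')
functions=$(grep -o '"functions": *[0-9]*' "$json" | grep -o '[0-9]*$')
echo "$files files, $functions functions"
```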
### Step 6: Review analysis gaps
The CLI automatically detects gaps in the call graph that may indicate dynamic patterns. Read analysis_gaps from codebase.json:
```json
{
  "analysis_gaps": {
    "uncalled_exports": [...],  // Exported functions with no callers
    "unused_imports": [...],    // Imports never referenced
    "orphan_modules": [...]     // Files never imported
  }
}
```
How to interpret gaps:

| Gap Type | What It Means | Likely Cause |
|----------|---------------|--------------|
| `uncalled_exports` | Exported function has no detected callers | Entry point, CLI command, API handler, test fixture, plugin hook, signal receiver, decorator-invoked |
| `unused_imports` | Import never referenced in code | Side-effect import, re-export, type-only import, dynamically accessed |
| `orphan_modules` | File never imported by anything | Entry point, script, config file, dynamically loaded plugin |
**Key insight:** If something is exported but never called, or imported but never used, static analysis cannot trace it. These are the areas where the call graph is incomplete.

No manual scanning is required - the CLI does this automatically by analyzing the call graph structure.
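A toy illustration of tallying gaps by type. The flattened one-line-per-gap format and the symbol names (`handle_webhook`, `cli_main`, `scripts/migrate.py`) are invented for the example; the real data is the JSON structure shown above.

```shell
# Invented flattened gap list: "<gap_type> <symbol or file>" per line.
gaps=$(mktemp)
cat > "$gaps" <<'EOF'
uncalled_exports handle_webhook
uncalled_exports cli_main
orphan_modules scripts/migrate.py
EOF

# Count occurrences of each gap type.
uncalled=$(grep -c '^uncalled_exports' "$gaps")
orphans=$(grep -c '^orphan_modules' "$gaps")
echo "uncalled_exports=$uncalled orphan_modules=$orphans"
```

These counts are what Step 9 later reports back to the user.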
### Step 7: Identify key components
From the data, determine:
- Entry points: Files with many importers (check import_graph reverse)
- Core modules: High export count, central in import graph
- Utilities: Imported by many, import few themselves
- Request flow: Trace call_graph from entry to output
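The "check import_graph reverse" idea above can be sketched as a ranking by importer count. The edge format and file names below are invented; the real data comes from `import_graph` in codebase.json.

```shell
# Invented import edges: "importer -> imported", one per line.
edges=$(mktemp)
cat > "$edges" <<'EOF'
app.py -> utils.py
cli.py -> utils.py
api.py -> utils.py
api.py -> models.py
EOF

# Count importers per module and rank descending; high counts
# suggest core modules or utilities.
top=$(awk -F' -> ' '{count[$2]++} END {for (m in count) print count[m], m}' "$edges" \
  | sort -rn | head -1)
echo "$top"
```

Here `utils.py` surfaces as the most-imported module, making it a candidate utility or core module.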
### Step 8: Write CODEBASE.md

First, check `repo_type` in codebase.json:
- If `repo_type` is "terraform" → use the Infrastructure Template below
- If `repo_type` is missing or other → use the Application Template below
#### Application Template (default)
Create .claude/CODEBASE.md with this structure:
```markdown
# Codebase Guide

<!-- STALENESS METADATA - DO NOT EDIT -->
<!--
Generated: {ISO 8601 timestamp}
Source-Hash: {sha256 of codebase.json}
Commit: {git commit hash}
Stats: {files} files, {functions} functions, {classes} classes
-->

## ⚠️ Usage Instructions

This guide provides STARTING POINTS, not absolute truth.

**Before acting on any location:**
1. Verify the file exists with a quick Read
2. Confirm the symbol/function is still there
3. If something seems wrong, the guide may be stale - regenerate with `/agentifind`

**This guide CANNOT see:**
- Runtime behavior (dynamic imports, plugins, DI)
- Configuration-driven logic
- Database queries and their relationships
- External API integrations

## Quick Reference

| Component | Location |
|-----------|----------|
| {name} | `{path}` → `{symbol}` |

## Architecture

### Module Dependencies
{Key relationships from import_graph - focus on core modules}

### Data Flow
{Trace from call_graph if clear pattern exists}

## Analysis Gaps (Potential Dynamic Patterns)

{If analysis_gaps has items, list them here grouped by type}

### Uncalled Exports
{List from analysis_gaps.uncalled_exports - these are likely entry points, API handlers, or dynamically invoked}

| Symbol | File | Reason |
|--------|------|--------|
| {name} | `{file}:{line}` | {reason} |

### Orphan Modules
{List from analysis_gaps.orphan_modules - these are likely entry points or dynamically loaded}

| File | Reason |
|------|--------|
| `{file}` | {reason} |

**What this means:**
- Call graph is incomplete for these symbols/files
- They may be invoked via plugins, signals, decorators, CLI, or configuration
- Always trace execution manually when working in these areas
- Don't assume the call graph shows all callers

{If no gaps found, write: "No analysis gaps detected. Call graph appears complete."}

## Conventions
{Infer from naming patterns, file organization, directory structure}

## Impact Map

| If You Change This | Dependent Files |
|--------------------|-----------------|
| `{high-dependency file}` | {N} dependent files |

## Known Issues
{From validation.linting/formatting/types if present, otherwise omit section}
```
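The staleness header fields at the top of the template can be produced with a few commands. This sketch hashes a throwaway file; in practice the input is `.claude/codebase.json` and the commit comes from `git rev-parse HEAD`.

```shell
# Stand-in source file; use .claude/codebase.json in practice.
src=$(mktemp)
echo '{"stats": {}}' > "$src"

# ISO 8601 UTC timestamp and content hash for the metadata header.
stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
hash=$(sha256sum "$src" | cut -d' ' -f1)

printf 'Generated: %s\nSource-Hash: %s\n' "$stamp" "$hash"
```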
#### Infrastructure Template

When `repo_type` is "terraform", create .claude/CODEBASE.md with this structure:
```markdown
# Infrastructure Guide

<!-- STALENESS METADATA - DO NOT EDIT -->
<!--
Generated: {ISO 8601 timestamp}
Source-Hash: {sha256 of codebase.json}
Commit: {git commit hash}
Stats: {files} files, {resources} resources, {modules} modules
-->

## ⚠️ Usage Instructions

This guide provides STARTING POINTS for infrastructure navigation.

**Before making changes:**
1. Verify the resource/module exists
2. Check the blast radius (what depends on this?)
3. Review variable dependencies
4. Consider state implications

**This guide CANNOT see:**
- Remote state data
- Dynamic values from data sources
- Provider-specific behaviors
- Secrets in tfvars files

## Infrastructure Overview

| Provider | Resources | Modules |
|----------|-----------|---------|
{For each provider in stats.providers, count resources}

## Module Structure

{List from modules array, show source and dependencies}

modules/
├── {module.name}/ → {module.source}
│   └── inputs: {list key variables}

## Resource Inventory

{Group resources by type from resources object}

### {Provider} Resources

| Type | Name | File | Dependencies |
|------|------|------|--------------|
| {type} | {name} | `{file}:{line}` | {dependencies.length} deps |

## Variable Flow

{List from variables array}

| Variable | Type | Used By | Default |
|----------|------|---------|---------|
| {name} | {type} | {used_by.length} resources | {default or "required"} |

## Blast Radius (High Risk)

{List from blast_radius where severity is "high" or "medium"}

⚠️ **Changing these resources affects many dependents:**

| Resource | Affected | Severity |
|----------|----------|----------|
| {target} | {affected_resources.length} resources | {severity} |

**Before modifying high-risk resources:**
- Run `terraform plan` to preview changes
- Consider using `terraform state mv` for refactoring
- Check if changes will force recreation

## Outputs

{List from outputs array}

| Output | Value | Referenced |
|--------|-------|------------|
| {name} | {value} | {references} |

## Dependency Graph

{Describe key relationships from dependency_graph}

Key dependencies:
- `{resource A}` → depends on → `{resource B}`
```
### Step 9: Confirm completion
For application repos, report:
- Files analyzed (from stats.files)
- Symbols extracted (from stats.functions + stats.classes)
- Extraction method used (LSP or tree-sitter)
- Key entry points identified
- Analysis gaps detected (count of uncalled_exports, orphan_modules)
- Any validation issues found
- Guide staleness metadata recorded
For Terraform/IaC repos, report:
- Files analyzed (from stats.files)
- Resources extracted (from stats.resources)
- Modules detected (from stats.modules)
- Providers used (from stats.providers)
- High-risk resources (count from blast_radius with severity "high")
- Variables defined vs used
- Guide staleness metadata recorded
### Step 10: Offer to update agent instructions
Check if CLAUDE.md or AGENTS.md exists in the project root.
Ask the user:
"Would you like me to add an instruction to your {CLAUDE.md/AGENTS.md} file so the agent automatically uses the CODEBASE.md for navigation?"
If user accepts:
Append this section to the file (or create AGENTS.md if neither exists):
```markdown
## Codebase Navigation

Before exploring the codebase, read `.claude/CODEBASE.md` for architecture overview, key files, and conventions. This file is auto-generated by agentifind and provides:
- Quick reference to key components
- Module dependencies and data flow
- Dynamic patterns that static analysis can't trace
- Coding conventions
- Impact map for changes

**Important:** The guide provides starting points. Always verify locations before making changes.
```
If both CLAUDE.md and AGENTS.md exist, update CLAUDE.md (takes precedence).
If user declines:
Respond with:
"No problem! If you change your mind, add this to your CLAUDE.md or AGENTS.md file:"
```markdown
## Codebase Navigation

Before exploring the codebase, read `.claude/CODEBASE.md` for architecture overview, key files, and conventions.
```
## Output

```
.claude/
├── codebase.json          # Structured extraction (CLI output)
├── CODEBASE.md            # Navigation guide (this skill's output)
└── .agentifind-checksum   # Staleness detection
```
## Notes
- Ground ALL claims in the extracted data - do not hallucinate relationships
- Keep the guide concise - focus on navigation over explanation
- Prioritize "what" and "where" over "why" and "how"
- If codebase.json already exists and is recent, skip Step 2
- LSP extraction is slower but more accurate for cross-file references
- Tree-sitter is faster but uses heuristic-based resolution
- Always include the staleness metadata header - it enables future freshness checks
- Always include the analysis gaps section - even if none found, document that the call graph is complete