51 - Code-intelligence tools: codedb, fff, CodeGraff, and alternatives

alternatives re-sweep added the same day. This is a targeted addendum requested after the main dossier: compare codedb, fff, the CodeGraff codedb article, and the CodeGraff product site; then check whether there are stronger alternatives for AI-agent code intelligence and token savings.

TL;DR

  • Yes, this tool class can save tokens, but only when it replaces blind grep/read loops with precise path, symbol, caller, dependency, or function-scope retrieval. It does not magically shrink model reasoning, output, or cache rent.
  • codedb is the strongest candidate for token savings because it returns structured code context: symbols, outlines, callers, dependency graph, compact reads, and codedb_context that combines several lookup steps. Its published token numbers are vendor-side and response-size based, not yet independently reproduced here.
  • fff is primarily a resident file/content search accelerator. It likely saves tokens by reducing dead-end searches and wrong-file reads, but the upstream text publishes latency and qualitative token claims, not a numeric token percentage.
  • CodeGraff is a larger agent/toolchain bet, not just a retrieval tool. The useful token-saving ideas are scope reads, symbol-safe patching, batching, and codedb-backed retrieval. The commercial/agent stack should be evaluated as a role-level opt-in, not silently installed on the host.
  • The strongest alternatives are workload-dependent. For local open-source semantic navigation, Serena is the most serious codedb alternative. For commercial context quality/cost claims, Augment Context Engine is stronger. For enterprise multi-repo code intelligence, Sourcegraph MCP is stronger. For open-source semantic RAG with a published token-reduction claim, Code Context Engine and Claude Context are the clearest candidates.
  • For jackin': pilot inside a role container, measure locally, and keep host effects explicit. The existing the-architect roadmap already proposes fff; codedb deserves an adjacent A/B pilot, and Serena/Claude Context deserve competitor arms if MCP schema overhead is deferred or bounded.

What is being compared

| URL | Entity | Category | Honest scope |
| --- | --- | --- | --- |
| https://github.com/justrach/codedb | codedb | Open source code-intelligence server and MCP toolset | Local structural index, MCP/HTTP/CLI, remote public-repo queries |
| https://github.com/dmtrKovalenko/fff | fff / fff-mcp | Open source resident file and content search toolkit | Fast path search and grep with frecency/git metadata |
| https://codegraff.com/blog/codedb-code-intelligence | CodeGraff article on codedb | Evidence and design explainer | Vendor write-up with latency, byte/token, and workflow examples |
| https://codegraff.com/ | CodeGraff / Graff / Pro tools | Terminal coding agent plus optional local file-tool suite and model gateway | Bigger workflow replacement; not a drop-in search primitive |

CodeGraff and codedb are related: CodeGraff says Graff is powered by codedb, while codedb's repository links back to the CodeGraff article. Treat codedb as the retrieval engine and CodeGraff as the broader agent/product surface around it.

Comparison matrix

| Axis | codedb | fff | CodeGraff |
| --- | --- | --- | --- |
| Primary job | Code intelligence for agents: tree, outline, symbols, callers, deps, search, compact reads, snapshots, remote repo queries | Fast resident file-name and content search | Full terminal coding agent plus optional Pro local file primitives (muonry, zigrep, zigread, zigpatch) |
| Agent interface | 21 MCP tools, HTTP server, CLI, npx launcher | MCP tools (ffgrep, fffind, fff-multi-grep), SDKs, Neovim plugin | graff CLI/TUI/SDK/gateway, MCP management, Pro MCP tools |
| Data model | In-memory structural indexes: outlines, word index, trigram search, dependency graph, content cache, change log | Warm file tree/content index, frecency DB, git status annotations, typo/fuzzy matching | Agent loop plus codedb retrieval and local daemon file tools |
| Best use | "Where is this symbol?", "what calls this?", "what depends on this?", "give context for this task", "read only this range/compact view" | "Find likely files/matches fast; avoid repeated rg process startup and bad rankings" | Replace whole-file reads and line-fragile edits with function/symbol-scope reads and patches |
| Token-savings evidence | Explicit vendor numbers: e.g. README claims structured search results around tens of tokens vs large raw grep dumps; benchmarks page claims 4x fewer bytes in a full edit workflow | Text says fewer grep roundtrips and token-efficient; no numeric token percentage in README text. Latency and memory claims are specific | Site claims "40x leaner", aggregate "tokens saved", structural read 47 vs 2,103 tokens, and 9x byte reduction in article workflow |
| Evidence tier for tokens | T3/T4: specific public numbers, vendor-interested, not locally replicated here | T4: plausible, but numeric token effect unpublished in text | T4: product counters and examples, vendor-interested, broad workflow confounders |
| Quality risk | Stale/wrong project root, unbounded tree/snapshot dumps, over-trusting fuzzy or structural approximation, telemetry/cache host writes | Fuzzy false positives, overusing it for semantic questions better answered by LSP/ast-grep/codedb | Larger adoption surface, paid/proprietary pieces, possible agent displacement, gateway/provider changes |
| jackin' fit | Good candidate for a role-scoped MCP/CLI pilot, with host-write guardrails | Already mapped in the-architect roadmap as resident file-search MCP | Evaluate only as an explicit alternative agent role/toolchain, not as default core behavior |

Alternatives found in the internet re-sweep

This section answers the follow-up question directly: are codedb and fff the best tools for this use case? Not universally.

| Alternative | Category | Better than codedb/fff when... | Worse or riskier when... | Token verdict |
| --- | --- | --- | --- | --- |
| Serena (oraios/serena) | Open-source semantic coding agent toolkit built around language servers and MCP | You need local, language-aware symbol navigation/editing through existing LSPs. It is more semantic than fff and overlaps codedb's strongest local use case. | You need codedb's remote public-repo queries, repo snapshots, or one bundled codebase context composer; quality depends on language-server coverage and project setup. | Likely token-positive for symbol tasks; no local measurement yet. Best local open-source competitor to codedb. |
| Code Context Engine (elara-labs/code-context-engine) | Local open-source MCP with AST-aware chunks, hybrid vector/BM25 retrieval, graph expansion, compression, and session memory | You want a local tool whose primary pitch is measurable input-token reduction and one index shared across Claude Code, Codex, Cursor, Gemini CLI, and others. | The headline 94% benchmark is against full-file reads; the docs explicitly say real savings against normal Claude Code should be lower. Newer/smaller project than Serena or Sourcegraph. | Strongest open-source numeric token-saving claim found, but must be treated as T4/T3 until reproduced against the native harness baseline. |
| Augment Context Engine | Commercial/hosted codebase context engine and agent integration | You want the strongest published commercial claims for codebase-context quality and cost reduction, and cloud/vendor dependency is acceptable. | You need open, local-only, reproducible mechanics inside a jackin' role; model gateway/context engine effects are hard to separate. | Vendor claims include large quality gains and lower token bills. Treat as T4 until independently measured. |
| Sourcegraph MCP / Cody context | Enterprise code search, precise code navigation, SCIP-backed code intelligence, multi-repo context | You need cross-repo impact analysis, exact definitions/references across large organizations, or existing Sourcegraph infrastructure. | You only need a single local repo; setup and service dependency are heavier than codedb/Serena/fff. | Strong for reducing wrong-file exploration at enterprise scale; token % not publicly established for this exact use case. |
| Qodo Context Engine MCP | Commercial/enterprise deep-research codebase context engine exposed over MCP | You need multi-repo organization-aware answers, architectural impact analysis, and repository/docs knowledge through a managed context engine. | Heavier than local retrieval tools; agentic reasoning can add its own token cost, and token-saving claims are not the main public evidence. | More about correctness and broad impact analysis than raw token reduction; evaluate alongside Sourcegraph/Augment, not against fff. |
| Claude Context (zilliztech/claude-context) | Open-source semantic code search MCP / RAG layer | You want vector+keyword semantic retrieval with a published benchmark claiming token reduction, and structural symbol graph is less important. | You need exact callers/dependencies or edits by symbol; embedding retrieval can return plausible but wrong context. | Public benchmark claims 39.4% token reduction on a code-search task set; needs local reproduction. |
| CodeGraphContext (CGC) | Open-source code knowledge graph / MCP | You want a graph-first repo memory layer and are willing to evaluate a newer system against codedb. | Maturity, setup cost, and token evidence are less clear; overlap with codedb is large. | Interesting codedb competitor, but not enough public evidence to call it better. |
| rust-analyzer / language-server MCPs | LSP-backed definitions, references, hover/type tools | You need exact language semantics. For Rust specifically, rust-analyzer is the quality floor, not optional. | You need broad repo retrieval, natural-language search, or cross-language summaries. | Negative-cost when a definition/reference call replaces grep + multiple reads; schema/tooling overhead must be bounded. |
| ast-grep / Semgrep MCP | Structural pattern search and rewrite | You need syntax-pattern matches or safe mechanical refactors (unwrap, derive blocks, call shapes). | You need semantic references/call graph or fuzzy natural-language retrieval. | Strong for targeted structural tasks; not a general replacement. Full verdict + skill/MCP/CLI form-factor analysis: ast-grep verdict below. |
| aider repo-map / repomix / RepoPrompt-style packers | Context packaging and repo maps | You need a compact up-front orientation artifact for a small/medium repo. | You are optimizing token spend aggressively; packers can front-load context instead of avoiding reads. | Useful baseline/control, not "significantly better" for token saving than bounded retrieval. |

Ranking by use case

| Use case | Best current candidate | Why |
| --- | --- | --- |
| Local open-source semantic navigation for agents | Serena, then codedb | Serena leans on language-server semantics; codedb adds a broader indexed tool suite and remote-repo functions. Both beat fff for symbol/caller work. |
| Local file/path/content search speed | fff | Narrow but very good fit: warm resident search with frecency/git metadata. |
| Local task-shaped code context with many MCP primitives | codedb | Best bundled open-source package among the original set: outline/symbol/callers/deps/read/context. |
| Local open-source token-savings benchmark | Code Context Engine, then Claude Context | CCE publishes the largest number but against a generous full-file baseline; Claude Context publishes a smaller semantic-search reduction. Both need local reproduction. |
| Commercial best-effort "give the agent the right context" | Augment Context Engine | Strongest vendor-side context-quality and cost claims, but not local/open enough to treat as jackin' default. |
| Enterprise multi-repo impact analysis | Sourcegraph MCP or Qodo Context Engine MCP | Sourcegraph is the precise code-search/code-intelligence infrastructure answer; Qodo is the agentic deep-research/org-knowledge answer. |
| Semantic RAG with an explicit token-saving benchmark | Claude Context | Clearest open-source numeric token-reduction claim found in the sweep, though it is not structural code intelligence. |
| Rust-specific correctness | rust-analyzer + ast-grep, optionally with codedb/Serena | For Rust, LSP definitions/references and structural patterns are more trustworthy than fuzzy search. |

Practical answer: codedb and fff are not "the best two" as a pair. A stronger local jackin' stack would be:

rust-analyzer / ast-grep for exact semantic + structural facts
+ codedb or Serena for task-shaped code intelligence
+ fff for fast file/content search
+ native file-read/edit tools for final spans and patches

If commercial/cloud tooling is acceptable, Augment, Sourcegraph, and Qodo are the serious challengers to test. If the requirement is open-source token reduction with an explicit benchmark, add Code Context Engine and Claude Context to the A/B.

Token economics

The relevant equation is:

net_saved =
 avoided failed searches
+ avoided wrong-file reads
+ avoided whole-file reads
+ avoided follow-up calls
- MCP schema/tool-description rent
- oversized indexed-tool outputs
- index/setup prompts and stale-index recovery

These tools are valuable when the first four terms dominate. They lose when the agent simply adds their output on top of normal rg, cat, and file reads.
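
As a minimal sketch, the equation can be evaluated per task once each term has been measured. The class name, field names, and every number below are hypothetical; they only illustrate the sign of the result in the two situations described above.

```python
from dataclasses import dataclass


@dataclass
class RetrievalLedger:
    """Per-task token counts for each term of the net_saved equation."""
    avoided_failed_searches: int
    avoided_wrong_file_reads: int
    avoided_whole_file_reads: int
    avoided_followup_calls: int
    schema_rent: int          # MCP schema/tool-description tokens
    oversized_outputs: int    # tokens from unbounded indexed-tool output
    setup_and_recovery: int   # index/setup prompts, stale-index recovery

    def net_saved(self) -> int:
        gains = (self.avoided_failed_searches + self.avoided_wrong_file_reads
                 + self.avoided_whole_file_reads + self.avoided_followup_calls)
        costs = self.schema_rent + self.oversized_outputs + self.setup_and_recovery
        return gains - costs


# Hypothetical numbers, for illustration only.
disciplined = RetrievalLedger(600, 2_000, 4_000, 400, 1_500, 0, 200)
additive = RetrievalLedger(0, 0, 0, 0, 1_500, 3_000, 200)
print(disciplined.net_saved())  # 5300
print(additive.net_saved())     # -4700
```

The second ledger models an agent that keeps its normal rg/cat/read loop and merely stacks indexed-tool output on top: nothing is avoided, so schema rent and oversized outputs make the net negative.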

codedb verdict

Likely token-positive when used correctly; not yet proven locally.

codedb's upstream claims are stronger than fff's on token economics. The README describes a context engine for agents and lists MCP tools for tree, outline, symbol, search, word lookup, callers, dependency graph, compact reads, changes, status, snapshots, local projects, remote public repos, and a codedb_context composer. It also publishes token-efficiency examples where structured results are orders of magnitude smaller than raw grep output, plus a benchmark page claiming a full edit workflow drops from roughly 50 KB to roughly 12 KB.

The practical mechanism is sound:

  • codedb_word or codedb_symbol can replace broad grep for exact identifiers.
  • codedb_callers can replace "grep symbol, read candidates, infer scope".
  • codedb_deps can replace ad hoc import greps.
  • codedb_outline can orient on a file before reading the whole file.
  • codedb_read with line ranges or compact mode can avoid full-file dumps.
  • codedb_context can collapse 3-5 serial location calls into one task-shaped response when the query is broad enough.

Measured locally: the honest size of the lever. tools/count_tokens.py on three real repo files (18,613 tokens if read whole):

| File | Lines | Read whole (tok) | Outline (tok) | One symbol-search (tok) | Outline cut | Search cut |
| --- | --- | --- | --- | --- | --- | --- |
| mount_info.rs | 289 | 4,404 | 323 | 82 | 93% | 98% |
| dialog_widgets.rs | 415 | 5,598 | 189 | 89 | 97% | 98% |
| update.rs | 647 | 8,611 | 1,141 | 209 | 87% | 98% |
| Total | | 18,613 | 1,653 | 380 | 91% | 98% |

An outline (signatures + line numbers, what codedb_outline returns) costs 91% fewer tokens than reading the file; a targeted symbol-search result (what codedb_search / ffgrep return) costs 98% fewer. This is T1, locally reproduced. It is why the lever is real, and why the vendor multipliers (1,628× / 40×) are per-query best cases against a whole-file-dump baseline that a disciplined agent already avoids.
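
The cut percentages in the local measurement can be rechecked from its raw token counts. The sketch below hard-codes the reported numbers; it does not run tools/count_tokens.py, and the helper name cut_percent is illustrative.

```python
# (file, read-whole tokens, outline tokens, symbol-search tokens), copied from the measurement
ROWS = [
    ("mount_info.rs", 4_404, 323, 82),
    ("dialog_widgets.rs", 5_598, 189, 89),
    ("update.rs", 8_611, 1_141, 209),
]


def cut_percent(whole: int, reduced: int) -> int:
    """Percentage of tokens avoided, rounded to the nearest whole percent."""
    return round(100 * (1 - reduced / whole))


# Column sums: read-whole, outline, symbol-search.
totals = [sum(col) for col in zip(*(row[1:] for row in ROWS))]

for name, whole, outline, search in ROWS + [("Total", *totals)]:
    print(f"{name:18} outline cut {cut_percent(whole, outline)}%  "
          f"search cut {cut_percent(whole, search)}%")
```

The totals come out at 18,613 / 1,653 / 380 tokens, which gives the 91% and 98% cuts quoted above.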

The caveat is output discipline. codedb_tree, codedb_snapshot, and remote tree queries can be large. The CodeGraff hooks lab explicitly shows a guard for unbounded codedb_remote action=tree calls. That is the right instinct: code intelligence saves tokens only if calls return bounded, task-shaped context.

Local adoption verdict: Add codedb to the validation harness as an A/B arm against native rg/Read and against fff. Do not count vendor token multipliers as banked savings until measured on jackin' tasks.

fff verdict

High-confidence latency win; token savings are plausible but unquantified.

fff is narrower than codedb. It is a resident Rust search core exposed through MCP, SDKs, and Neovim. It keeps a file/content index warm, adds git/frecency metadata, supports typo/fuzzy fallback, and exposes agent-facing tools such as ffgrep and fffind. The README's strongest concrete data is latency and memory: on very large repos, repeated warm queries are positioned as sub-10 ms instead of repeated multi-second ripgrep process spawns, with about 26 MB resident memory on a 14k-file repo and roughly 360 bytes per indexed file for the content index.

Token savings can happen in three ways:

  • fewer zero-result grep calls because fuzzy fallback finds likely variants;
  • fewer wrong-file reads because frecency/git status rank active files higher;
  • smaller result sets because weak-match detection prevents fuzzy noise from flooding the context.

But the upstream text does not state a numeric token percentage in its prose. The prior dossier therefore correctly kept fff at T1 for latency and T4 for tokens. That verdict still stands after the re-check.

Local adoption verdict: Keep the existing the-architect fff pilot, but require equal target-file hit rate and measured tool-result-token reduction before treating fff as a token-saving lever.

CodeGraff verdict

Potentially token-positive as a workflow replacement; too broad to treat as a single retrieval optimization.

CodeGraff's product page says Graff is a terminal coding agent powered by codedb. Its Pro surface adds local file tools that return enclosing functions, structural outlines, symbol reads, symbol-safe patches, and batch operations through a persistent daemon. Those ideas attack a real waste pattern in coding agents: reading whole files, grepping without scope, editing by fragile line numbers, then re-reading entire files to verify.

The promising primitives are:

  • "scope mode": return the enclosing function/block, not a naked match line;
  • structural read: outline first, then symbol/function body instead of whole file;
  • patch by symbol: reduce line drift and verification reads;
  • batch operations: amortize per-call overhead and keep intermediate plumbing out of the chat transcript;
  • codedb as a retrieval layer before action.
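
The "scope mode" idea can be illustrated with Python's standard ast module: given a match line, return the enclosing function instead of the bare line. This is only a sketch of the technique for Python sources. It is not how CodeGraff or codedb implement it, and the function name and sample source are hypothetical.

```python
import ast
from typing import Optional

# Hypothetical sample file used only for this illustration.
SOURCE = '''\
import os

def load(path):
    data = open(path).read()
    return data.strip()

def save(path, text):
    with open(path, "w") as fh:
        fh.write(text)
'''


def enclosing_function(source: str, line: int) -> Optional[str]:
    """Return the source of the innermost function containing `line`, or None."""
    best = None
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.lineno <= line <= node.end_lineno:
                # A nested function starts later than its parent, so the
                # largest starting line among the candidates is the innermost.
                if best is None or node.lineno >= best.lineno:
                    best = node
    if best is None:
        return None
    lines = source.splitlines()
    return "\n".join(lines[best.lineno - 1 : best.end_lineno])


# A match on line 9 (fh.write) returns all of save(), and nothing from load().
print(enclosing_function(SOURCE, 9))
```

Returning the whole function costs more than one line but usually removes the follow-up read that a naked match line would trigger.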

The claims are vendor-side and confounded by the full agent loop. The product site mentions aggregate tokens saved, per-op token counts, and last-30-day counters, but those are not independently auditable from the page. The codedb article's 9x byte workflow example is more concrete but still a selected vendor example.

Local adoption verdict: Do not fold CodeGraff into jackin' core. If useful, model it as an explicit role/toolchain experiment. Extract the technique pattern first: function-scope read, bounded search, diff-returning edits, and batch tools.

ast-grep verdict (structural search as an agent skill)

ast-grep appears in the recommended local stack above and in the alternatives table, but it was never given a full verdict. This section closes that gap (added 2026-06-18, prompted by the operator's request to research the official ast-grep/agent-skill package).

What it is. ast-grep (ast-grep/ast-grep, 14,617★, MIT, created 2022 — a mature, monthly-released project; gh api 2026-06-18) is a Rust CLI for structural code search, lint, and rewrite: it matches tree-sitter AST patterns ($X.unwrap(), async fn without error handling, a specific call shape) and can rewrite them as codemods across thousands of files, deterministically, never matching inside comments or string literals. There are three ways an agent reaches it:

| Form | What it is | Standing token cost | Trade-off |
| --- | --- | --- | --- |
| CLI (ast-grep on PATH) | The binary, called from a shell | Zero schema rent; cross-runtime | The agent must know the flags / be told to use it |
| Agent skill (ast-grep/agent-skill, 750★, Claude-only, created 2025-11; last pushed 2026-01, ~5 months stale) | A Claude Code Skill teaching a verify-before-search loop (write rule → test on examples → search) | Near-zero: only the skill's frontmatter is advertised; full SKILL.md loads on trigger | Claude "cannot automatically detect when to use ast-grep" (the repo's own admission), so the user must name it; simple prompting "requires a model with up-to-date knowledge of ast-grep" |
| MCP (ast-grep/ast-grep-mcp, 423★, explicitly experimental) | 4 tools: dump_syntax_tree, test_match_code_rule, find_code, find_code_by_rule | Schema rent every turn, used or not, plus uv/MCP setup | Deterministic structured rule-refinement loop; no reliance on the model's trained ast-grep knowledge |

Token economics — real mechanism, evidence by category not by ast-grep. The saving is the same lever as codedb's symbol search: a structural match returns only the nodes that match, with file:line, so the agent skips the grep → read-whole-file → confirm-context loop that burns input tokens. The mechanism is sound and directionally well-supported, but there is no ast-grep-specific A/B token benchmark anywhere. The headline numbers that circulate — ~32× (AddObservation 31,530 → 972 tokens), "3–50× fewer tokens per response", an "average 6×" output compression — all come from independent analysts measuring graph/symbol tools (Gortex, LSP-backed) or synthesizing the CoREB/SWEzze papers, not from ast-grep itself. Tellingly, ast-grep's own materials (ast-grep.github.io, the AI-rules blog) make zero token claims — they argue reliability of AI-generated rules via test-before-search, not economics. Evidence tier: T4 for any ast-grep-specific token percentage (none published); T2/T3 for the category mechanism (structural retrieval beats grep+read — the same lever this file reproduces locally at −98% for symbol search). The honest read: ast-grep saves tokens for the same structural reason codedb does, and the magnitude is plausibly in the same band, but no one has measured ast-grep specifically — treat the 32×/6× figures as cousins' numbers, not its own.
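
To make the mechanism concrete, the sketch below contrasts a lexical match with a structural one. It uses Python's standard ast module on Python source as a stand-in: it is not ast-grep and does not use ast-grep's rule syntax. The sample source and function names are illustrative.

```python
import ast

# Hypothetical sample: "unwrap" appears in a comment, a string, and two real calls.
SOURCE = '''
# call unwrap() later
note = "x.unwrap() is risky"
value = result.unwrap()
other = wrapper.inner.unwrap()
'''


def lexical_hits(source: str, needle: str) -> list:
    """Line numbers where the raw text contains `needle` (grep-style)."""
    return [i for i, line in enumerate(source.splitlines(), 1) if needle in line]


def structural_hits(source: str, method: str) -> list:
    """Line numbers of real `<expr>.method(...)` call nodes in the syntax tree."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == method
    )


print(lexical_hits(SOURCE, "unwrap"))     # [2, 3, 4, 5]
print(structural_hits(SOURCE, "unwrap"))  # [4, 5]
```

The lexical search returns the comment and the string literal as well, which is the noise an agent would otherwise pay to read and discard.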

Niche — and the hard limit. ast-grep owns the structural modality in the lexical → structural → graph escalation: it is the deterministic middle that needs no language server, ideal for syntax-shape queries and especially codemods/rewrites. It cannot resolve names across files — it cannot tell which parseConfig a call refers to. Cross-file symbol resolution is LSP / rust-analyzer / Serena territory (the graph modality); fuzzy/conceptual queries are embedding/semantic territory. So ast-grep is a complement to, not a replacement for, codedb's caller/dependency graph and rust-analyzer's compiler truth — exactly the routing the recommended stack and the architect roadmap already encode.
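
The lexical → structural → graph escalation can be written down as a small routing table. The enum and function names are hypothetical; the tool assignments only restate the routing described in this file and are not a tested policy.

```python
from enum import Enum


class Query(Enum):
    LITERAL_TEXT = "literal text or string"
    FILE_DISCOVERY = "file or path discovery"
    SYNTAX_SHAPE = "syntax pattern or codemod"
    CROSS_FILE_SYMBOL = "definition, reference, caller, dependency"
    CONCEPTUAL = "fuzzy or conceptual question"


# Tool class per query kind, as described in the surrounding text.
ROUTES = {
    Query.LITERAL_TEXT: "rg",
    Query.FILE_DISCOVERY: "fff",
    Query.SYNTAX_SHAPE: "ast-grep",
    Query.CROSS_FILE_SYMBOL: "rust-analyzer / Serena / codedb",
    Query.CONCEPTUAL: "embedding/semantic search",
}


def route(kind: Query) -> str:
    """Return the tool class that should answer a query of this kind."""
    return ROUTES[kind]


print(route(Query.SYNTAX_SHAPE))       # ast-grep
print(route(Query.CROSS_FILE_SYMBOL))  # rust-analyzer / Serena / codedb
```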

jackin' verdict. For a multi-runtime role, the CLI form on PATH plus a shared guidance file is the right default — zero schema rent, every runtime reaches it, no Claude-only lock-in — which is precisely what the the-architect pilot chose (the Claude-only skill is deliberately not in that pilot, and the experimental MCP's per-turn rent is not justified when the CLI is free). The skill is worth knowing as the lowest-effort Claude-only on-ramp, but its explicit-invocation gating and five-month staleness make it a weak default. Decision record and routing: docs/content/docs/reference/roadmap/architect-code-intelligence-tooling.mdx.

General rules for all three

  1. Teach the agent the retrieval contract. The instruction should be explicit: use indexed tools to locate paths/symbols first, then read the smallest exact span needed. Do not dump full trees, full snapshots, or whole files unless the task requires them.
  2. Bound every broad query. Cap result counts, prefer prefixes, and ask for path:line plus symbol names over raw line dumps.
  3. Use the right tool class. Plain text and literal strings can stay with rg. File discovery can use fff. Symbol/caller/dependency questions should use codedb, rust-analyzer, or ast-grep where available.
  4. Keep MCP schema overhead under control. If the client supports tool search or schema deferral, use it. If not, consider CLI/HTTP wrappers for rarely-used tools and expose only the hot retrieval calls over MCP.
  5. Smoke-test freshness at session start. Run a status/index-ready check before relying on results, especially after large code generation or checkout changes.
  6. Make host effects explicit. The upstream installers may write ~/.codedb, ~/.claude.json, ~/.codex/config.toml, or other client config. In jackin', install and register inside the role container unless the operator explicitly opts into host changes.
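
Rule 2 can be sketched as a small formatter that caps the result count and emits path:line plus symbol name instead of raw line dumps. The Match record and bounded_summary helper are hypothetical; real tools expose their own limit parameters, which should be preferred where they exist.

```python
from dataclasses import dataclass


@dataclass
class Match:
    path: str
    line: int
    symbol: str


def bounded_summary(matches: list, limit: int = 10) -> str:
    """Format at most `limit` matches as `path:line symbol`, noting any overflow."""
    lines = [f"{m.path}:{m.line} {m.symbol}" for m in matches[:limit]]
    hidden = len(matches) - limit
    if hidden > 0:
        # Tell the agent to narrow the query instead of paging through everything.
        lines.append(f"... {hidden} more matches omitted; narrow the query")
    return "\n".join(lines)


# Hypothetical matches for illustration.
matches = [Match("src/update.rs", n, "apply_update") for n in range(1, 13)]
print(bounded_summary(matches, limit=10))
```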

codedb setup

Recommended agent instruction:

Use codedb for code navigation before broad text search:
- Start unfamiliar tasks with `codedb_context` when the question spans multiple files.
- Use `codedb_symbol` or `codedb_word` for exact identifiers.
- Use `codedb_callers` before refactoring or changing public behavior.
- Use `codedb_deps` for import/dependency impact.
- Use `codedb_outline` before reading a large file.
- Prefer bounded `codedb_read` ranges or compact reads; avoid full snapshots and
 unbounded trees unless explicitly needed.
- Prefer the client's native edit tool; `codedb_edit` is fallback only.

Setup shape:

  • Install inside the agent environment, not the host, for jackin' roles.
  • Disable telemetry if the environment requires it: CODEDB_NO_TELEMETRY=1.
  • Register MCP at user scope inside the container, for example: claude mcp add codedb -s user -- /usr/local/bin/codedb mcp or codex mcp add codedb -- /usr/local/bin/codedb mcp.
  • Verify with codedb --version and codedb status or codedb_status.
  • Ensure root resolution points at the mounted workspace, not ~ or a system directory. Pass the project argument when a client does not supply MCP roots.
  • Add a guard hook for remote tree calls: require expand=false, a prefix, or a limit before allowing large codedb_remote action=tree responses.
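
A minimal sketch of the guard logic for remote tree calls follows. The tool name, the action=tree argument, and the expand/prefix/limit conditions come from the text above; the function name and the shape of the call (a tool name plus an argument dict) are assumptions, and a real hook would need to match whatever payload and tool naming the client actually passes.

```python
def allow_remote_tree(tool_name: str, args: dict) -> bool:
    """Return True if the call may proceed, False if it should be blocked."""
    if tool_name != "codedb_remote" or args.get("action") != "tree":
        return True  # the guard only targets remote tree calls
    return (
        args.get("expand") is False        # collapsed tree requested
        or bool(args.get("prefix"))        # restricted to a sub-path
        or args.get("limit") is not None   # explicit result cap
    )


print(allow_remote_tree("codedb_remote", {"action": "tree"}))                    # False
print(allow_remote_tree("codedb_remote", {"action": "tree", "prefix": "src/"}))  # True
```

Blocking with a short message that names the three accepted bounds lets the agent retry with a bounded call instead of abandoning the tool.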

fff setup

Recommended agent instruction:

For file-name search and grep in the current git-indexed project, use fff before
falling back to repeated shell `rg` calls. Use `fffind` for paths and `ffgrep` for
content. Keep result sets small, refine weak/fuzzy matches, then read exact file
spans with the normal file-read tool.

Setup shape:

  • Install fff-mcp inside the role image or setup hook.
  • Register at user scope in the container: claude mcp add -s user fff -- fff-mcp.
  • Keep the existing the-architect pilot requirement: fff must measurably beat native ripgrep on this repo or be dropped.
  • Do not use fff as a semantic engine. It finds likely files and lines; it does not replace rust-analyzer, ast-grep, or codedb's caller/dependency graph.

CodeGraff setup

Recommended agent instruction, if evaluating the full toolchain:

Use CodeGraff/Graff local file tools to avoid whole-file reads:
- Search in scope mode when changing a function or method.
- Outline before reading a large file.
- Read by symbol/function when possible.
- Patch by symbol when line drift is likely.
- Batch related reads/searches/diffs when the daemon supports it.

Setup shape:

  • Treat CodeGraff as an explicit agent/toolchain role, not as a transparent dependency of jackin' core.
  • Install only in the container or on an operator-approved host path.
  • Separate the free/open graff/codedb path from paid Pro tooling and the CodeGraff model gateway. Measure each independently.
  • If using the gateway, keep model-routing/cache effects separate from local file-tool savings; otherwise the token analysis becomes impossible to attribute.

jackin' adoption recommendation

The existing roadmap page docs/content/docs/reference/roadmap/architect-code-intelligence-tooling.mdx already has the right shape for fff: role-scoped, opt-in, no jackin-core change, registered in the container's user scope, and accepted only if it beats ripgrep net of MCP overhead.

Extend that experiment rather than generalizing immediately:

  1. Keep the planned fff A/B.
  2. Add a codedb A/B arm if the role can carry another MCP server without always-on schema cost.
  3. Do not add CodeGraff Pro by default. If evaluated, make it a separate "agent stack replacement" arm.
  4. Use the same task suite and metrics for all arms.
  5. Promote only primitives that show equal-or-better target-file hit rate and lower total tokens per solved task.

Validation harness

Run 20-30 fixed repository-navigation tasks in at least four arms, expanding to the alternatives when they are installable in the test environment:

| Arm | Tools allowed |
| --- | --- |
| Native | Shell rg, find, normal file reads/edits |
| fff | fff MCP plus normal file reads/edits |
| codedb | codedb MCP plus normal file reads/edits |
| Serena | Serena MCP plus normal file reads/edits |
| Code Context Engine | CCE MCP plus normal file reads/edits |
| Claude Context | Claude Context MCP plus normal file reads/edits |
| Sourcegraph | Sourcegraph MCP or Sourcegraph local/enterprise context tools |
| Qodo | Qodo Context Engine MCP, if explicitly approved |
| Augment | Augment Context Engine / Augment agent context, if explicitly approved |
| CodeGraff | Graff/Pro local file tools, if explicitly installed |

Task categories:

  • exact symbol definition lookup;
  • fuzzy remembered phrase or path;
  • caller/reference discovery;
  • reverse dependency/impact analysis;
  • large-file function edit;
  • wrong-query/zero-result recovery;
  • recently modified file discovery;
  • public dependency or remote repo lookup, if testing codedb remote.

Metrics:

  • target-file hit rate;
  • top-1 target hit rate;
  • turns to first correct file;
  • tool calls;
  • tool-result tokens;
  • total input/output/cache tokens from the session ledger;
  • wall-clock latency;
  • edit correctness and test result;
  • stale-index or wrong-root incidents;
  • MCP schema tokens loaded at turn start, if the client exposes them.
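
Several of these metrics can be aggregated per arm from simple per-task records. The sketch below assumes a hypothetical TaskRun record; the field names are illustrative and not taken from any session-ledger format. Tokens per solved task divides all tokens spent in the arm, including failed runs, by the number of solved tasks, so failures are charged to the arm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRun:
    arm: str
    target_rank: Optional[int]  # 1-based position of the target file among files opened; None if never found
    solved: bool
    total_tokens: int


def summarize(runs: list) -> dict:
    """Aggregate hit rates and tokens per solved task for one arm."""
    n = len(runs)
    solved = sum(r.solved for r in runs)
    spent = sum(r.total_tokens for r in runs)
    return {
        "hit_rate": sum(r.target_rank is not None for r in runs) / n,
        "top1_hit_rate": sum(r.target_rank == 1 for r in runs) / n,
        "tokens_per_solved_task": spent / solved if solved else float("inf"),
    }


# Hypothetical runs for illustration.
runs = [
    TaskRun("codedb", 1, True, 10_000),
    TaskRun("codedb", 3, True, 20_000),
    TaskRun("codedb", None, False, 30_000),
    TaskRun("codedb", 1, False, 20_000),
]
print(summarize(runs))
```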

Acceptance rule:

Accept a tool for token optimization only if:
 target-file hit rate >= native
 edit/test success >= native
 total tokens per solved task at least 20-30% below native
 no unbounded output path remains

Latency alone is not enough. A tool can be much faster and still neutral or negative on tokens if the agent reads the same files afterward.
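
The acceptance rule can be sketched as a predicate over per-arm summaries. The ArmSummary record and the min_cut default of 0.20 (the lower end of the 20-30% range) are assumptions for illustration, and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArmSummary:
    hit_rate: float                # target-file hit rate, 0..1
    success_rate: float            # edit/test success rate, 0..1
    tokens_per_solved_task: float
    unbounded_output_paths: int    # known ways to get an unbounded response


def accept(arm: ArmSummary, native: ArmSummary, min_cut: float = 0.20) -> bool:
    """Apply the acceptance rule: quality at least equal, tokens clearly lower, no unbounded path."""
    token_cut = 1 - arm.tokens_per_solved_task / native.tokens_per_solved_task
    return (
        arm.hit_rate >= native.hit_rate
        and arm.success_rate >= native.success_rate
        and token_cut >= min_cut
        and arm.unbounded_output_paths == 0
    )


# Hypothetical numbers: a faster arm that barely changes tokens, and a leaner one.
native = ArmSummary(0.80, 0.70, 40_000, 0)
fast_only = ArmSummary(0.85, 0.70, 39_000, 0)
leaner = ArmSummary(0.85, 0.75, 28_000, 0)
print(accept(fast_only, native))  # False
print(accept(leaner, native))     # True
```

The fast_only arm has a better hit rate but only a 2.5% token cut, so it fails the rule regardless of latency.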

Failure modes and guardrails

  • Wrong project root: status looks ready, but it indexed ~ or another repo. Guard: status check plus explicit project root.
  • Unbounded tree/snapshot: the index dumps more than native tools would. Guard: hooks or instructions requiring limit, prefix, compact mode, or line ranges.
  • Schema rent: 20+ MCP tools are loaded into every turn without deferral. Guard: tool-search/schema-deferral, CLI fallback, or narrower MCP exposure.
  • Fuzzy confidence error: search returns plausible but wrong files. Guard: target-file benchmark, weak-match refinement, require exact verification before editing.
  • Stale index after edits: agent trusts old symbol/caller data. Guard: status/changes checks and re-run query after large rewrites.
  • Host mutation: installers auto-register client configs. Guard: container-only install or explicit operator opt-in surfaced in launch summary.
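
The wrong-project-root guard can be sketched as a path check. It assumes the tool's status check reports the indexed root as a path; the function name and example paths are hypothetical.

```python
from pathlib import Path


def root_is_safe(index_root: str, workspace: str, home: str) -> bool:
    """True only if the indexed root is the workspace or lies inside it, and is not the home directory."""
    root = Path(index_root).resolve()
    ws = Path(workspace).resolve()
    if root == Path(home).resolve():
        return False  # indexing ~ is the failure mode being guarded against
    return root == ws or ws in root.parents


print(root_is_safe("/workspace/crates/core", "/workspace", "/home/agent"))  # True
print(root_is_safe("/home/agent", "/workspace", "/home/agent"))             # False
```

Running such a check once at session start, next to the status/index-ready check, catches the case where status looks ready but points at the wrong tree.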

Bottom line

For AI-agent work, these tools are best understood as retrieval precision and observation-shaping tools, not compression tools. They save tokens when they help the agent look at fewer, better spans of code.

  • codedb: strongest candidate from the original set; pilot it against native search with a strict bounded-output policy.
  • fff: keep as the low-risk resident search pilot; expect latency wins first, token wins only if wrong-file/dead-end calls drop.
  • CodeGraff: valuable ideas, larger adoption blast radius; evaluate as an explicit role or workflow replacement, not as a hidden dependency.
  • Serena: best local open-source semantic-navigation alternative found in the re-sweep; add it to the A/B if language-server setup is acceptable.
  • Code Context Engine: strongest local open-source token-savings claim found; benchmark against native Claude/Codex behavior before trusting the 94% headline.
  • Augment / Sourcegraph / Qodo: stronger commercial or enterprise answers for broad codebase context, but too heavy/vendor-bound for a default jackin' role.
  • Claude Context: best open-source alternative found with a public numeric token-reduction claim; evaluate for semantic-RAG tasks, not exact refactors.

Vector databases (Qdrant) — a separate question, answered in file 52

fff (lexical) + codedb (structural) cover the two retrieval paradigms that move code tokens. A vector database (semantic) is a third paradigm and a different job: it is not a token-saving layer for code navigation — neutral-to-loss once chunk over-fetch, index staleness, and the embedding-model cost are counted, and the flagship agents (Claude Code, Cline, Aider, Sourcegraph) dropped code embeddings for exactly those reasons. It earns its place only as a separate semantic layer for non-code knowledge (docs/tickets/decisions) and cross-session memory — and even there it is a wall-clock / tool-loop speedup more than a guaranteed token cut.

The full treatment — Qdrant vs alternatives (Milvus, LanceDB, pgvector, Turbopuffer, …); the benchmark evidence (no engine decisively beats Qdrant, "best" is workload-dependent); the docs-corpus case; token economics; a jackin' container setup; and a runnable validation protocol — is in 52-qdrant-and-vector-databases.md.

