I built this because AI coding assistants (Claude Code, Cursor, Codex) explore codebases by grepping through files one at a time. Five structural questions about a codebase consumed ~412,000 tokens via file-by-file search.
The same five questions via a knowledge graph query: ~3,400 tokens. That's a 120x reduction — and it's not about fitting in the context window. It's about cost ($3-15/M tokens adds up), latency (graph queries return in <1ms vs seconds of file reading), and accuracy (LLMs lose track of relevant details in large contexts — the "lost in the middle" problem).
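The token figures above work out roughly as follows (a quick back-of-envelope sketch; the per-million prices are just the post's illustrative $3-15/M range, not any specific model's pricing):

```python
# Cost of five structural questions, using the token counts from the post.
PRICES_PER_M = (3.0, 15.0)  # USD per million input tokens (illustrative range)

grep_tokens = 412_000   # file-by-file search
graph_tokens = 3_400    # knowledge graph query

for price in PRICES_PER_M:
    grep_cost = grep_tokens / 1_000_000 * price
    graph_cost = graph_tokens / 1_000_000 * price
    print(f"${price:.0f}/M: grep ≈ ${grep_cost:.2f}, graph ≈ ${graph_cost:.3f}")

print(f"reduction: {grep_tokens / graph_tokens:.0f}x")  # ≈ 121x
```

Per question set that is roughly $1.24-$6.18 for grep-style exploration versus about a cent for the graph query, and the gap compounds over a long agent session.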
It's a single Go binary: tree-sitter parses your code into a SQLite-backed knowledge graph. Functions, call chains, routes, cross-service HTTP links — all queryable via Cypher-like syntax or structured search. You say "Index this project" and then ask things like "what calls ProcessOrder?" or "find dead code."
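For illustration, a "what calls ProcessOrder?" lookup in a Cypher-like syntax might look something like this (a hypothetical sketch; the server's actual syntax, node labels, and edge types may differ):

```cypher
// Who calls ProcessOrder, directly or transitively (up to 3 hops)?
MATCH (caller:Function)-[:CALLS*1..3]->(f:Function {name: "ProcessOrder"})
RETURN caller.name, caller.file
```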
No Docker, no external databases, no API keys. 35 languages. Auto-syncs when you edit files. Benchmarked against real repos (78 to 49K nodes) including the Linux kernel (20K nodes, 67K edges, zero timeouts).
There are other code graph MCP servers — GitNexus being the biggest (7K+ stars, great visual UI). Key differences: we support 35 languages vs 8-11, ship as a single Go binary (no Node.js/npm), have no embedded LLM (your MCP client IS the intelligence layer — no extra API keys or per-query cost), auto-sync on file changes, and are the only one with published benchmarks across real repos.
GitNexus is great for visual exploration. We're building production tooling.
Happy to discuss architecture, benchmarks, or trade-offs.
Any accuracy measurements?
Mainly I compared it to the native explorer agents (for example, in Claude Code). So far it has won against them in 98 of 100 cases. A bigger benchmark is in the works, but I haven't had much time for it yet. You're welcome to test it out :)