Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep

TL;DR

Semble is a code search tool built for agents that claims about 98% fewer tokens than a grep+read workflow. It indexes a repository in roughly 250 ms on CPU and integrates with MCP-compatible agents without external services or GPUs.

Semble is a new code search library built specifically for agents. Its authors claim it uses 98% fewer tokens than a traditional grep+read workflow while returning accurate code snippets almost instantly.

Semble is designed to enable agents like Claude Code, Codex, and others to perform fast and precise code searches without relying on external services or GPU resources. It indexes repositories in approximately 250 milliseconds and responds to queries in about 1.5 milliseconds, all on CPU. Benchmarks indicate its retrieval quality is comparable to specialized transformer models, with an NDCG@10 score of 0.854.
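For readers unfamiliar with the metric, here is a minimal sketch of how NDCG@10 is commonly computed for a single query. It is not Semble's benchmark code. The relevance labels are hypothetical, and the ideal ordering is taken from the retrieved list only, which is a simplification.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k ranked results."""
    # Rank 0 is divided by log2(2) = 1, rank 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG divided by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical labels for one query's ranked results (1 = relevant, 0 = not).
ranked = [1, 0, 1, 0, 0]
score = ndcg_at_k(ranked)  # about 0.92: the second relevant hit sits at rank 3, not rank 2
```

A score of 1.0 means every relevant result is ranked ahead of every irrelevant one. A benchmark figure such as 0.854 would be an average of per-query scores like this one.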

It can be deployed as an MCP server or used via command-line interface, supporting local paths and git URLs. The library emphasizes token efficiency, returning only relevant code chunks, which results in around 98% fewer tokens used compared to traditional grep+read searches. This efficiency aims to reduce costs and improve performance, especially in large codebases.
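The arithmetic behind a figure like "98% fewer tokens" can be sketched as follows. This is an illustration, not Semble's measurement method. The function names are hypothetical, the 4-characters-per-token rule is a rough assumption rather than a real tokenizer, and the input sizes are made up.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token (assumption, not a real tokenizer)."""
    return max(1, len(text) // 4)

def token_savings(full_files, returned_chunks):
    """Fraction of tokens saved by sending only chunks instead of whole files."""
    baseline = sum(estimate_tokens(t) for t in full_files)
    retrieved = sum(estimate_tokens(t) for t in returned_chunks)
    if baseline == 0:
        raise ValueError("baseline must contain at least one file")
    return 1 - retrieved / baseline

# Made-up sizes: grep matches two files that the agent then reads in full,
# versus a search tool that returns one short relevant chunk.
files_read_after_grep = ["x" * 4000, "x" * 4000]   # about 2000 tokens in total
chunks_returned = ["x" * 160]                       # about 40 tokens
savings = token_savings(files_read_after_grep, chunks_returned)
print(f"{savings:.0%} fewer tokens")
```

The ratio depends entirely on how large the files are relative to the returned chunks, so the real saving will vary by repository and query.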

Why It Matters

This matters because it offers AI agents and developers a faster, cheaper way to search code. Fewer tokens per search means lower inference cost and quicker responses, which makes code search more practical for large projects or frequent queries.

Its compatibility with existing agent frameworks and local setup without external dependencies means it can be readily integrated into current workflows, potentially transforming how AI agents access and understand codebases.

Background

Traditional code search methods like grep lack context-awareness, so an agent usually has to read the surrounding files to make sense of each match. Transformer-based models are accurate but resource-intensive. Semble aims to bridge this gap by combining speed, accuracy, and efficiency. It follows recent trends toward token-efficient retrieval, with the team's benchmarks indicating performance on par with specialized models at a fraction of the size and cost.

Before Semble, most code search tools relied on external APIs, required GPU resources, or were slow on large repositories. Semble's local, CPU-based indexing and querying marks a notable shift toward lightweight, scalable solutions.

“Semble indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU, with 98% fewer tokens than grep+read.”

— Semble team

“Our benchmarks show an NDCG@10 of 0.854, comparable to code-specialized transformer models.”

— Benchmarks conducted by Semble team

What Remains Unclear

Details about long-term stability, scalability across extremely large codebases, and performance in diverse programming languages are still emerging. It is also unclear how Semble performs in real-world, noisy code repositories or under heavy concurrent usage.

What’s Next

Next steps include broader adoption, integration into more agent frameworks, and performance benchmarking in varied environments. The Semble team may release updates to improve features, scalability, and user experience based on early feedback.

Key Questions

How does Semble compare to traditional grep?

According to its authors, Semble uses approximately 98% fewer tokens than a grep+read workflow because it returns only the relevant code chunks instead of whole files. The benefit is expected to be largest for complex queries and large codebases.

Can Semble search remote repositories?

Yes, Semble can clone and index remote git repositories on demand, enabling searches across multiple remote sources.

Does Semble require external services or GPUs?

No, Semble runs entirely on CPU with no external API keys or GPU dependencies, making it easy to deploy locally.

Which agents or tools can integrate with Semble?

It supports integration with Claude Code, Codex, Cursor, OpenCode, and any MCP-compatible agent, via MCP server or command-line interface.

What are the limitations or uncertainties now?

Performance in very large or complex repositories, long-term stability, and handling of noisy code data are still being evaluated. Further testing is needed for broader adoption.
