TL;DR
Semble is a code search tool designed for agents that reduces token usage by approximately 98% compared to traditional grep-based methods. It offers rapid, accurate code retrieval on CPU without external dependencies. This development could significantly improve code search efficiency for AI agents and developers.
Semble, a new code search library optimized for agent environments, promises to cut token usage by around 98% compared with a grep-and-read workflow while maintaining high retrieval accuracy. It lets agents retrieve the code snippets they need faster, which could affect both AI development and developer workflows.
Semble is designed to enable agents like Claude, Codex, Cursor, and OpenCode to search codebases instantly without relying on external APIs, GPUs, or cloud services. It indexes repositories in approximately 250 milliseconds and responds to queries in about 1.5 milliseconds, all on CPU hardware.
The library achieves this by returning only the relevant code segments rather than whole files. That uses about 98% fewer tokens than a grep-then-read workflow, in which the agent greps for a match and then reads each matching file in full. Benchmarks indicate Semble’s retrieval quality is comparable to code-specialized transformer models, with an NDCG@10 score of 0.854.
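For readers unfamiliar with the metric, NDCG@10 scores how well the top ten results are ordered, on a scale from 0 to 1. The article does not say how the benchmark computes it, so the sketch below uses the common linear-gain formulation. The function names and relevance labels are illustrative, and for simplicity the ideal ordering is taken over the supplied list rather than the full corpus.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG divided by the DCG of the best possible ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical relevance labels for one query, in the order results were returned.
print(ndcg_at_k([3, 2, 0, 1]))
```

A score near 1 means the most relevant chunks appear first. A reported figure such as 0.854 would normally be an average over many queries.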
Semble can be integrated as a local server via MCP (Model Context Protocol), allowing seamless use with popular agents and tools. It supports searching local repositories or remote git URLs, with automatic re-indexing on file changes. Installation requires only the Python package ‘semble’ or the ‘uv’ tool, with no need for API keys or GPU resources.
Why It Matters
This development matters because it addresses a key bottleneck in AI-assisted coding: efficient, accurate code retrieval at low token costs. By drastically reducing token consumption, Semble enables faster, more cost-effective code searches, which can improve developer productivity and the performance of AI agents relying on code context. It also opens the door for more scalable and accessible code search solutions, especially for large codebases or resource-constrained environments.

Background
Agents that search with traditional tools such as grep typically follow each match by reading the entire file, which leads to high token usage and latency. Recent transformer-based models have improved retrieval accuracy, but at significant computational cost. Semble addresses the need for a lightweight, fast, and accurate code search tool that runs locally without external dependencies. Its release follows ongoing efforts to optimize AI tooling for developer workflows and codebase management.
“Semble returns only the relevant chunks, using ~98% fewer tokens than grep+read, while maintaining high retrieval accuracy.”
— Semble team
“It’s impressive how fast and token-efficient Semble is, especially since it runs entirely on CPU without external services.”
— Hacker News user
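The gap between grep+read and chunk-level retrieval can be illustrated with a toy comparison. This is a minimal sketch, not Semble's implementation or API. A plain substring match stands in for its retrieval, whitespace splitting stands in for a real tokenizer, and all function names and sample data are hypothetical.

```python
def grep_then_read(files, term):
    """Return the full text of every file that contains the term."""
    return [text for text in files.values() if term in text]

def matching_chunks(files, term, context=1):
    """Return only the lines around each match (a stand-in for chunk retrieval)."""
    chunks = []
    for text in files.values():
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if term in line:
                start, end = max(0, i - context), min(len(lines), i + context + 1)
                chunks.append("\n".join(lines[start:end]))
    return chunks

def count_tokens(texts):
    """Crude whitespace token count; real tokenizers will give different numbers."""
    return sum(len(t.split()) for t in texts)

def savings(files, term, context=1):
    """Fraction of tokens saved by returning chunks instead of whole files."""
    full = count_tokens(grep_then_read(files, term))
    small = count_tokens(matching_chunks(files, term, context))
    return 1 - small / full if full else 0.0

# Hypothetical repository: one 100-line file with a single relevant definition.
files = {
    "a.py": "\n".join(["x = 1"] * 50 + ["def parse_config(path):"] + ["x = 1"] * 49),
    "b.py": "y = 2",
}
print(f"{savings(files, 'parse_config'):.1%} fewer tokens")
```

The saving grows with file size, since the whole-file cost scales with the file while the chunk cost stays roughly constant. The actual figure for any repository depends on its files, the queries, and the tokenizer.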

What Remains Unclear
It is not yet clear how Semble performs on extremely large or complex codebases beyond benchmark tests, or how it compares in real-world developer workflows over extended periods. Further user testing and adoption data are pending.
What’s Next
Next steps include wider adoption and testing of Semble across various codebases and agent integrations. Developers and organizations will likely evaluate its performance in real-world scenarios, potentially leading to updates or enhancements. Monitoring user feedback and benchmarking results will be key to assessing its long-term impact.

Key Questions
How does Semble achieve such a high reduction in token usage?
Semble returns only the code chunks relevant to a query instead of reading entire files, which cuts the token count by about 98% compared with grep+read.
Can Semble replace traditional code search tools?
Semble is optimized for agent use and token efficiency, so it can complement or replace grep in many scenarios, especially where speed and token economy are priorities.
Is Semble easy to integrate with existing AI agents?
Yes, Semble supports integration via MCP or CLI, and instructions are provided for popular agents like Claude Code, Codex, Cursor, and OpenCode.
Does Semble require external services or GPUs?
No, it runs entirely on CPU, with no API keys or external dependencies needed, making it easy to deploy locally.
What are the limitations or open questions about Semble?
Its performance on very large or complex codebases and long-term reliability in diverse workflows remain to be fully tested and validated.