Search as Code: Perplexity Is Right About the Future — Just Not First to It

TL;DR

Perplexity Research published a June 1, 2026 paper proposing Search as Code, a method in which AI agents write retrieval programs from search primitives instead of calling fixed endpoints. The company reports large accuracy gains and token savings, but the strongest figures are vendor-run and the approach follows earlier code-agent work.

Perplexity Research published a June 1, 2026 paper arguing that AI agents should generate search programs instead of repeatedly calling fixed search endpoints, a proposal it calls Search as Code. The proposal matters because agent systems increasingly depend on retrieval across many sources. The caveat is that the strongest performance figures cited so far come from Perplexity’s own tests.

Perplexity describes Search as Code, or SaC, as a system in which a model acts as the control plane, writes code, and runs that code in a sandbox using an Agentic Search SDK. The SDK exposes search operations such as retrieval, ranking, filtering, fan-out, rendering, deduplication, and extraction as composable primitives.

According to Perplexity, the point is to keep bulk search results out of the model’s context window. Generated code can query many sources, filter and deduplicate records, run schema-bound extraction, and return only selected evidence to the model.
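
The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Perplexity's SDK: `search_source`, `run_pipeline`, `Record`, and the in-memory `CORPUS` are hypothetical stand-ins for whatever retrieval calls a sandbox might expose, and the schema-bound extraction step is left out for brevity. What it shows is that fan-out, filtering, and deduplication all run inside the program, so only the short evidence list would be handed back to the model.

```python
from dataclasses import dataclass

# Hypothetical in-memory "sources" standing in for real search backends.
CORPUS = {
    "advisories": [
        {"url": "https://example.com/a1", "text": "Widget 2.1 fixes the parser overflow", "score": 0.92},
        {"url": "https://example.com/a2", "text": "Unrelated release notes", "score": 0.20},
    ],
    "news": [
        {"url": "https://example.com/a1", "text": "Widget 2.1 fixes the parser overflow", "score": 0.88},
        {"url": "https://example.com/n7", "text": "Widget maintainers confirm fix in 2.1", "score": 0.75},
    ],
}


@dataclass(frozen=True)
class Record:
    url: str
    text: str
    score: float


def search_source(source: str) -> list[Record]:
    """Hypothetical retrieval primitive: return every hit from one source."""
    return [Record(**hit) for hit in CORPUS[source]]


def run_pipeline(sources: list[str], min_score: float) -> list[Record]:
    # Fan-out: query every source inside the sandbox.
    hits = [rec for src in sources for rec in search_source(src)]
    # Filter: drop low-relevance hits before they can reach the model.
    hits = [rec for rec in hits if rec.score >= min_score]
    # Deduplicate by URL, keeping the highest-scoring copy.
    best: dict[str, Record] = {}
    for rec in hits:
        if rec.url not in best or rec.score > best[rec.url].score:
            best[rec.url] = rec
    # Return only the selected evidence, highest score first.
    return sorted(best.values(), key=lambda r: r.score, reverse=True)


evidence = run_pipeline(["advisories", "news"], min_score=0.5)
for rec in evidence:
    print(rec.url, rec.score)
```

In this toy run, four raw hits come back from two sources, but only two records survive to the end. The other two never enter the model's context.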

The company reported a CVE case study covering more than 200 high-severity vulnerabilities, each tied to vendor advisories and fix versions. Perplexity said SaC reached 100% accuracy and cut token use by 85%, from 288.7K tokens to 42.9K. It also said rival systems tested in that case scored below 25%. Those are company-reported results, not independent benchmark findings.
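
The 85% figure is consistent with the two token counts Perplexity reports, as a quick check shows. The variable names are illustrative; only the two token counts come from the source.

```python
# Token counts as reported by Perplexity for the CVE case study.
baseline_tokens = 288_700
sac_tokens = 42_900

# Fraction of baseline tokens saved, and how many times larger the baseline is.
reduction = 1 - sac_tokens / baseline_tokens
ratio = baseline_tokens / sac_tokens

print(f"reduction: {reduction:.1%}")
print(f"baseline uses {ratio:.1f}x the tokens")
```

The reduction works out to about 85.1%, which means the baseline run consumed roughly 6.7 times as many tokens as the SaC run.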

AI Dispatch · Infrastructure

Search as Code

Perplexity says agents shouldn’t call a search engine — they should program one, composing atomic primitives into a bespoke pipeline in a sandbox. The thesis is right. It’s also the search-shaped version of an idea the field has been converging on since 2024.

■ The old contract
One fixed pipeline. The model tweaks query params and consumes whatever comes back — through the context window, every time.
model → query(params)
engine → fixed pipeline
return → full result set
repeat ×N serial round-trips
⚠ every intermediate result routed through model context
▲ Search as Code
Programmable primitives

The model writes code that orchestrates atomic search ops — fan-out, dedupe, verify — keeping bulk data out of the token stream.
sdk.search.web_many(queries) → filter() → dedupe() → sdk.llm.extract_many(schema) → verified records
✓ only the useful tokens reach the model
100%: CVE case-study accuracy (SaC run)
−85%: token use vs. baseline (288.7K → 42.9K)
<25%: score for the rival systems tested
2.5×: SaC lead on Perplexity’s own WANDR bench
A convergent idea, not a cold start
“Let the model write code instead of emitting tool calls” has been building for two years. SaC is the search-specific instantiation.
2024: CodeAct (Wang et al., ICML)
2024–25: smolagents (Hugging Face)
2025: Code Mode (Cloudflare)
Nov 2025: Code exec + MCP (Anthropic)
Jun 2026: Search as Code (Perplexity)
The take

Directionally right, genuinely engineered — the rebuilt-from-atoms search stack is the part rivals can’t cheaply copy. But it’s a strong execution of an industry-wide idea, validated mostly on benchmarks Perplexity ran itself. The moat is the infrastructure and the tuning loops, not the architecture.

Sources: Perplexity Research, “Rethinking Search as Code Generation” (Jun 1 2026); CodeAct (Wang et al., ICML 2024); HF smolagents; Cloudflare Code Mode; Anthropic “Code execution with MCP” (Nov 2025). Figures as reported by Perplexity.
thorstenmeyerai.com

Programmable Search Enters Agent Workflows

If the pattern works outside Perplexity’s own evaluation, search for AI agents may become less about one answer endpoint and more about execution environments, low-level retrieval tools, verification loops, and cost control. That would move part of the competitive fight from interface quality to infrastructure depth.

For developers and enterprises, the appeal is direct: fewer wasted tokens, more traceable evidence, and retrieval workflows that can be adapted to specific tasks. The security example also shows why verification matters; agents handling vulnerability data need source-bound records, not broad summaries with weak provenance.
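
One way to picture a "source-bound record" is a record that is accepted only when its claim can be found verbatim in the page it cites. The sketch below is a hypothetical illustration, not a description of how Perplexity verifies anything: `FixRecord`, `is_source_bound`, the placeholder CVE identifiers, and the example URLs are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixRecord:
    cve_id: str        # placeholder identifier, not a real CVE
    fix_version: str
    source_url: str
    quote: str         # verbatim span copied from the source page


def is_source_bound(record: FixRecord, pages: dict[str, str]) -> bool:
    """Accept a record only if its quote appears verbatim in the cited page
    and the quote itself mentions the claimed fix version."""
    page_text = pages.get(record.source_url)
    if page_text is None:
        return False  # no retrievable source, so no provenance
    return record.quote in page_text and record.fix_version in record.quote


# Hypothetical fetched pages, keyed by URL.
pages = {
    "https://vendor.example/advisory-1": "Fixed in version 4.2.1. Upgrade is recommended.",
}
good = FixRecord("CVE-0000-0001", "4.2.1", "https://vendor.example/advisory-1", "Fixed in version 4.2.1")
bad = FixRecord("CVE-0000-0002", "9.9.9", "https://vendor.example/advisory-1", "Fixed in version 9.9.9")
unsourced = FixRecord("CVE-0000-0003", "1.0.0", "https://vendor.example/missing", "Fixed in version 1.0.0")

for rec in (good, bad, unsourced):
    print(rec.cve_id, is_source_bound(rec, pages))
```

Only the first record passes. The second quotes text that is not on the cited page, and the third cites a page that was never retrieved, which is the kind of weak provenance a broad summary can hide.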

Earlier Code Agents Set Precedent

Thorsten Meyer AI’s analysis accepts Perplexity’s core argument but challenges the novelty framing. The piece places SaC in a line of code-driven agent work that includes CodeAct, presented at ICML 2024; Hugging Face’s smolagents in 2024 and 2025; Cloudflare’s Code Mode in 2025; and Anthropic’s “Code execution with MCP” from November 2025.

The distinction is scope. Earlier systems focused on letting models write and run code as a general agent mechanism. Perplexity’s claim is narrower and more search-specific: it says the search stack itself has been rebuilt into atomic components that an agent can compose at runtime.

“Rethinking Search as Code Generation”

— Perplexity Research

Vendor Benchmarks Need Replication

It is not yet clear whether SaC’s reported gains will hold under independent testing, across messy real-world tasks, or against systems tuned specifically for the same benchmarks. Perplexity’s WANDR result also needs outside review because WANDR is described as Perplexity’s own benchmark.

The supplied material does not establish whether SaC is broadly available, how much of the SDK external developers can inspect, or what sandbox controls govern generated code. Those details will shape how much weight buyers and developers place on the architecture.

Independent Tests Will Define Adoption

The next milestone is external validation: public SDK access, reproducible benchmark runs, and comparisons against other code-agent retrieval systems. If outside researchers can reproduce the token savings and accuracy gains, SaC could become an influential pattern for agent search infrastructure. If not, it may be remembered as a strong Perplexity implementation of an idea already moving through the field.

Key Questions

What is Search as Code?

Search as Code is Perplexity’s proposal for letting AI agents write code that assembles search primitives into task-specific retrieval pipelines, rather than calling a fixed search endpoint repeatedly.

Is Perplexity the first company to use this idea?

The search-specific implementation appears to be Perplexity’s, based on the supplied material. The wider idea of models writing code to control tools has earlier examples, including CodeAct, smolagents, Cloudflare Code Mode, and Anthropic’s code-execution work.

How strong are Perplexity’s benchmark claims?

Perplexity reports strong results, including 100% accuracy in a CVE case study and an 85% token-use reduction. The caveat is that these results are vendor-reported and need independent replication.

Who would be affected if SaC works?

Developers building AI agents, search infrastructure providers, security teams, and enterprise buyers could all be affected. The main stakes are retrieval cost, evidence quality, and whether agent search can be controlled more precisely.

What should readers watch next?

Watch for public SDK details, third-party benchmark runs, security reviews of the sandbox model, and whether other search or AI-agent companies adopt similar programmable retrieval layers.

Source: Thorsten Meyer AI
