TL;DR

A recent Google whitepaper emphasizes that in AI-assisted coding, the model itself is only 10% of the system. The focus should be on harness design and context engineering, which have a much larger impact on performance and costs.

A new Google whitepaper published in May 2026 states that the model used in AI coding agents accounts for only about 10% of overall system behavior. The primary lesson is that harness design and context engineering, not the size or sophistication of the model, are the real determinants of performance, cost, and reliability in AI-assisted development.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, argues that the prevailing focus on acquiring the latest AI models is misplaced. Roughly 90% of an AI agent’s behavior, it says, depends on the harness: the prompts, rules, tools, and observability layers surrounding the model. As evidence, the paper cites an agent that climbed from outside the Top 30 to the Top 5 on Terminal Bench 2.0 after only its harness was changed, with the model held constant.

The authors introduce the concept of agentic engineering, where AI is embedded within a structured framework of verification, testing, and guardrails, contrasting with vibe coding, which relies on minimal prompts and quick fixes. They also highlight that costs associated with AI development are driven more by how the harness is built and maintained than by the model’s complexity, with ad-hoc prompting becoming more expensive over time.

At a glance

Report · When: published May 2026

The development: The new Google whitepaper highlights that in AI-driven software development, the model accounts for just 10% of system behavior, shifting focus to harness and context engineering.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified

Vibe Coding

Casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted

Detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering

Formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
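
The split between tests and evals can be sketched as a single release gate. This is a minimal illustration, not the paper’s implementation: `run_gate`, `GateResult`, the `slugify` candidate, and the 0.8 threshold are all hypothetical names and values chosen for the example. Tests here are pass/fail checks on exact behavior, while evals return a graded score between 0 and 1 that is averaged and compared against a threshold.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Candidate = Callable[[str], str]


@dataclass
class GateResult:
    tests_passed: bool
    eval_score: float
    approved: bool


def run_gate(
    candidate: Candidate,
    tests: Sequence[Callable[[Candidate], bool]],
    eval_cases: Sequence[tuple[str, Callable[[str], float]]],
    min_eval_score: float = 0.8,  # hypothetical threshold, tune per project
) -> GateResult:
    # Deterministic checks: every test must pass exactly.
    tests_passed = all(test(candidate) for test in tests)
    # Evals: grade each output in [0, 1] and average the scores.
    scores = [grade(candidate(prompt)) for prompt, grade in eval_cases]
    eval_score = sum(scores) / len(scores) if scores else 0.0
    approved = tests_passed and eval_score >= min_eval_score
    return GateResult(tests_passed, eval_score, approved)


# Hypothetical candidate: imagine this function was written by an agent.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


tests = [lambda f: f("Hello World") == "hello-world"]
eval_cases = [
    ("The New SDLC", lambda out: 1.0 if " " not in out else 0.0),
    ("Vibe Coding", lambda out: 1.0 if out == out.lower() else 0.0),
]

result = run_gate(slugify, tests, eval_cases)
print(result.approved, result.eval_score)  # True 1.0
```

Requiring both conditions mirrors the claim above: a change that passes its tests but scores poorly on evals is still rejected, and so is the reverse.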

The idea worth building your strategy around

Agent = Model + Harness

MODEL: ~10%

HARNESS: ~90% (prompts · tools · context · hooks · sandboxes · observability)

~90% is your surface area, not the provider’s

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
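
One way to picture “Agent = Model + Harness” is as a thin wrapper in which the model is a single swappable function and everything else is configuration. The sketch below is an assumption-laden illustration: `Harness`, `Agent`, and `stub_model` are hypothetical names, and the stub stands in for a real model call so the example runs without any provider.

```python
from dataclasses import dataclass, field
from typing import Callable

Model = Callable[[str], str]  # any function from context text to output text


@dataclass
class Harness:
    system_prompt: str
    rules: list[str] = field(default_factory=list)
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)  # minimal observability

    def build_context(self, task: str) -> str:
        # Assemble everything the model will see for this task.
        parts = [
            self.system_prompt,
            *[f"RULE: {rule}" for rule in self.rules],
            f"TOOLS: {', '.join(sorted(self.tools)) or 'none'}",
            f"TASK: {task}",
        ]
        return "\n".join(parts)


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        context = self.harness.build_context(task)
        self.harness.log.append(context)  # record what was sent
        output = self.model(context)
        self.harness.log.append(output)   # record what came back
        return output


def stub_model(context: str) -> str:
    # Stand-in for a real model: reports how many rules it was given.
    return f"rules seen: {context.count('RULE:')}"


bare = Agent(stub_model, Harness("You are a coding agent."))
strict = Agent(
    stub_model,
    Harness(
        "You are a coding agent.",
        rules=["Run tests before committing", "Never edit generated files"],
    ),
)

print(bare.run("fix bug"))    # rules seen: 0
print(strict.run("fix bug"))  # rules seen: 2
```

Both agents share the same model function; only the harness differs, which is the kind of change the Terminal Bench result above describes.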

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding

Low CapEx · High OpEx

Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering

High CapEx · Low OpEx

Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
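
The routing lever can be sketched as a small function that sends easy tasks to a cheaper model. Everything specific here is an assumption made for illustration: the tier names, the per-token prices, the keyword list, and the 4,000-token cutoff are hypothetical, and a real router would likely use richer signals than keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, not a real quote


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 1.00)

# Hypothetical markers of work that tends to need the stronger model.
HARD_KEYWORDS = ("refactor", "migrate", "concurrency", "security")


def route(task: str, context_tokens: int, max_cheap_tokens: int = 4000) -> ModelTier:
    """Pick the cheap tier unless the task looks hard or the context is large."""
    text = task.lower()
    if context_tokens > max_cheap_tokens or any(k in text for k in HARD_KEYWORDS):
        return STRONG
    return CHEAP


def estimated_cost(task: str, context_tokens: int) -> float:
    tier = route(task, context_tokens)
    return tier.cost_per_1k_tokens * context_tokens / 1000


print(route("rename a variable", 800).name)         # small-model
print(route("refactor the auth module", 800).name)  # large-model
```

Because the context size feeds into both the routing decision and the cost, trimming noisy context lowers spend twice: fewer tokens are billed, and more tasks qualify for the cheap tier.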

85% of devs use AI coding agents (51% daily)

41% of all new code is AI-generated

~90% of agent behavior is the harness, not the model

+19% longer on some tasks (METR); verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

Why Harness and Context Engineering Are Game Changers

This shift in focus matters because it redefines where organizations should invest resources for AI development. Instead of chasing newer, larger models, companies should prioritize building robust harnesses and context management systems. This approach can reduce costs, improve reliability, and give organizations a durable competitive advantage, as their custom configurations and frameworks are less likely to be overtaken by model upgrades.

Treating verification and judgment as the new craft of AI development also puts human oversight and structured testing at the center of deploying AI safely and effectively at scale.

The Evolution of AI Development Practices

Since early 2026, the AI development landscape has been shifting from a focus on model size and raw performance towards system design and configuration. The whitepaper builds on figures showing that AI adoption is already widespread: 85% of developers use AI coding agents (51% daily), and 41% of all new code is AI-generated. Previously, the emphasis was on acquiring the latest models; now it is on how those models are integrated and controlled.

This development aligns with broader trends in software engineering, where the emphasis on verification, testing, and structured workflows has increased, especially in safety-critical applications.

“The model is only 10% of what determines behavior; the harness is 90%. Focus on configuration, tools, and context.”
— Addy Osmani

What Aspects of Harness Design Remain Unclear

It is not yet clear how different industries will adopt these principles at scale or how quickly organizations will shift their investment from models to harnesses. The long-term impact on AI development costs and safety protocols is still being evaluated, and practical implementation guidance is evolving.

Next Steps for AI Development and Adoption

Organizations are expected to begin prioritizing harness development and context engineering in their AI workflows. Future research and industry practices will likely focus on creating standardized frameworks, tools, and best practices for building durable, cost-effective AI systems. Additionally, further empirical studies are anticipated to quantify the benefits of this approach across different sectors.

Monitoring how AI vendors and enterprise teams adapt to this paradigm shift will be crucial in understanding the full impact of the new SDLC framework.

Key Questions

Why is the model only 10% of the system behavior?

The whitepaper argues that the model itself provides only the core generation capability, while the surrounding harness—including prompts, rules, tools, and oversight—controls the actual behavior and performance.

How does focusing on harness design reduce costs?

Building a robust harness minimizes unnecessary token usage, reduces maintenance costs, and improves reliability, making AI deployment more cost-effective over time.
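
The cost argument can be made concrete with a simple break-even calculation. The model below is an illustrative simplification rather than the paper’s own: it assumes a one-time upfront cost for the harness and a flat cost per feature, and all numbers are hypothetical units chosen only to show the arithmetic.

```python
import math
from typing import Optional


def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """One-time setup cost plus a flat cost for each shipped feature."""
    return upfront + per_feature * features


def break_even_features(
    upfront_agentic: float,
    per_feature_vibe: float,
    per_feature_agentic: float,
) -> Optional[int]:
    """First feature count at which the harness-heavy approach is strictly cheaper.

    Assumes the ad-hoc approach has no upfront cost. Returns None when the
    structured approach never saves money per feature.
    """
    saving = per_feature_vibe - per_feature_agentic
    if saving <= 0:
        return None
    return math.floor(upfront_agentic / saving) + 1


# Hypothetical units: 50 upfront for specs, evals, and context;
# 10 per feature with ad-hoc prompting versus 2 per feature with a harness.
n = break_even_features(upfront_agentic=50, per_feature_vibe=10, per_feature_agentic=2)
print(n)                      # 7
print(total_cost(0, 10, n))   # 70
print(total_cost(50, 2, n))   # 64
```

Under these assumed numbers the upfront investment pays for itself by the seventh feature; a larger per-feature gap moves that point earlier.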

What is agentic engineering?

It is an approach that embeds AI within structured workflows, verification, and guardrails, emphasizing systematic configuration over minimal prompting.

Does this mean larger models are obsolete?

Not necessarily. The whitepaper suggests that while larger models offer capabilities, their advantage diminishes unless paired with well-designed harnesses and context management.

What should organizations do now?

They should evaluate and improve their harnesses—including prompts, tools, and verification processes—and shift focus from model size to system configuration and oversight.

Source: ThorstenMeyerAI.com

The Model Is Only 10%: The Real Lesson of the New SDLC

Author

TechieUS Team
