The last six months in LLMs in five minutes

TL;DR

Over the past six months, the landscape of large language models has seen rapid shifts in top-performing models, significant advances in coding AI agents, and the rise of new projects like OpenClaw. These developments are reshaping AI capabilities and adoption in both industry and hobbyist circles.

Starting in November 2025, the title of ‘best’ LLM changed hands five times among the major providers, with Claude Sonnet 4.5, GPT-5.1, Gemini 3, and Claude Opus 4.5 all vying for the top spot. Notably, the coding agents from OpenAI and Anthropic became reliable enough to handle day-to-day work tasks, a significant step forward for AI-assisted coding.

During the holiday season, enthusiasts experimented with these models, leading to ambitious projects such as running JavaScript in browsers via complex WebAssembly setups. In February 2026, the project OpenClaw emerged as a leading ‘personal AI assistant,’ quickly gaining popularity and even driving sales of hardware for running these models locally.

Around the same time, models like Gemini 3.1 Pro and Google’s Gemma 4 series demonstrated impressive capabilities, including generating detailed and animated images, such as pelicans riding bicycles and other whimsical scenes. Chinese AI lab Z.ai (formerly Zhipu AI), maker of the GLM family, released the massive GLM-5.1, a 1.5TB open-weight model that produced highly competent outputs but required substantial hardware to run.

Why It Matters

These developments signal a rapid acceleration in AI capabilities, especially in coding and creative tasks, and make AI tools more practical for daily use. The shifting performance lead and the rise of customizable, locally run AI assistants like OpenClaw suggest that advanced AI is becoming more widely accessible, with potential effects on industries, hobbyist projects, and personal productivity. The ongoing competition among providers also points to a fast-moving market that could reshape how AI is deployed and how its ethical implications are weighed.

Background

Since late 2024, the AI community has seen a series of milestones, including the use of reinforcement learning to improve code quality and the arrival of more capable open-weight models. November 2025 marked an inflection point, after which new models began to consistently beat earlier benchmark results. The rise of projects like Warelay and later OpenClaw reflected a broader shift towards user-friendly, customizable AI assistants. Hardware sales also rose during this period as users sought to run models locally, drawn by the models’ growing capabilities and the desire for privacy and control.

“The last six months have seen AI models change hands five times in terms of performance, with coding agents reaching a new level of reliability and usability.”

— Simon Willison

“Mac Minis are selling out because people are using them to run their Claws—these AI assistants are becoming the new digital pets.”

— Drew Breunig

“We’ve seen models generate highly detailed scenes, from pelicans on bicycles to animated scenes, showing the expanding creative capabilities of LLMs.”

— Jeff Dean (Google)

What Remains Unclear

It remains unclear how long these performance gains will last, whether new models will keep outperforming existing ones, and how the broader AI ecosystem will address the ethical and safety concerns raised by increasingly powerful models. The long-term balance between local AI deployment and cloud-based models is also still unsettled.

What’s Next

Expect further model releases, more sophisticated coding agents, wider adoption of local AI assistants, and possibly new breakthroughs in multimodal AI. Regulatory discussions are also likely as capabilities expand.

Key Questions

What caused the rapid shifts in the top-performing models over the past six months?

The competitive landscape, breakthroughs in training techniques, and the release of new models from major labs contributed to frequent performance changes.

How accessible are these new models for everyday users?

Many models are now available as open weights or through APIs, with projects like OpenClaw enabling local deployment on consumer hardware.

What are the implications for AI safety and ethics?

The rapid advancement raises concerns about misuse, biases, and safety, prompting ongoing discussions among researchers and policymakers.

Will these trends continue in the coming months?

While current momentum suggests continued progress, uncertainties remain about hardware limitations, regulatory impacts, and the long-term sustainability of rapid innovation.

Author

Thorsten Meyer