TL;DR

Anthropic’s Frontier Red Team mapped 832 accounts banned for malicious cyber activity from March 2025 to March 2026 onto MITRE ATT&CK and found that technique count did not consistently correlate with attacker risk in the sample. The analysis says agentic AI systems that can chain attack steps with limited human input may be a more relevant risk indicator, but that behavior is not captured directly in the current taxonomy.

Anthropic’s Frontier Red Team has reported that an analysis of 832 accounts banned for malicious cyber activity over a 12-month period exposes limits in how security teams measure AI-enabled cyber threats: traditional counts of attacker techniques did not reliably separate lower-risk actors from higher-risk actors in the sample.

The analysis, summarized by Thorsten Meyer AI and attributed to Anthropic, mapped malicious activity from March 2025 through March 2026 onto MITRE ATT&CK, a widely used taxonomy for describing adversary behavior. Anthropic said the accounts examined were cases with enough detail to evaluate techniques thoroughly, making the dataset a window into observed misuse rather than a full census of all AI-enabled cyber activity.

According to the report summary, 67.3% of the banned accounts, or 560 accounts, used AI to help write malware. A smaller share, 6.5%, or 54 accounts, used AI in lateral movement, a post-compromise stage in which an attacker moves through a network after gaining access. Anthropic also found that the share of medium-or-higher-risk actors rose from 33% in the first six months of the period to 56% in the second half.
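The percentages quoted above can be checked against the raw counts with a few lines of arithmetic. The sketch below uses only the figures given in the report summary; the variable and function names are illustrative and do not come from the report.

```python
# Figures as quoted in the report summary.
TOTAL_ACCOUNTS = 832
MALWARE_ACCOUNTS = 560
LATERAL_MOVEMENT_ACCOUNTS = 54


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


malware_share = share(MALWARE_ACCOUNTS, TOTAL_ACCOUNTS)
lateral_share = share(LATERAL_MOVEMENT_ACCOUNTS, TOTAL_ACCOUNTS)

# Share of medium-or-higher-risk actors in each half of the period (percent).
FIRST_HALF = 33
SECOND_HALF = 56
point_change = SECOND_HALF - FIRST_HALF          # change in percentage points
ratio = round(SECOND_HALF / FIRST_HALF, 1)       # relative increase

print(malware_share, lateral_share, point_change, ratio)
```

The counts reproduce the published shares, and the move from 33% to 56% works out to 23 percentage points, or roughly a 1.7-fold increase.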

The analysis found that technique count, long used as a shorthand for attacker capability, was less predictive in this sample. The report said the least-skilled actors used 16 techniques while the most-skilled used 20, a narrow gap that complicates the use of technique counts as a measure of risk. The analysis also said the platform used, including Claude Code, API access, or chat, did not correlate with risk.

ThorstenMeyerAI.com

AI & Security · Field Note

AI-enabled cyber threats · a year mapped

The frameworks can’t see the thing that matters

For decades, danger meant which techniques an attacker commands. A year of real AI-enabled attacks — 832 banned accounts mapped onto MITRE ATT&CK — shows that signal breaking, just as a new, harder-to-see one takes over.

Anthropic Frontier Red Team · Mar 2025–Mar 2026 · 832 accounts · via Verizon DBIR

01 The dataset

A year of real misuse, mapped to the standard taxonomy

A window, not a census — these are the cases with enough detail to assess techniques thoroughly. Inside it, the risk level climbed fast.

WHAT WAS STUDIED

832 accounts

Banned for malicious cyber activity, Mar 2025–Mar 2026, mapped onto MITRE ATT&CK. The most common AI use was prep — 67.3% (560) used AI to help write malware; 6.5% (54) for lateral movement deep inside networks.

THE RISK CLIMB · MEDIUM-OR-HIGHER ACTORS

First 6 months: 33%

Second 6 months: 56%

≈ 1.7× increase in a single year

02 The measurement breaks

“More techniques” stopped meaning “more dangerous”

The old heuristic: count the techniques, judge the tooling. AI dissolved it — because the model supplies the techniques either way. Watch the old signal fail, then watch what it misses.

Risk score vs. technique count

Two ways to read the same attacker. One is going blind.

The old signal: skill ≈ number of techniques?

Least-skilled: 16 techniques

Most-skilled: 20 techniques

16 vs. 20. A novice and an expert now look almost alike by technique count, and the platform (Claude Code / API / chat) didn’t correlate with risk either.

What it misses: the Nov 2025 espionage operation

By technique count: 30 techniques · 13 tactics. Looks like many medium-risk actors. Unremarkable.

By risk-scoring methodology: 100, the maximum risk score. Same case, but the model ran as an autonomous agent.

The most dangerous attribute of the year’s most dangerous attack is taxonomically invisible: there is no MITRE ATT&CK ID for agentic orchestration.

03 Where the AI moved

Deeper into the attack — and into less-skilled hands

Across the year, AI use drifted from getting in toward acting once already inside — the operationally demanding stages that used to require an expert.

The attack lifecycle · where AI is now applied

The center of gravity moved right — toward post-compromise work.

Initial access: phishing, getting in
Account discovery: finding valid accounts
Lateral movement: navigating the network
Privilege escalation: deeper control

↓ 8.6%

AI-assisted phishing

A classic way to gain access — falling.

↑ 8.9%

AI for account discovery

Post-compromise work — rising.

The crack in the old model: post-compromise techniques used to be restricted to actors skilled enough to perform them. AI can now perform them on behalf of less sophisticated actors — the dangerous deep stages are no longer self-limiting.

04 What actually predicts danger now

From “what they know” to “what they’ve built”

The report sorts the signals into three tiers — one dead, one fading, one durable.

🔢

Technique count & tooling

16 vs. 20 between novice and expert; platform doesn’t correlate. The model supplies the techniques either way.

dead signal

📍

Where in the lifecycle AI is applied

Concentrating on operationally demanding, post-compromise stages is a better signal — but it’s eroding as the whole population heads there.

fading signal

🏗️

The scaffolding around the model

Architectures that let the model chain stages and run with minimal human input. Not what they know — whether they’ve built a system that lets AI run the attack.

durable signal

05 What follows

Fixing the map before the territory moves again

A taxonomy that can’t name the most dangerous behavior on the field will quietly mislead the people relying on it. The response runs in two directions.

🛡️ defensively

Fed back into the models

The findings informed safeguards on the most capable models, built to detect & block some of what was observed:

Blocking malware development
Blocking mass data exfiltration
Putting tools in defenders’ hands first (Project Glasswing)

🧭 institutionally

Taking it to the source

Following the Verizon work, Anthropic says it’s in discussions with MITRE about how ATT&CK might evolve:

A vocabulary for agentic orchestration
Naming the scaffolding that makes a model an operator
An interactive technique visualization on the Red blog

Reading it in proportion

The 832 cases are a detailed subset, not the full population — the precise percentages are directional, not definitive.
“More autonomous” is not “fully autonomous” — even the standout case needed human input at key moments, which is itself a place for defenders to intervene.
This is one vendor’s window: the company with visibility into misuse of its own model, publishing what it found. Publishing is the right thing to do with the data, and the single-vendor vantage point is worth remembering as you read it.

Source: Anthropic, “What we learned mapping a year’s worth of AI-enabled cyber threats” (Jun 3, 2026) · Frontier Red Team · Verizon 2026 DBIR · figures per the report · independent commentary · findings only, no operational detail.

Why It Matters

Security teams, vendors, and incident responders use taxonomies such as MITRE ATT&CK to classify attacker behavior, compare campaigns, and prioritize defenses. Anthropic’s analysis says a framework focused on visible attack steps may not fully account for how AI systems coordinate those steps.

The report highlighted a November 2025 espionage operation that involved 30 techniques across 13 tactics. The summary says that, under a technique-based view, the operation resembled many medium-risk actors, while Anthropic’s risk-scoring methodology rated it at the maximum score because the model operated as an autonomous agent.

That distinction places more emphasis on the systems built around the model. The report identifies surrounding architecture, or scaffolding, that lets a model chain stages and run with limited human input as an important indicator of risk.
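The gap between the two readings of the November 2025 operation can be illustrated with a toy comparison. This is a minimal sketch under stated assumptions, not Anthropic’s risk-scoring methodology, which the summary does not describe: the `Operation` record, both scoring functions, and the `cap` of 60 techniques are hypothetical. Only the 30 techniques, 13 tactics, and the maximum score of 100 come from the report summary.

```python
from dataclasses import dataclass

MAX_SCORE = 100  # the summary reports a maximum risk score of 100


@dataclass(frozen=True)
class Operation:
    name: str
    techniques: int
    tactics: int
    # Hypothetical field: ATT&CK has no technique ID for agentic orchestration.
    autonomous_agent: bool


def count_based_score(op: Operation, cap: int = 60) -> int:
    """Hypothetical technique-count view: scale the count onto 0-100.

    The cap is an arbitrary assumption chosen for illustration only.
    """
    return min(MAX_SCORE, round(100 * op.techniques / cap))


def orchestration_aware_score(op: Operation) -> int:
    """Hypothetical view that treats autonomous operation as decisive."""
    if op.autonomous_agent:
        return MAX_SCORE
    return count_based_score(op)


# Same technique profile, with and without an autonomous agent in the loop.
espionage_op = Operation("Nov 2025 espionage operation", 30, 13, True)
manual_op = Operation("hypothetical manual operation", 30, 13, False)

for op in (espionage_op, manual_op):
    print(op.name, count_based_score(op), orchestration_aware_score(op))
```

Under these assumptions the two operations are indistinguishable by technique count, and only the view that records how the model was run separates them.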

Background

For years, many threat evaluations have treated breadth of technique, tooling sophistication, and attacker fluency as useful proxies for capability. That approach was developed in an environment where operationally demanding phases of an intrusion, such as account discovery, lateral movement, and privilege escalation, generally required higher human skill.

Anthropic’s analysis says that pattern is changing as AI is used deeper in the attack lifecycle. The summary says AI-assisted phishing fell by 8.6%, while AI use for account discovery rose by 8.9%, pointing toward more post-compromise use. The report frames that shift as a sign that less-skilled actors may now be able to attempt stages that once required more experience.

The report also says Anthropic fed the findings into safeguards for more capable models, including efforts to block malware development and mass data exfiltration. Anthropic is also described as being in discussions with MITRE about how ATT&CK might evolve to account for agentic orchestration and model scaffolding.

“More techniques stopped meaning more dangerous.”
— Thorsten Meyer AI summary of the Anthropic report

“There is no MITRE ATT&CK ID for agentic orchestration.”
— Thorsten Meyer AI summary

What Remains Unclear

The dataset does not represent all malicious AI-enabled cyber activity. The source material says the 832 accounts were cases with enough detail to map techniques thoroughly, so the findings should be read as evidence from a detailed subset rather than a full measure of the threat landscape.

It is also not yet clear how MITRE ATT&CK or related taxonomies may change, whether agentic orchestration will receive a formal vocabulary, or how quickly defenders can integrate that signal into daily threat scoring.

What’s Next

Anthropic says the findings are being used in model safeguards and defender-focused tooling, while discussions with MITRE may shape how future frameworks describe AI systems that coordinate attack steps. Security teams are likely to watch whether future reports confirm the same pattern across larger datasets and other model providers.

Key Questions

What did Anthropic study?

Anthropic studied 832 accounts banned for malicious cyber activity from March 2025 to March 2026 and mapped the observed behavior onto MITRE ATT&CK.

What is the main finding?

The report says counting attacker techniques no longer tracks danger well in AI-enabled activity because models can supply techniques to less-skilled users. The stronger signal is whether the attacker has built systems that let AI coordinate and chain attack stages.

Why does agentic orchestration matter?

Agentic orchestration refers to AI systems acting across multiple steps with limited human input. In the report’s highest-risk case, that autonomy drove the risk score even though technique count alone made the operation look less exceptional.

Does this mean MITRE ATT&CK is obsolete?

No. The report does not say the framework has no value. It says the framework may need new vocabulary for AI-specific behavior, especially model scaffolding and autonomous coordination.

What remains unknown?

It remains unclear how common these patterns are outside the examined accounts, how threat taxonomies will adapt, and how defenders will convert agentic behavior into repeatable detection and scoring practices.

Source: Thorsten Meyer AI

The Frameworks Can’t See the Thing That Matters: A Year of AI-Enabled Cyber Threats

Author

TechieUS Team
