AI just got 6x better at hacking — in only 18 months
The UK's AI Security Institute tested frontier models on a 32-step corporate network attack. Opus 4.6 averaged 9.8 steps vs GPT-4o's 1.7, a nearly sixfold jump in 18 months.
The UK government just tested whether today's smartest AI can hack a corporate network, and the results are alarming. On a simulated 32-step attack chain, the best model in August 2024 completed an average of just 1.7 steps. By February 2026, the best model averaged 9.8. That's a nearly sixfold improvement in 18 months.
The research comes from the UK AI Security Institute (AISI), the government body responsible for evaluating the safety of frontier AI. Their new paper, published on arXiv, is the most detailed public assessment of AI cyberattack capabilities to date.
The attack scenario: 32 steps to full compromise
AISI built a custom test called "The Last Ones" — a simulated corporate network with real attack chains designed by cybersecurity experts. To fully compromise it, an attacker needs to complete 32 sequential steps: stealing credentials, exploiting web services, reverse-engineering binaries (taking apart software to find hidden access), escalating privileges (gaining administrator-level access), and more.
No active defenders were present — the test purely measures how far AI can get on its own, given enough time and computing power.
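For the curious, here is a minimal sketch of how a sequential benchmark like this can be scored: the model only gets credit for a step once everything before it in the chain is done. The step names and the scoring rule below are our illustrative assumptions, not AISI's actual harness, which is described in the paper.

```python
# Illustrative only: a toy scorer for a sequential attack-chain benchmark.
# Step names and the scoring rule are assumptions, not AISI's harness.

ATTACK_CHAIN = [
    "steal_credentials",        # hypothetical step names
    "exploit_web_service",
    "reverse_engineer_binary",
    "escalate_privileges",
    # ... a real scenario like "The Last Ones" has 32 such steps
]

def steps_completed(achieved: set[str]) -> int:
    """Count consecutive steps from the start of the chain.

    Progress stops at the first step the agent failed to complete,
    mirroring the "sequential steps" framing in the article.
    """
    count = 0
    for step in ATTACK_CHAIN:
        if step not in achieved:
            break
        count += 1
    return count

# Example: the agent stole credentials and exploited the web service, but
# never reverse-engineered the binary, so its later privilege escalation
# doesn't count toward the chain.
print(steps_completed({"steal_credentials", "exploit_web_service",
                       "escalate_privileges"}))  # -> 2
```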
How each AI model performed
GPT-4o (Aug 2024) at 10M tokens: 1.7 steps on average
Claude Opus 4.6 (Feb 2026) at 10M tokens: 9.8 steps on average
Claude Opus 4.5 at 100M tokens: 11.0 steps on average
Claude Opus 4.6 at 100M tokens: 15.6 steps on average, a 42% jump over 4.5
Best single run: 22 of 32 steps completed
That best run reached milestone 6, which requires reverse engineering a Windows service binary, escalating privileges via token impersonation (pretending to be an authorized user), and recovering encryption keys to access a command-and-control server. That's the kind of work that takes a skilled human security expert hours.
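The averages above are also where the headline ratios come from. Here is a quick sanity check of the arithmetic, using only the figures reported in this article:

```python
# Recomputing the article's headline ratios from the reported averages.
gpt4o_10m = 1.7      # GPT-4o, 10M-token budget (Aug 2024)
opus46_10m = 9.8     # Claude Opus 4.6, 10M tokens (Feb 2026)
opus45_100m = 11.0   # Claude Opus 4.5, 100M tokens
opus46_100m = 15.6   # Claude Opus 4.6, 100M tokens

print(f"18-month jump: {opus46_10m / gpt4o_10m:.1f}x")           # ~5.8x, roughly sixfold
print(f"4.5 -> 4.6:    +{opus46_100m / opus45_100m - 1:.0%}")    # +42%
print(f"10M -> 100M:   +{opus46_100m / opus46_10m - 1:.0%}")     # +59%
```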
More computing power = more hacking ability
One of the most concerning findings: giving AI more computing power directly translates to more hacking ability. Increasing the token budget (how much the AI can "think") from 10 million to 100 million tokens improved performance by up to 59%. This creates what researchers call a "scaling law for cyberattacks" — as AI gets cheaper to run, attacks get easier to execute.
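With the two published budget points for Opus 4.6 (9.8 steps at 10M tokens, 15.6 at 100M), you can sketch what such a scaling law looks like under a simple assumption: progress grows linearly with the logarithm of compute. This is a toy fit through two data points, not the analysis in AISI's paper, and the extrapolated row is purely illustrative.

```python
import math

# Two reported points for Claude Opus 4.6: (token budget, average steps).
points = [(10_000_000, 9.8), (100_000_000, 15.6)]

# Toy assumption: steps = a + b * log10(tokens). Solve a, b from the two points.
(x1, y1), (x2, y2) = points
b = (y2 - y1) / (math.log10(x2) - math.log10(x1))  # ~5.8 extra steps per 10x compute
a = y1 - b * math.log10(x1)

for budget in (10_000_000, 100_000_000, 1_000_000_000):  # last row is extrapolation
    print(f"{budget:>13,} tokens -> ~{a + b * math.log10(budget):.1f} steps")
```

On this naive fit, a 1-billion-token budget would land around 21 steps, close to the best single run already observed, though nothing guarantees a straight-line trend holds.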
AISI also tested a second scenario called "Cooling Tower" — a 7-step attack on an industrial control system (the kind that manages power plants and water treatment facilities). Here, AI made less progress: Opus 4.6 averaged only 1.4 steps, with a maximum of 3. Critical infrastructure attacks remain harder, but the trajectory is clear.
What this actually means for you
The good news: No AI can fully hack a corporate network autonomously — yet. The best run completed 22 of 32 steps, and the average is far lower.
The concerning part: the rate of improvement. If this trajectory continues, AI could match a mid-level human penetration tester within 1-2 years, at a fraction of the cost.
The practical takeaway: AI-assisted cyberattacks are becoming cheaper and more accessible. Basic security hygiene — strong passwords, two-factor authentication, regular updates — matters more than ever.
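As a concrete example of that hygiene: the six-digit codes a two-factor authenticator app shows are derived from a shared secret and the current time using the standard TOTP algorithm (RFC 6238). Here is a minimal sketch in pure standard-library Python; the secret below is a made-up example, not one to reuse:

```python
import base64, hashlib, hmac, struct, time

def totp(secret_b32: str, digits: int = 6, period: int = 30) -> str:
    """Derive the current time-based one-time password (RFC 6238)."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // period            # 30-second time window
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))  # example secret; a real one comes from your provider
```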
The numbers at a glance
6x — improvement in AI hacking capability over 18 months
22/32 — steps completed in the best single AI attack run
59% — performance gain from 10x more computing power
42% — jump from Opus 4.5 to 4.6 (released ~2 months apart)
7 — frontier AI models tested over 18 months
The full AISI blog post and research paper are available at aisi.gov.uk.