2026-03-21 | Internet Archive, AI copyright, web preservation, New York Times, EFF, fair use

The New York Times just blocked the web's biggest library

The NYT and The Guardian are blocking the Internet Archive over AI fears. The EFF warns this won't stop AI scraping — but it will erase 1 trillion web pages of history.

The New York Times has started blocking the Internet Archive — the nonprofit that has preserved over 1 trillion web pages for nearly 30 years. The Guardian is following suit. Their reason? Fear of AI.

The Electronic Frontier Foundation (EFF) just published a pointed response: blocking the Archive won't stop AI companies from scraping content — but it will erase the web's historical record.

[Image: Internet Archive server racks storing over 1 trillion web pages]

What the Internet Archive actually does

If you've ever clicked a "Wayback Machine" link to see what a website looked like years ago, you've used the Internet Archive. It's the web's memory — a nonprofit that has been crawling and saving web pages since the mid-1990s.

The numbers show how essential it has become:

• 1 trillion+ archived web pages

• 2.6 million archived news articles linked from Wikipedia editions in 249 languages

• Nearly 30 years of continuous preservation

• Used daily by journalists, researchers, and courts to verify facts and find deleted content

When a news article gets deleted, corrected, or a website shuts down, the Wayback Machine is often the only place that page still exists. Without it, links across Wikipedia, academic papers, and legal filings would go permanently dead.

Why publishers are blocking it

Major news organizations are worried that AI companies — like OpenAI, Google, and Anthropic — are using their content to train AI models without paying for it. That's a real and valid concern.

But instead of targeting AI companies directly, some publishers have started using technical measures that go beyond standard robots.txt rules (the traditional "do not crawl" signal websites can set) to block all automated crawling — including the Internet Archive's preservation work.

The problem, as the EFF argues: the Internet Archive isn't building AI. It's a nonprofit that preserves history. Blocking it doesn't slow down OpenAI or Google — they have their own crawlers and data pipelines. It only erases the historical record.
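For readers who haven't seen robots.txt up close, here is a minimal sketch of how that signal can single out one crawler while leaving others alone. It uses Python's standard urllib.robotparser. The robots.txt content, the example.com URL, and the choice of user-agent tokens (GPTBot, archive.org_bot) are illustrative assumptions, not any publisher's actual configuration. Keep in mind that robots.txt is a voluntary signal, and the blocking described above reportedly goes beyond it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only. The user-agent tokens
# are assumptions; check each crawler operator's documentation for the
# tokens it actually honors.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story.html"  # placeholder URL
    for agent in ("GPTBot", "archive.org_bot"):
        print(agent, "allowed:", allowed(ROBOTS_TXT, agent, page))
```

Under these assumed rules, the AI-training crawler is refused and the archival crawler falls through to the permissive default. That is the kind of distinction a blanket block throws away.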

Mark Graham's response

Mark Graham, the Wayback Machine's director, pushed back directly in a February 2026 blog post, calling the AI scraping fears "unfounded" when applied to the Archive. He noted that the Archive actively works to prevent abuse of its systems.

Tech writer Mike Masnick warned that this blocking represents "a mistake we're going to regret for generations."

The legal argument the EFF is making

The EFF's core argument is straightforward: archiving and search are well-established fair use under U.S. law. Courts recognized this in Authors Guild v. Google, where Google's copying of entire books to build a searchable index was held to be fair use.

As EFF's Joe Mullin wrote: "Even if courts place limits on AI training, the law protecting search and web archiving is already well established."

In other words: even if AI training on copyrighted material gets restricted, that shouldn't affect the Archive's mission to preserve the web.

The deeper problem: If publishers block the Archive to fight AI, huge portions of the web's history disappear. Wikipedia alone would lose access to 2.6 million preserved news articles. Legal cases that rely on archived web evidence would be weakened. And the public loses its ability to hold publishers accountable for what they've written.

What you can do about it

If you care about preserving the open web, the EFF suggests a few things:

• Use the Wayback Machine (web.archive.org) and save important pages before they disappear

• Support the Internet Archive: it's a nonprofit that runs on donations

• Push back on publishers who conflate preservation with AI scraping
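If you'd rather check programmatically whether a page already has a snapshot, the sketch below shows one way to do it against the Wayback Machine's availability endpoint (https://archive.org/wayback/available). The helper names are hypothetical, and the response shape handled here ("archived_snapshots" containing a "closest" entry) is an assumption worth verifying against the Archive's own API documentation. The code only looks up existing snapshots; saving a new one is done through the form on web.archive.org.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url: str) -> str:
    """Build the lookup URL for a page (hypothetical helper)."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Pull the closest snapshot URL out of a decoded JSON response.

    Assumes the shape {"archived_snapshots": {"closest": {...}}};
    returns None when no available snapshot is reported.
    """
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Query the endpoint over the network and return a snapshot URL, if any."""
    with urlopen(availability_url(page_url), timeout=timeout) as response:
        return closest_snapshot(json.load(response))
```

Calling lookup("https://example.com/") makes one network request and returns either a snapshot URL or None. A None result is a hint that the page may be worth saving by hand.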

The AI copyright debate is important and unresolved. But as the EFF warns: "If publishers shut the Archive out, they aren't just limiting bots. They're erasing the historical record."
