AI for Automation
Back to AI News
2026-04-16Claude Sonnet 4.6AnthropicAI automationClaude API1M token context windowmodel deprecationdeveloper toolsClaude update 2026

Claude Sonnet 4.6: Free 1M Token Window — Sonnet 4 Retires

Claude Sonnet 4 retires June 15, 2026. Upgrade free: Sonnet 4.6 delivers 1M token context, 5x more capacity, and 600 images/request at the same price.


Anthropic has formally deprecated Claude Sonnet 4 and Claude Opus 4, setting a hard retirement date of June 15, 2026. That gives developers using either model fewer than 60 days to migrate their apps — after that date, all requests to those model strings return errors with no fallback. The upgrade path (Sonnet 4.6 and Opus 4.6) delivers a concrete capability jump at the same price tier.

Starting this month, the 1 million token context window (approximately 750,000 words — the equivalent of loading about 4 full-length novels into a single request) is generally available for Sonnet 4.6 and Opus 4.6 at standard pricing. No special beta flags, no price premium. For developers processing large codebases, legal documents, or long agentic workflows, this single change may be Anthropic's most impactful free capability upgrade in 2026.

Claude Sonnet 4.6 by Anthropic — 1M token context window upgrade replacing Claude Sonnet 4, retirement June 2026

Three Active Claude API Deadlines — One Expires in 3 Days

The retirement timeline is more aggressive than the June 15 headline suggests. Three separate cutoffs are now active at once:

  • April 19, 2026 (3 days away): Claude Haiku 3 retires completely. Any production application calling claude-haiku-3-20240307 will begin failing. Migrate to Haiku 4.5 immediately — this is the most urgent action item in this entire release.
  • April 30, 2026: The 1M token context window beta support ends for Sonnet 4.5 and Sonnet 4. If you are accessing the 1M window on these older models via a beta header, that path closes at the end of the month. Sonnet 4.6 is the only supported path to the 1M window after April 30.
  • June 15, 2026: Claude Sonnet 4 (claude-sonnet-4-20250514) and Claude Opus 4 (claude-opus-4-20250514) fully retire. All requests to these model IDs return errors.

Note that Claude Sonnet 3.7 and Claude Haiku 3.5 are already gone — those endpoints return errors today. Claude Opus 3 is also fully retired, though researchers can apply for continued access via Anthropic's External Researcher Access Program.

What the Claude Sonnet 4.6 Upgrade Actually Gives You

This is not a version number shuffle. Moving from Sonnet 4 to Sonnet 4.6 involves three concrete capability gains that matter in production:

The 1M Token Window at Zero Extra Cost

A "token" (the unit AI models use to read and process text — roughly three-quarters of an English word) context window of 1,000,000 means Claude can hold the equivalent of a small book, a large codebase, or a 10-hour conversation transcript in active memory during a single request. The previous standard was 200,000 tokens — a 5x increase with no price change.

Three additional things change alongside this upgrade:

  • Media support jumps from 100 to 600 images or PDF pages per request when using 1M context — critical for document-heavy pipelines processing annual reports, contracts, or image-rich datasets simultaneously
  • Rate limits are unified — the separate dedicated rate limit bucket for 1M context requests is eliminated; your standard account limits apply uniformly across all context lengths
  • No beta header required — the feature is now built into the model directly, reducing operational overhead in production deployments that previously needed header management

300,000 Output Tokens on Batch Jobs

The Message Batches API (Anthropic's async bulk-processing system — used for running thousands of Claude requests in parallel at reduced cost, typically overnight or in background queues) now supports up to 300,000 output tokens per message for Opus 4.6 and Sonnet 4.6. This requires the output-300k-2026-03-24 beta header, but it unlocks generation of full code repositories, lengthy regulatory filings, or large structured datasets in a single batch item — a major limit that previously forced developers to split long outputs across multiple calls.

2.5x Faster Output in Fast Mode

Opus 4.6 ships with a fast mode research preview that delivers up to 2.5x faster output token generation at premium pricing (exact rates not yet publicly published). This is designed for latency-sensitive agentic steps where you need Opus-level intelligence but cannot wait for standard inference delays — useful in real-time assistant UIs or multi-step reasoning chains where every second of wait time is visible to users.

Three New Claude Beta Tools for AI Automation

Beyond the model retirements, Anthropic released several new infrastructure tools specifically for building production-grade agentic systems:

Managed Agents — Execution Fully Handled by Anthropic

Claude Managed Agents (now in public beta) provides a secure execution sandbox (an isolated environment where Claude runs tools and code without touching your main infrastructure or requiring your team to manage containers) with SSE streaming (Server-Sent Events — a protocol for receiving real-time output as it generates, rather than waiting for the full response to complete), built-in tool access, and Anthropic-managed security guardrails. For developers currently managing their own agent orchestration loops, this offloads that complexity entirely to Anthropic's infrastructure.

Advisor Tool — High-Intelligence Oversight Mid-Generation

The Advisor Tool (public beta) pairs a fast, lighter "executor" model with a slower, more intelligent "advisor" model that provides strategic guidance during generation. Per Anthropic's documentation: "Pair a faster executor model with a higher-intelligence advisor model that provides strategic guidance mid-generation." This two-model architecture targets long-horizon agentic tasks (workflows that run for minutes or hours with many branching decisions requiring judgment) where neither pure speed nor pure accuracy alone meets the production requirement. Think of it as a fast junior developer writing code while a senior architect reviews each step in real time.

Compaction API — Effectively Unlimited Conversation Length

The Compaction API (beta) addresses context overflow in long conversations by performing server-side summarization — compressing earlier messages into a compact digest while preserving key facts and decisions. Anthropic describes this as enabling "effectively infinite conversations." The Python and TypeScript SDKs now also include client-side compaction built into the tool_runner utility, handling context management automatically without requiring custom summarization code from developers.

Anthropic platform.claude.com developer hub — Claude Sonnet 4.6 and Opus 4.6 AI automation and agentic workflow documentation

How to Migrate to Claude Sonnet 4.6 Before the Cutoffs

For most developers, migration is a single-line model string update. Here is the complete set of swaps for all three active retirement paths:

# URGENT: Haiku 3 retires April 19, 2026 (3 days)
# Change this:
model = "claude-haiku-3-20240307"
# To this:
model = "claude-haiku-4-5-20251001"

# Sonnet 4 retires June 15, 2026
# Change this:
model = "claude-sonnet-4-20250514"
# To this:
model = "claude-sonnet-4-6-20260401"

# Opus 4 retires June 15, 2026
# Change this:
model = "claude-opus-4-20250514"
# To this:
model = "claude-opus-4-6-20260401"

The new ant CLI (Anthropic's command-line tool for interacting directly with Claude — uses YAML files to version and share prompt configurations across your team) can help test the new models before switching production traffic. New to the Claude API? Our Claude API setup guide walks through environment configuration and authentication:

npm install -g @anthropic-ai/cli
ant run --model claude-sonnet-4-6-20260401 --prompt "Summarize this document"

One breaking change requires attention: Opus 4.6 no longer supports prefilling (the technique of sending a partial assistant message to steer Claude's response format or tone). If your production prompts rely on the assistant prefill pattern, test thoroughly on Opus 4.6 before June 15 — this change has no backward-compatible workaround.

Finally, note that console.anthropic.com now redirects to platform.claude.com as part of Anthropic's brand consolidation. If you have hardcoded console URLs in internal wikis, CI pipelines, or monitoring dashboards, update those references now. Set a calendar reminder for May 30 — two weeks before the June 15 cutoff — to complete and test your model string migrations before the final deadline arrives.

Related ContentGet Started | Guides | More News

Stay updated on AI news

Simple explanations of the latest AI developments