GPT-5.4 drops with computer use — 60% cheaper than Claude
GPT-5.4 includes a native Computer Use API, scores 75% on desktop automation benchmarks, and costs 60% less than Claude Opus 4.6 on output tokens.
GPT-5.4 landed on March 5, 2026, and it isn't just another incremental update. OpenAI's most capable model to date introduces two features that change how the AI pricing war plays out: a built-in Computer Use API (an interface that lets the model see your screen, click buttons, type text, and scroll — controlling your desktop like a human would) and output token prices that are 60% cheaper than Claude Opus 4.6. At $30 per million output tokens versus Anthropic's $75, teams running high-volume workflows just got a compelling reason to re-evaluate their stack.
Available simultaneously in ChatGPT Plus ($20/month), ChatGPT Pro ($200/month), Enterprise plans, and the OpenAI API, GPT-5.4 is live right now — no waitlist, no preview access required.
What GPT-5.4 Can Actually Do
The headline feature is Computer Use — and GPT-5.4 doesn't just attempt it; it outperforms humans at it. On OSWorld-Verified (a standardized test where AI models complete real desktop tasks like filling spreadsheets, managing files, and navigating web apps), GPT-5.4 scored 75.0%, surpassing the human baseline of 72.4%. This makes it the first general-purpose OpenAI model that's genuinely viable for automating desktop workflows without a developer writing custom scripts.
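Under the hood, computer-use systems run an observe-act loop: screenshot in, one action out, repeat until the task is done. Here's a minimal sketch of that loop; every name in it (`take_screenshot`, `model_next_action`, `perform`) is a hypothetical placeholder, not a real OpenAI API call.

```python
def run_desktop_task(goal, take_screenshot, model_next_action, perform, max_steps=20):
    """Observe-act loop: capture the screen, ask the model for one
    action, execute it, and repeat until the model signals completion."""
    for _ in range(max_steps):
        screen = take_screenshot()                # pixels the model "sees"
        action = model_next_action(goal, screen)  # e.g. {"type": "click", "x": 100, "y": 40}
        if action["type"] == "done":
            return True                           # model declared the task finished
        perform(action)                           # click / type / scroll on the desktop
    return False                                  # step budget exhausted
```

The step cap matters in production: without it, a confused model can click in circles indefinitely.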
For coding, it scores 95.1% on HumanEval (the standard benchmark for AI programming ability — higher score = fewer mistakes writing real code), edging past Claude Opus 4.6 at 94.6%. On SWE-bench Verified (a tougher test using real open-source software bugs from GitHub), it hits ~80.0%, just behind Claude Opus 4.6's 80.8%.
The math and reasoning numbers are eye-catching too: 97.2% on MATH-500 at maximum reasoning effort, and 81.2% on MMMU-Pro (a visual reasoning benchmark — think reading charts, diagrams, and mixed image-text problems). Across 44 professions covering the top 9 US GDP-contributing industries, professional evaluators rated GPT-5.4's output as better than human expert output in 83% of cases.
The Pricing That Flips the Calculation
GPT-5.4 vs Claude Opus 4.6 — Cost Per Million Tokens
- GPT-5.4 Standard: $10 input / $30 output
- Claude Opus 4.6: $15 input / $75 output
- GPT-5.4 Pro Tier: $30 input / $180 output (for compute-intensive enterprise use)
- Savings on output: 60% cheaper at standard tier
A team sending 10 million output tokens per day saves about $450 a day — roughly $164,000 per year — switching from Claude Opus 4.6 to GPT-5.4 standard, at similar benchmark performance.
There's also a Tool Search feature that delivers 47% token savings on tool-heavy workflows (situations where the AI calls external services like web search or databases repeatedly). For agentic pipelines (automated sequences where AI takes multiple steps to complete a task), that's a significant efficiency gain on top of the already-lower base price.
Five Levels of Reasoning — You Control the Cost
GPT-5.4 introduces a configurable reasoning dial with five settings: none, low, medium, high, and xhigh. Think of it like choosing between a quick answer and a deep analysis. For simple tasks — classifying emails, summarizing a paragraph — use low to minimize API costs. For complex multi-step problems like debugging tricky code or solving advanced math, crank it to xhigh for the best accuracy. The 97.2% MATH-500 score was measured at xhigh reasoning effort.
This is especially useful for developers building products that mix routine tasks (cheap) with occasional hard problems (expensive) — you can tune cost per request rather than paying peak rates for everything.
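A simple way to implement that tiering is a lookup from task type to effort level, with a middle-of-the-road default. The task names and mapping below are illustrative assumptions, not anything GPT-5.4 prescribes:

```python
# Hypothetical routing table: cheap effort for routine tasks,
# maximum effort only where accuracy justifies the cost.
EFFORT_BY_TASK = {
    "classify": "low",
    "summarize": "low",
    "extract": "medium",
    "debug": "xhigh",
    "math": "xhigh",
}

def pick_effort(task_type):
    # Fall back to "medium" for anything not explicitly mapped.
    return EFFORT_BY_TASK.get(task_type, "medium")
```

The returned string would then be passed as the `reasoning_effort` parameter on each API call.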
Context Window: 272K Standard, 1.05M for Extended Workloads
The standard context window (the maximum amount of text the model can read and remember in one session — think of it as working memory) is 272,000 tokens, roughly equivalent to 200,000 words or 3–4 full novels. For teams that need even more, the Codex tier extends this to 1.05 million tokens, though inputs over 272K are billed at 2× the base rate. Visual inputs support up to 10.24 million pixels per image — high-res diagrams, schematics, and screenshots are fully readable.
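One possible reading of the Codex-tier billing rule — that a request crossing the 272K threshold is charged entirely at double the base input rate — can be sketched as follows. Whether the surcharge applies to the whole input or only the excess is an assumption here:

```python
BASE_INPUT_RATE = 10.0  # USD per 1M input tokens (GPT-5.4 standard)
THRESHOLD = 272_000     # tokens included at the base rate

def input_cost(tokens):
    """Input cost, assuming the whole request is billed at 2x the
    base rate once it exceeds the 272K-token threshold."""
    rate = BASE_INPUT_RATE * (2 if tokens > THRESHOLD else 1)
    return tokens / 1e6 * rate

print(input_cost(100_000))  # 1.0  (under threshold, base rate)
print(input_cost(500_000))  # 10.0 (over threshold, doubled rate)
```

Either way, the takeaway is the same: past 272K tokens, long-context requests get markedly more expensive, so it pays to trim inputs where you can.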
Who Benefits Most — and What to Do Right Now
If you're building AI products: GPT-5.4's pricing makes it the most cost-competitive frontier model for high-output production use. The reasoning effort levels let you build tiered cost structures inside your product — cheap for simple queries, expensive only when needed.
If you're using AI for desktop automation: Computer Use is now production-ready. GPT-5.4 can browse websites, fill forms, operate software, and extract data from desktop apps without custom integration work. For non-technical users, this means describing a workflow in plain English and having the AI execute it.
If you're comparing Claude vs OpenAI: For raw coding, Claude Opus 4.6 still has a slight SWE-bench edge (80.8% vs 80.0%). For cost-per-result across mixed workloads, GPT-5.4 now wins. For computer control, GPT-5.4 is the only option with a built-in Computer Use API.
Start Using GPT-5.4 via API
```shell
pip install openai
```

```python
from openai import OpenAI

client = OpenAI()

# Standard completion with reasoning control
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Your task here"}],
    reasoning_effort="high",  # none | low | medium | high | xhigh
)
print(response.choices[0].message.content)
```
ChatGPT Plus and Pro users can access GPT-5.4 directly at chat.openai.com — switch the model selector to GPT-5.4. API access is live at platform.openai.com.