OpenAI Launches GPT-5.4: Native Computer Control, 1 Million Token Context, and a New Era for AI Agents

The biggest GPT update since GPT-5

The GPT-5.4 launch brings desktop automation, professional-grade reasoning, and a reworked tool architecture — signaling OpenAI’s push to make AI a genuine co-worker.

Published March 5, 2026

OpenAI on Thursday unveiled GPT-5.4, calling it the company’s “most capable and efficient frontier model for professional work.” The release arrives in three variants — a standard API model, GPT-5.4 Thinking for reasoning-heavy tasks inside ChatGPT, and GPT-5.4 Pro for enterprise customers who need maximum performance on complex workloads. Together, they represent a significant leap in what a general-purpose language model can do.

Personal Experience: VS Code and Codex

I recently started using VS Code with Codex as my main driver, and it has worked wonders for iterating on web development work. While Gemini did the heavy lifting in designing the entire HMMüller tech front end, Codex 5.3 at very high settings has been flawless in my work. I will be trying it out with MCPs, skills, and agentic workflows over the next few days.

The launch comes just two days after OpenAI teased the model on X with the cryptic message “5.4 sooner than you think,” posted within hours of the GPT-5.3 Instant release. The rapid succession underscores an accelerating iteration cycle, likely driven by competitive pressure from Anthropic’s Claude 4 series and Google’s Gemini lineup.

The Headline Feature: Native Computer Use

GPT-5.4 is the first general-purpose model from OpenAI that can directly control a desktop environment. It reads screenshots, issues keyboard commands, and operates a mouse — enabling it to navigate software, fill out forms, and execute multi-step workflows across different applications without relying on a separate specialised agent.

The timing is no accident. Apple recently integrated agentic AI coding support in Xcode 26.3, and OpenAI launched a dedicated Codex app for macOS. Developers can now direct GPT-5.4 to handle complex coding tasks across local applications, with configurable safety guardrails. The model also processes dense, high-resolution images more accurately, improving document parsing and click precision when operating software.

The model posted record scores on the OSWorld-Verified and WebArena Verified benchmarks, both of which test real-world computer use — a signal that this is not a gimmick but a core capability.
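To make the screenshot-in, action-out pattern concrete, here is a minimal sketch of the kind of loop a computer-use agent runs. It is not OpenAI's API: the `Action` type, `run_agent`, and the scripted stand-in for the model are hypothetical names invented for illustration, and the screenshot and input functions are stubs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step the agent wants performed (hypothetical shape)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    policy: Callable[[bytes, List[Action]], Action],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask the policy for an action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):          # hard cap as a simple guardrail
        action = policy(take_screenshot(), history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


def scripted_policy(screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for the model: replays a fixed script instead of reasoning."""
    script = [
        Action("click", x=120, y=48),
        Action("type", text="hello"),
        Action("done"),
    ]
    return script[len(history)]


executed: List[Action] = []
trace = run_agent(scripted_policy, lambda: b"", executed.append)
print([a.kind for a in trace])  # ['click', 'type', 'done']
```

In a real deployment the policy would be a model call and `execute` would drive the operating system; the step cap is one simple example of the kind of guardrail a developer might configure.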

A Million-Token Context Window

The API version of GPT-5.4 supports context windows of up to one million tokens, by far the largest OpenAI has ever offered. That is 2.5 times the 400,000-token limit of GPT-5.3 and puts the model on par with long-context offerings from Google and Anthropic. The expanded window enables processing of large codebases, lengthy legal documents, and complex multi-step agent workflows without losing context.
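For a rough sense of what the larger window buys, the sketch below checks whether a set of documents fits under each limit. The four-characters-per-token figure is a common rule of thumb and an assumption here, not a tokenizer; real counts vary by language and content. The function names are made up for this example.

```python
from typing import Iterable

GPT_5_4_CONTEXT = 1_000_000   # tokens, per the article
GPT_5_3_CONTEXT = 400_000     # tokens, per the article
CHARS_PER_TOKEN = 4           # ASSUMPTION: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate token count by ceiling-dividing the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(texts: Iterable[str], limit: int, reserve_for_output: int = 0) -> bool:
    """True if the estimated input plus reserved output fits in the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_output <= limit


# A synthetic 2.4-million-character corpus, roughly 600,000 estimated tokens.
docs = ["x" * 2_400_000]

print(GPT_5_4_CONTEXT / GPT_5_3_CONTEXT)  # 2.5
print(fits(docs, GPT_5_4_CONTEXT))        # True
print(fits(docs, GPT_5_3_CONTEXT))        # False
```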

Improved Accuracy and Reduced Hallucinations

OpenAI reports measurable gains in factual reliability. Compared to GPT-5.2, the new model is 33% less likely to produce errors in individual claims, and full responses are 18% less likely to contain factual mistakes overall. The model also scored 83% on OpenAI’s GDPval benchmark for knowledge work tasks — a new record — and took the top spot on Mercor’s APEX-Agents benchmark for professional skills in law and finance.
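Both percentages are relative reductions, so what they mean in absolute terms depends on the starting error rate, which OpenAI's figures as quoted here do not give. The sketch below works through one case; the 10% baseline is invented purely for illustration.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Return the new rate after cutting baseline_rate by a relative fraction."""
    return baseline_rate * (1 - reduction)


# ASSUMPTION: a hypothetical baseline where 10% of individual claims are wrong.
hypothetical_baseline = 0.10
new_rate = apply_relative_reduction(hypothetical_baseline, 0.33)

print(f"{new_rate:.1%}")  # 6.7%
```

In other words, a 33% relative reduction from that assumed baseline would mean roughly 3.3 fewer erroneous claims per hundred, not 33 fewer.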

Key Performance Metrics

Max Context Window (API): 1,000,000 tokens
Claim-Level Error Reduction: 33% fewer errors vs. GPT-5.2
Response-Level Error Reduction: full responses 18% less likely to contain factual mistakes vs. GPT-5.2
GDPval Score: 83% (record)
API Pricing (Standard): $2.50 / $15 per M tokens (in/out)
API Pricing (Pro): $30 / $180 per M tokens (in/out)

GPT-5.4 Thinking: Interactive Reasoning

Inside ChatGPT, the Thinking variant introduces a new interaction pattern: the model generates an upfront plan of its reasoning before writing a full response. Users can interrupt and redirect the model mid-response if it’s heading in the wrong direction, arriving at a better output without starting over. OpenAI says this is particularly useful for deep web research, where the model now maintains context better when searching across multiple rounds of information gathering.

GPT-5.4 Thinking is rolling out today for ChatGPT Plus, Team, and Pro subscribers, replacing GPT-5.2 as the default reasoning model. The older model will remain accessible in a legacy menu for three months before being retired in June 2026.

Tool Search: A Smarter Approach to Tool Calling

One of the more technically significant changes is a new system called Tool Search. Previously, every API request had to include full definitions for all available tools in the system prompt — an approach that consumed large numbers of tokens as tool libraries grew. Tool Search replaces this with a lightweight index: the model only pulls a tool’s full definition into context when it actually needs to use it. OpenAI says this can cut token usage by nearly half in complex, multi-step workflows, making agentic applications both faster and cheaper to run.
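The idea can be illustrated with a toy comparison between sending every tool definition up front and sending a short index plus only the definitions actually used. This is a sketch of the general lazy-loading pattern, not OpenAI's implementation: the tool names, the `LazyToolContext` class, and the use of JSON character counts as a stand-in for tokens are all assumptions made for this example.

```python
import json
from typing import Dict

# Hypothetical tool library; names and schemas are invented for illustration.
TOOL_DEFINITIONS: Dict[str, dict] = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string, required",
                       "units": "string, celsius or fahrenheit",
                       "include_forecast": "boolean, default false"},
    },
    "send_email": {
        "description": "Send an email message.",
        "parameters": {"to": "string, required",
                       "subject": "string, required",
                       "body": "string, required",
                       "cc": "list of strings, optional"},
    },
    "query_database": {
        "description": "Run a read-only SQL query.",
        "parameters": {"sql": "string, required",
                       "timeout_seconds": "integer, default 30",
                       "max_rows": "integer, default 1000"},
    },
}


def rough_size(obj) -> int:
    """Serialized character count, used here as a crude proxy for tokens."""
    return len(json.dumps(obj))


class LazyToolContext:
    """Keeps a name-to-description index; loads full definitions on demand."""

    def __init__(self, definitions: Dict[str, dict]):
        self._definitions = definitions
        self.index = {name: d["description"] for name, d in definitions.items()}
        self.loaded: Dict[str, dict] = {}

    def load(self, name: str) -> dict:
        """Pull one tool's full definition into context the first time it is used."""
        if name not in self.loaded:
            self.loaded[name] = self._definitions[name]
        return self.loaded[name]

    def context_size(self) -> int:
        return rough_size(self.index) + rough_size(self.loaded)


eager_size = rough_size(TOOL_DEFINITIONS)   # every definition sent up front
ctx = LazyToolContext(TOOL_DEFINITIONS)
ctx.load("get_weather")                     # the workflow only needs one tool
lazy_size = ctx.context_size()

print(f"eager: {eager_size} chars, lazy: {lazy_size} chars")
```

With only three small tools the saving is modest; the gap widens as the library grows and each request touches only a few of its tools, which is the situation the article describes.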

Pricing and Availability

The standard GPT-5.4 API model is priced at $2.50 per million input tokens and $15 per million output tokens. The Pro variant commands a substantial premium at $30 input / $180 output per million tokens, aimed squarely at enterprise and academic customers tackling the most demanding workloads.
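To see what those rates mean per request, here is a small cost calculator using the prices quoted above. The request size (200,000 input tokens, 10,000 output tokens) is a made-up example, and the function and dictionary names are illustrative only.

```python
# USD per million tokens (input, output), as quoted in the article.
PRICES = {
    "standard": (2.50, 15.00),
    "pro": (30.00, 180.00),
}


def request_cost(variant: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of a single request at the listed rates."""
    in_price, out_price = PRICES[variant]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical request: a 200,000-token prompt with a 10,000-token answer.
standard = request_cost("standard", 200_000, 10_000)
pro = request_cost("pro", 200_000, 10_000)

print(f"standard: ${standard:.2f}")  # standard: $0.65
print(f"pro: ${pro:.2f}")            # pro: $7.80
```

Because both the input and output rates for Pro are exactly twelve times the standard rates, any request costs twelve times as much on Pro regardless of its input-to-output mix.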

ChatGPT users on Plus, Team, and Pro plans get access to GPT-5.4 Thinking starting today on web and Android, with iOS support expected soon. OpenAI also launched a dedicated ChatGPT for Excel add-in for enterprise customers alongside the model release.

Safety: Chain-of-Thought Transparency

OpenAI included a new safety evaluation specifically targeting chain-of-thought deception — the concern that reasoning models could misrepresent their internal thought process. The company says its testing shows that deception is less likely in GPT-5.4 Thinking than in prior models, concluding that chain-of-thought monitoring remains an effective safety tool for this generation of models.

The Competitive Landscape

GPT-5.4 enters a fiercely competitive market. Anthropic’s Claude Opus 4.6 has been gaining traction among developers and power users, while Google’s Gemini 3.1 lineup offers comparable long-context capabilities. The native computer-use feature puts OpenAI in direct competition with Anthropic’s own computer use offering, which launched months earlier.

What’s notable is the pace. With GPT-5.3 Codex, GPT-5.3 Instant, and now GPT-5.4 all shipping within weeks of each other, OpenAI appears to have shifted to a near-monthly model release cadence — a dramatic change from the slower rollouts of earlier years.

The Bottom Line

GPT-5.4 is not a minor refresh. Native computer control, a 1-million-token context window, fewer factual errors than GPT-5.2, and a rethought tool architecture make it a genuinely significant release. For developers building AI agents and automated workflows, the combination of computer use and Tool Search opens up new categories of applications. For everyday ChatGPT users, the interactive reasoning in the Thinking variant should make complex tasks feel considerably less frustrating.

Whether GPT-5.4 can hold its lead in an increasingly crowded field remains to be seen — but for today, it sets a new bar for what a general-purpose AI model can do.