GPT-5.4, released by OpenAI on March 5, 2026, is the first general-purpose AI model with native computer-use capabilities — the ability to see a screen, move a cursor, click, type, and execute multi-step workflows without wrapper code or external tools. It scored 75.0% on OSWorld-Verified, surpassing the human expert baseline of 72.4%, making it the strongest computer-operating AI model available. 1)
GPT-5 was first released on August 7, 2025, as OpenAI's flagship multimodal model excelling in coding, math, writing, and visual perception. 2) The computer-use capability arrived with the GPT-5.4 update on March 5, 2026, representing what many analysts consider OpenAI's most significant product advancement since the original ChatGPT launch.
Unlike previous browser-automation approaches that relied on brittle scripts or browser-specific APIs, GPT-5.4 operates at the visual level:
The key innovation is that GPT-5.4 was trained end-to-end for computer use tasks — the capability is built into the model weights, not added as a plugin or post-training layer. The model supports both Playwright code generation for browser automation and direct mouse/keyboard commands from screenshots. 3)
Anthropic introduced computer use for Claude in October 2024, establishing the category. Key differences:
GPT-5.4 is available in three configurations: