OpenAI Releases GPT-5 With Enhanced Reasoning Capabilities

SAN FRANCISCO — OpenAI on Tuesday announced the release of GPT-5, its most powerful language model to date, claiming the system achieves human-level performance on a broad range of professional benchmarks including the bar exam, the medical licensing exam, and the United States Physics Olympiad.

The model, which the company says was trained on roughly 10 trillion tokens of text and code, demonstrates what OpenAI describes as a 40 percent improvement on multi-step reasoning tasks compared with its predecessor, GPT-4. OpenAI chief executive Sam Altman described GPT-5 as "a genuine step toward systems that can act as expert collaborators across every domain of knowledge."

In independent testing conducted by researchers at MIT and Stanford, GPT-5 scored in the 95th percentile on the Graduate Record Examination (GRE) and achieved a 92 percent accuracy rate on a curated dataset of 5,000 factual questions drawn from Wikipedia. The model also reduced hallucination rates by approximately 60 percent relative to GPT-4, according to the company's internal evaluations.

Access to GPT-5 will be tiered: free-tier users will receive 10 messages per day using a compressed version of the model, while ChatGPT Plus subscribers paying $20 per month will have unlimited access to the full model. Enterprise customers, who pay a minimum of $30 per user per month, will gain access to an extended-context version that supports up to 200,000 tokens per conversation — enough to process a 150,000-word novel in a single session.
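The novel figure is consistent with the common rule of thumb that English prose averages roughly 1.3 tokens per word; the exact ratio depends on the tokenizer and is an assumption in this sketch:

```python
# Sanity-check the 200,000-token context claim against a 150,000-word novel,
# assuming ~1.3 tokens per English word (tokenizer-dependent; an estimate, not
# a figure from OpenAI's announcement).

TOKENS_PER_WORD = 1.3  # assumed average for English prose

def estimated_tokens(word_count: int) -> int:
    """Estimate the token count for a text of the given word count."""
    return round(word_count * TOKENS_PER_WORD)

novel_words = 150_000
tokens = estimated_tokens(novel_words)
print(tokens)                    # about 195,000 tokens
print(tokens <= 200_000)         # fits within the extended context window
```

Under this assumed ratio, a 150,000-word novel comes to about 195,000 tokens, just inside the 200,000-token limit the article cites.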

The announcement comes as competition in the AI sector has intensified. Google's Gemini Ultra and Anthropic's Claude 3 Opus have both posted strong benchmark numbers in recent months, and Meta released the open-source Llama 3 model in April, which reached performance parity with GPT-4 on several tasks. Despite this, OpenAI maintains that GPT-5 sets a new state of the art across the majority of published evaluations.

Safety testing for GPT-5 took approximately eight months and involved more than 200 red-team researchers across academia and industry. The company says it identified and mitigated 14 high-severity risk categories before deployment, including the generation of detailed instructions for synthesizing chemical weapons and the systematic manipulation of democratic elections. OpenAI is releasing a 47-page system card alongside the model detailing the evaluation methodology and residual risks.
