New OpenAI models "think" with images

Illustration: Aïda Amer/Axios
OpenAI on Wednesday released two new AI models — o3 and o4-mini — designed to handle a broader range of tasks, from coding to visual analysis.
Why it matters: OpenAI says o3 is its most advanced reasoning model yet and the first in its series of models to handle web browsing, image generation and visual understanding.
Driving the news: OpenAI said it was immediately releasing o3 as well as o4-mini, which it bills as a "smaller, faster model that delivers impressive results — especially in math, coding, and visual tasks — at lower cost."
- The integration of web browsing and image capabilities into the new models "helps them solve complex, multi-step problems more effectively and take real steps toward acting independently," OpenAI said.
- "OpenAI o3 and o4-mini are our first models that can think with images — meaning they don't just see an image, they can integrate visual information directly into their reasoning chain."
- Both models will be available imminently for ChatGPT Plus, Pro, and Team users, with o3-pro coming "in a few weeks."
- OpenAI is also debuting Codex CLI, which it describes as "a lightweight, open-source coding agent that runs locally" in a computer's terminal app and works with o3 and o4-mini.
Catch up quick: OpenAI initially planned a standalone release for o3, then said in February that it would fold the model into the release of GPT-5. Earlier this month it reversed course again, with CEO Sam Altman saying o3 and o4-mini would soon be released and GPT-5 would follow "in a few months."
The intrigue: The models were both evaluated using the revised preparedness framework that OpenAI announced on Tuesday.
- OpenAI said the models were not scored for persuasion capabilities, since that metric is no longer included under the new framework; however, OpenAI stressed that it still examines model persuasiveness as part of its broader safety work.
