By Emil Bjerg, journalist and editor
OpenAI has revealed several upgrades to GPT-4. Here’s what you need to know.
Weeks of speculation culminated on Monday when Mira Murati, CTO at OpenAI, led the company’s spring update. Contrary to the speculation leading up to the event, the reveal was neither an AI-powered search engine nor GPT-5. Instead, OpenAI has added several features to GPT-4, potentially changing how many people interact and collaborate with AI.
“The big news today is that we’re launching our new flagship model. And we’re calling it GPT-4o,” Murati announced.
With the roll-out of GPT-4o, users can expect upgrades in speed, user interaction, and pricing. Here’s what you need to know.
A desktop version and an updated app
OpenAI is releasing GPT-4o as a desktop app. OpenAI says that for “both free and paid users, we’re also launching a new ChatGPT desktop app for macOS that is designed to integrate seamlessly into anything you’re doing on your computer. With a simple keyboard shortcut (Option + Space), you can instantly ask ChatGPT a question.”
In other words, instead of having to switch tabs to get help from ChatGPT, desktop-app users can keep an ongoing dialogue that blends into their various workflows.
Sound and vision
GPT-4o feels much more like an assistant than previous models did. This is cemented by a voice feature that lets users talk with the chatbot in real time. As MIT Technology Review writes, GPT-4o “looks like a supercharged version of assistants like Siri or Alexa”. Where earlier assistants like Siri and Alexa have been mocked for misunderstanding user intent, OpenAI showcased much more cohesive and complex interactions between chatbot and user.
Adding to the feeling of an assistant, GPT-4o will have a “sense of continuity”, as Murati said, allowing users to draw on past interactions with the chatbot.
Signaling a contrast with Google’s presentation of Gemini half a year ago, OpenAI made a point of emphasizing how all interactions with the upgraded GPT-4 were happening in real time. In the demo, Barret Zoph and Mark Chen, researchers at OpenAI, walked the audience through several use cases, from storytelling to math challenges.
The researchers had GPT-4o guide them through solving a simple equation just by showing the chatbot handwritten notes. With the equation solved, they highlighted that the new ‘omni’ model is more multimodal than the current GPT-4. “GPT-4o reasons across voice, text, and vision”, as Murati said. GPT-4o is ‘natively multimodal’, Sam Altman tweeted in parallel, meaning it can move seamlessly between text, audio, and video input.
Perhaps most impressive, alongside the relatively seamless voice conversations, is GPT-4o’s ability to ‘understand’ emotions. From reading a person’s mood from their facial expressions to changing tonality while telling a story, the demo pointed to how GPT-4o might set new standards for AI-human interaction. This emotional awareness could allow GPT-4o to hold more empathetic and personalized conversations, valuable for applications ranging from personal assistance to customer service and other social interactions.
Access and pricing
The new GPT-4o will roll out over the coming weeks and will be available in 50 languages, making it accessible to 97% of people globally, according to OpenAI.
Murati highlighted OpenAI’s drive to make its AI tools available to everyone for free. In the coming weeks, both GPT-4 and custom GPTs, which let users spar with an AI on niche subjects, will be accessible free of charge. Free users will also have access to the new features that come with GPT-4o.
There is a catch, though: paid users will have up to five times more capacity.
Relevant for developers, GPT-4o will also be available through the API. According to Sam Altman, the GPT-4o API will be half the price of GPT-4 Turbo, twice as fast, and have five times higher rate limits.
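For developers, a request to the new model should look much like a request to GPT-4 Turbo. Below is a minimal, illustrative sketch using OpenAI’s Python SDK, loosely mirroring the handwritten-equation demo described above; the model identifier “gpt-4o”, the example image URL, and the prompt are assumptions made for illustration rather than details taken from OpenAI’s documentation.

```python
# Minimal sketch of calling GPT-4o through OpenAI's Python SDK.
# Assumes the `openai` package (v1+) is installed and OPENAI_API_KEY is set;
# the model name "gpt-4o" and the image URL are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A multimodal request: text plus an image in the same message,
# the kind of mixed input the "natively multimodal" claim refers to.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Guide me through solving the equation in this "
                            "photo without giving away the answer.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/handwritten-equation.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```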
Reception and Google’s I/O presentations
GPT-4o’s reception has generally been positive, with MIT Technology Review highlighting its real-time reasoning capabilities across audio, vision, and text, and The Verge commenting that the update cures GPT-4’s ‘laziness’.
OpenAI made sure to schedule its presentation just 24 hours ahead of Google’s I/O event, where Google presents recent developments in its AI work. Despite its dominant position in search, Google has lately been on the defensive in the AI race. As the competition heats up, this week’s presentations set the stage for the next phase of the race to dominate AI.