OpenAI's GPT-4o: Revolutionizing AI with instant sound and image analysis

OpenAI has unveiled GPT-4o, its latest model, which can analyse sound, images, and text in real time. Most notably, the model responds to audio input remarkably quickly.

14 May 2024 09:14

Artificial intelligence enthusiasts had eagerly awaited the OpenAI Spring Update, a presentation by the creators of ChatGPT. Anticipation was heightened by widespread industry reports that the company might present a new AI-based internet search engine. This time, however, the focus was on its latest model.

GPT-4o operates in real time

OpenAI introduced the GPT-4o model, which enables more natural interactions. According to the company, GPT-4o can respond to audio input in as little as a quarter of a second, with an average response time of around a third of a second. This is comparable to human response times in conversation. In terms of performance, the model is similar to GPT-4 Turbo on English text and performs even better in other languages.

OpenAI claims that GPT-4o is also significantly better at interpreting images and audio than existing models. So, what can the new tool do? One of the moments that impressed me most was a recording in which GPT-4o was asked to count from one to ten.

GPT-4o reacted instantly to commands to change the pace. Equally interesting was another recording in which GPT-4o took on the role of a Spanish teacher, analysing objects seen through the camera.

When can we expect access to GPT-4o? OpenAI says that the model's text and image capabilities are available in ChatGPT as of today. The new model is available free of charge, and paying subscribers get message limits up to five times higher. OpenAI also plans to roll out an alpha version of a new GPT-4o voice mode to ChatGPT Plus users in the coming weeks.

Remember, OpenAI isn't just ChatGPT. The upcoming Sora model will allow users to create videos, which many creators are particularly excited about.

© Daily Wrap