cua cà mau cua tươi sống cua cà mau bao nhiêu 1kg giá cua hôm nay giá cua cà mau hôm nay cua thịt cà mau cua biển cua biển cà mau cách luộc cua cà mau cua gạch cua gạch cà mau vựa cua cà mau lẩu cua cà mau giá cua thịt cà mau hôm nay giá cua gạch cà mau giá cua gạch cách hấp cua cà mau cua cốm cà mau cua hấp mua cua cà mau cua ca mau ban cua ca mau cua cà mau giá rẻ cua biển tươi cuaganic cua cua thịt cà mau cua gạch cà mau cua cà mau gần đây hải sản cà mau cua gạch son cua đầy gạch giá rẻ các loại cua ở việt nam các loại cua biển ở việt nam cua ngon cua giá rẻ cua gia re crab farming crab farming cua cà mau cua cà mau cua tươi sống cua tươi sống cua cà mau bao nhiêu 1kg giá cua hôm nay giá cua cà mau hôm nay cua thịt cà mau cua biển cua biển cà mau cách luộc cua cà mau cua gạch cua gạch cà mau vựa cua cà mau lẩu cua cà mau giá cua thịt cà mau hôm nay giá cua gạch cà mau giá cua gạch cách hấp cua cà mau cua cốm cà mau cua hấp mua cua cà mau cua ca mau ban cua ca mau cua cà mau giá rẻ cua biển tươi cuaganic cua cua thịt cà mau cua gạch cà mau cua cà mau gần đây hải sản cà mau cua gạch son cua đầy gạch giá rẻ các loại cua ở việt nam các loại cua biển ở việt nam cua ngon cua giá rẻ cua gia re crab farming crab farming cua cà mau
Skip to main content

ChatGPT’s latest model may be a regression in performance

chatGPT on a phone on an encyclopedia
Shantanu Kumar / Pexels
According to a new report from Artificial Analysis, OpenAI’s flagship large language model for ChatGPT, GPT-4o, has significantly regressed in recent weeks, putting the state-of-the-art model’s performance on par with the far smaller, and notably less capable, GPT-4o-mini model.

This analysis comes less than 24 hours after the company announced an upgrade for the GPT-4o model. “The model’s creative writing ability has leveled up–more natural, engaging, and tailored writing to improve relevance & readability,” OpenAI wrote on X. “It’s also better at working with uploaded files, providing deeper insights & more thorough responses.” Whether those claims continue to hold up is now being cast in doubt.

Recommended Videos

“We have completed running our independent evals on OpenAI’s GPT-4o release yesterday and are consistently measuring materially lower eval scores than the August release of GPT-4o,” the Artificial Analysis announced via an X post on Thursday, noting that the model’s Artificial Analysis Quality Index decreased from 77 to 71 (and is now equal to that of GPT-4o mini).

What’s more, GPT-4o’s performance on the GPQA Diamond benchmark decreased from 51% to 39% while its MATH benchmarks decreased from 78% to 69%.

Simultaneously, the researchers discovered more than a doubling in the speed increase of the model’s responses, accelerating from around 80 output tokens per second to roughly 180 tokens/s. “We have generally observed significantly faster speeds on launch day for OpenAI models (likely due to OpenAI provisioning capacity ahead of adoption), but previously have not seen a 2x speed difference,” the researchers wrote.

Wait – is the new GPT-4o a smaller and less intelligent model?

We have completed running our independent evals on OpenAI’s GPT-4o release yesterday and are consistently measuring materially lower eval scores than the August release of GPT-4o.

GPT-4o (Nov) vs GPT-4o (Aug):
➤… pic.twitter.com/gjY2pBFuUv

— Artificial Analysis (@ArtificialAnlys) November 21, 2024

“Based on this data, we conclude that it is likely that OpenAI’s Nov 20th GPT-4o model is a smaller model than the August release,” they continued. “Given that OpenAI has not cut prices for the Nov 20th version, we recommend that developers do not shift workloads away from the August version without careful testing.”

GPT-4o was first released in May 2024 to surpass the existing GPT-3.5 and GPT-4 models. GPT-4o offers state-of-the-art benchmark results in voice, multilingual, and vision tasks, according to OpenAI, making it ideal for advanced applications like real-time translation and conversational AI.

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
ChatGPT Search is here to battle both Google and Perplexity
The ChatGPT Search icon on the prompt window

ChatGPT is receiving its second new search feature of the week, the company announced on Thursday. Dubbed ChatGPT Search, this tool will deliver real-time data from the internet in response to your chat prompts.

ChatGPT Search appears to be both OpenAI's answer to Perplexity and a shot across Google's bow.

Read more
ChatGPT’s Advanced Voice Mode just came to PCs and Macs
ChatGPT Advanced Voice Mode Desktop app

You can now speak directly with ChatGPT right on your PC or Mac, thanks to a new Advanced Voice Mode integration, OpenAI announced on Wednesday. "Big day for desktops," the company declared in an X (formerly Twitter) post.

Advanced Voice Mode (AVM) runs atop the GPT-4o model, OpenAI's current state of the art, and enables the user to speak to the chatbot without the need for text prompts.

Read more
Your ChatGPT conversation history is now searchable
ChatGPT chat search

OpenAI debuted a new way to more efficiently manage your growing ChatGPT chat history on Tuesday: a search function for the web app. With it, you'll be able to quickly surface previous references and chats to cite within your current ChatGPT conversation.

"We’re starting to roll out the ability to search through your chat history on ChatGPT web," the company announced via a post on X (formerly Twitter). "Now you can quickly & easily bring up a chat to reference, or pick up a chat where you left off."

Read more