OpenAI slashes AI costs with high-performance GPT-4o mini

3 months ago 53

Ryan Daws is a senior editor at TechForge Media with over a decade of experience in crafting compelling narratives and making complex topics accessible. His articles and interviews with industry leaders have earned him recognition as a key influencer by organisations like Onalytica. Under his leadership, publications have been praised by analyst firms such as Forrester for their excellence and performance. Connect with him on X (@gadget_ry) or Mastodon (@gadgetry@techhub.social)

OpenAI has announced GPT-4o mini, a small model designed to make AI more accessible and affordable for developers. This new addition to the GPT family promises superior performance at a fraction of the cost of previous models, including the regular GPT-4o .

GPT-4o mini still boasts impressive capabilities, scoring 82% on the MMLU benchmark and outperforming GPT-4 on chat preferences in the LMSYS leaderboard. The model is priced at just 15 cents per million input tokens and 60 cents per million output tokens, making it significantly more cost-effective than its predecessors.

Key features:

Low cost and latency
128K token context window
Up to 16K output tokens per request
Knowledge cutoff: October 2023
Improved tokeniser for efficient non-English text handling
Support for text and vision in the API (with future expansion to include video and audio)

GPT-4o mini outshines other small models across various benchmarks:

MMLU (textual intelligence): 82.0%
MGSM (math reasoning): 87.0%
HumanEval (coding performance): 87.2%
MMMU (multimodal reasoning): 59.4%

These scores demonstrate GPT-4o mini’s superiority in reasoning tasks, math, coding, and multimodal understanding compared to competitors like Gemini Flash and Claude Haiku.

Developers can leverage GPT-4o mini for a wide range of applications, including:

Chaining or parallelising multiple model calls
Passing large volumes of context (e.g., full code bases or conversation histories)
Building real-time text response systems (e.g., customer support chatbots)

OpenAI has prioritised safety in GPT-4o mini’s development by implementing pre-training content filtering, post-training alignment using techniques like RLHF, and an innovative “instruction hierarchy” method to resist jailbreaks and prompt injections.

GPT-4o mini is now accessible through the Assistants API, Chat Completions API, and Batch API. Developers can expect to pay 15 cents per 1M input tokens and 60 cents per 1M output tokens. Fine-tuning capabilities are set to roll out in the coming days.

“We envision a future where models become seamlessly integrated in every app and on every website. GPT-4o mini is paving the way for developers to build and scale powerful AI applications more efficiently and affordably,” OpenAI explains.

As AI continues to evolve, GPT-4o mini is a step towards making advanced language models more accessible to developers of all backgrounds. With its impressive performance and cost-effectiveness, this new model will help to unlock a new era of AI-powered applications and services while we eagerly await GPT-5.

(Image Credit: OpenAI)

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Tags: AI, artificial intelligence, coding, development, gpt-4o, gpt-4o mini, openai, programming