OpenAI releases GPT-4, a multimodal AI that it claims is state-of-the-art

After months of anticipation, OpenAI has released a powerful new image- and text-understanding AI model, GPT-4, that the company calls “the latest milestone in its effort in scaling up deep learning.”

GPT-4 is available today via OpenAI’s API with a waitlist and in ChatGPT Plus, OpenAI’s premium plan for ChatGPT, its viral AI-powered chatbot.

It’s been hiding in plain sight, as it turns out. Microsoft confirmed today that Bing Chat, its chatbot tech co-developed with OpenAI, is running on GPT-4.

According to OpenAI, GPT-4 can accept image and text inputs — an improvement over GPT-3.5, its predecessor, which only accepted text — and performs at “human level” on various professional and academic benchmarks. For example, GPT-3 passes a simulated bar exam with a score around the top 10% of test takers.

OpenAI spent six months iteratively aligning GPT-4 using lessons from an adversarial testing program as well as ChatGPT, resulting in “best-ever results” on factuality, steerability and refusing to go outside of guardrails, according to the company.

“In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle,” OpenAI wrote in a blog post announcing GPT-4. “The difference comes out when the complexity of the task reaches a sufficient threshold — GPT-4 is more reliable, creative and able to handle much more nuanced instructions than GPT-3.5.”

Without a doubt, one of GPT-4’s more interesting aspects is its ability to understand images as well as text. GPT-4 can caption — and even interpret — relatively complex images, for example identifying a Lightning Cable adapter from a picture of a plugged-in iPhone.

The image understanding capability isn’t available to all OpenAI customers just yet — OpenAI’s testing it with a single partner, Be My Eyes, to start with. Powered by GPT-4, Be My Eyes’ new Virtual Volunteer feature can answer questions about images sent to it.

Be My Eyes explains how it works in a blog post:

“For example, if a user sends a picture of the inside of their refrigerator, the Virtual Volunteer will not only be able to correctly identify what’s in it, but also extrapolate and analyze what can be prepared with those ingredients. The tool can also then offer a number of recipes for those ingredients and send a step-by-step guide on how to make them.”

OpenAI releases GPT-4, a multimodal AI that it claims is state-of-the-art by Kyle Wiggers originally published on TechCrunch

Leave a Reply

Your email address will not be published. Required fields are marked *

Subscribe to our Newsletter