OpenAI releases GPT-4, a multimodal AI that it claims is state-of-the-art

10:06 AM PDT • March 14, 2023

Sam Altman — **Image Credits:** Dani Padgett Watson (opens in a new window) / StrictlyVC

OpenAI has released a powerful new image- and text-understanding AI model, GPT-4, that the company calls “the latest milestone in its effort in scaling up deep learning.”

GPT-4 is available today to OpenAI’s paying users via ChatGPT Plus (with a usage cap), and developers can sign up on a waitlist to access the API.

Pricing is $0.03 per 1,000 “prompt” tokens (about 750 words) and $0.06 per 1,000 “completion” tokens (again, about 750 words). Tokens represent raw text; for example, the word “fantastic” would be split into the tokens “fan,” “tas” and “tic.” Prompt tokens are the parts of words fed into GPT-4 while completion tokens are the content generated by GPT-4.

GPT-4 has been hiding in plain sight, as it turns out. Microsoft confirmed today that Bing Chat, its chatbot tech co-developed with OpenAI, is running on GPT-4.

Other early adopters include Stripe, which is using GPT-4 to scan business websites and deliver a summary to customer support staff. Duolingo built GPT-4 into a new language learning subscription tier. Morgan Stanley is creating a GPT-4-powered system that’ll retrieve info from company documents and serve it up to financial analysts. And Khan Academy is leveraging GPT-4 to build some sort of automated tutor.

GPT-4’s new capabilities power a ‘virtual volunteer’ for the visually impaired

GPT-4 can generate text and accept image and text inputs — an improvement over GPT-3.5, its predecessor, which only accepted text — and performs at “human level” on various professional and academic benchmarks. For example, GPT-4 passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%.

OpenAI spent six months “iteratively aligning” GPT-4 using lessons from an internal adversarial testing program as well as ChatGPT, resulting in “best-ever results” on factuality, steerability and refusing to go outside of guardrails, according to the company. Like previous GPT models, GPT-4 was trained using publicly available data, including from public webpages, as well as data that OpenAI licensed.

OpenAI worked with Microsoft to develop a “supercomputer” from the ground up in the Azure cloud, which was used to train GPT-4.

“In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle,” OpenAI wrote in a blog post announcing GPT-4. “The difference comes out when the complexity of the task reaches a sufficient threshold — GPT-4 is more reliable, creative and able to handle much more nuanced instructions than GPT-3.5.”

Without a doubt, one of GPT-4’s more interesting aspects is its ability to understand images as well as text. GPT-4 can caption — and even interpret — relatively complex images, for example identifying a Lightning Cable adapter from a picture of a plugged-in iPhone.

The image understanding capability isn’t available to all OpenAI customers just yet — OpenAI’s testing it with a single partner, Be My Eyes, to start with. Powered by GPT-4, Be My Eyes’ new Virtual Volunteer feature can answer questions about images sent to it. The company explains how it works in a blog post:

“For example, if a user sends a picture of the inside of their refrigerator, the Virtual Volunteer will not only be able to correctly identify what’s in it, but also extrapolate and analyze what can be prepared with those ingredients. The tool can also then offer a number of recipes for those ingredients and send a step-by-step guide on how to make them.”

A more meaningful improvement in GPT-4, potentially, is the aforementioned steerability tooling. With GPT-4, OpenAI is introducing a new API capability, “system” messages, that allow developers to prescribe style and task by describing specific directions. System messages, which will also come to ChatGPT in the future, are essentially instructions that set the tone — and establish boundaries — for the AI’s next interactions.

For example, a system message might read: “You are a tutor that always responds in the Socratic style. You never give the student the answer, but always try to ask just the right question to help them learn to think for themselves. You should always tune your question to the interest and knowledge of the student, breaking down the problem into simpler parts until it’s at just the right level for them.”

Even with system messages and the other upgrades, however, OpenAI acknowledges that GPT-4 is far from perfect. It still “hallucinates” facts and makes reasoning errors, sometimes with great confidence. In one example cited by OpenAI, GPT-4 described Elvis Presley as the “son of an actor” — an obvious misstep.

“GPT-4 generally lacks knowledge of events that have occurred after the vast majority of its data cuts off (September 2021), and does not learn from its experience,” OpenAI wrote. “It can sometimes make simple reasoning errors which do not seem to comport with competence across so many domains, or be overly gullible in accepting obvious false statements from a user. And sometimes it can fail at hard problems the same way humans do, such as introducing security vulnerabilities into code it produces.”

OpenAI does note, though, that it made improvements in particular areas; GPT-4 is less likely to refuse requests on how to synthesize dangerous chemicals, for one. The company says that GPT-4 is 82% less likely overall to respond to requests for “disallowed” content compared to GPT-3.5 and responds to sensitive requests — e.g. medical advice and anything pertaining to self-harm — in accordance with OpenAI’s policies 29% more often.

OpenAI GPT-4 — **Image Credits:** OpenAI

There’s clearly a lot to unpack with GPT-4. But OpenAI, for its part, is forging full steam ahead — evidently confident in the enhancements it’s made.

“We look forward to GPT-4 becoming a valuable tool in improving people’s lives by powering many applications,” OpenAI wrote. “There’s still a lot of work to do, and we look forward to improving this model through the collective efforts of the community building on top of, exploring, and contributing to the model.”

More TechCrunch

The ups and downs of investing in Europe, with VCs Saul Klein and Raluca Ragab

Connie Loizos

4 hours ago

When it comes to the world of venture-backed startups, some issues are universal, and some are very dependent on where the startups and its backers are located. It’s something we…

The ups and downs of investing in Europe, with VCs Saul Klein and Raluca Ragab

Social

Scarlett Johansson brought receipts to the OpenAI controversy

Cody Corrall

7 hours ago

Welcome back to TechCrunch’s Week in Review — TechCrunch’s newsletter recapping the week’s biggest news. Want it in your inbox every Saturday? Sign up here. OpenAI announced this week that…

Scarlett Johansson brought receipts to the OpenAI controversy

Fundraising

Deal Dive: Can blockchain make weather forecasts better? WeatherXM thinks so

Rebecca Szkutak

12 hours ago

Accurate weather forecasts are critical to industries like agriculture, and they’re also important to help prevent and mitigate harm from inclement weather events or natural disasters. But getting forecasts right…

Deal Dive: Can blockchain make weather forecasts better? WeatherXM thinks so

Security

Spyware app pcTattletale was hacked and its website defaced

Zack Whittaker

12 hours ago

pcTattletale’s website was briefly defaced and contained links containing files from the spyware maker’s servers, before going offline.

Spyware app pcTattletale was hacked and its website defaced

Featured Article

Synapse, backed by a16z, has collapsed, and 10 million consumers could be hurt

Synapse’s bankruptcy shows just how treacherous things are for the often-interdependent fintech world when one key player hits trouble.

Mary Ann Azevedo

13 hours ago

Synapse, backed by a16z, has collapsed, and 10 million consumers could be hurt

Women in AI: Sarah Myers West says we should ask, ‘Why build AI at all?’

Kyle Wiggers

14 hours ago

Sarah Myers West, profiled as part of TechCrunch’s Women in AI series, is managing director at the AI Now institute.

Women in AI: Sarah Myers West says we should ask, ‘Why build AI at all?’

This Week in AI: OpenAI and publishers are partners of convenience

Kyle Wiggers

Devin Coldewey

14 hours ago

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world…

This Week in AI: OpenAI and publishers are partners of convenience

AI tutors are quietly changing how kids in the US study, and the leading apps are from China

Rita Liao

15 hours ago

Evan, a high school sophomore from Houston, was stuck on a calculus problem. He pulled up Answer AI on his iPhone, snapped a photo of the problem from his Advanced…

AI tutors are quietly changing how kids in the US study, and the leading apps are from China

Startups

Startups Weekly: Drama at Techstars. Drama in AI. Drama everywhere.

Haje Jan Kamps

1 day ago

Welcome to Startups Weekly — Haje‘s weekly recap of everything you can’t miss from the world of startups. Sign up here to get it in your inbox every Friday. Well,…

Startups

From Plaid to Figma, here are the startups that are likely — or definitely — not having IPOs this year

Rebecca Szkutak

1 day ago

Last year’s investor dreams of a strong 2024 IPO pipeline have faded, if not fully disappeared, as we approach the halfway point of the year. 2024 delivered four venture-backed tech…

From Plaid to Figma, here are the startups that are likely — or definitely — not having IPOs this year

Transportation

Feds add nine more incidents to Waymo robotaxi investigation

Kirsten Korosec

1 day ago

Federal safety regulators have discovered nine more incidents that raise questions about the safety of Waymo’s self-driving vehicles operating in Phoenix and San Francisco. The National Highway Traffic Safety Administration…

Feds add nine more incidents to Waymo robotaxi investigation

Fundraising

Pitch Deck Teardown: Terra One’s $7.5M Seed deck

Haje Jan Kamps

1 day ago

Terra One’s pitch deck has a few wins, but also a few misses. Here’s how to fix that.

Pitch Deck Teardown: Terra One’s $7.5M Seed deck

Women in AI: Chinasa T. Okolo researches AI’s impact on the Global South

Dominic-Madori Davis

1 day ago

Chinasa T. Okolo researches AI policy and governance in the Global South.

Women in AI: Chinasa T. Okolo researches AI’s impact on the Global South

Disrupt 2024 early-bird tickets fly away next Friday

TechCrunch Events

1 day ago

TechCrunch Disrupt takes place on October 28–30 in San Francisco. While the event is a few months away, the deadline to secure your early-bird tickets and save up to $800…

Disrupt 2024 early-bird tickets fly away next Friday

Big tech companies are plowing money into AI startups, which could help them dodge antitrust concerns

Paul Sawers

2 days ago

Another week, and another round of crazy cash injections and valuations emerged from the AI realm. DeepL, an AI language translation startup, raised $300 million on a $2 billion valuation;…

Big tech companies are plowing money into AI startups, which could help them dodge antitrust concerns

Venture

Harlem Capital is raising a $150 million fund

Dominic-Madori Davis

2 days ago

If raised, this new fund, the firm’s third, would be its largest to date.

Security

US pharma giant Cencora says Americans’ health information stolen in data breach

Zack Whittaker

2 days ago

About half a million patients have been notified so far, but the number of affected individuals is likely far higher.

US pharma giant Cencora says Americans’ health information stolen in data breach

Last day to vote for TC Disrupt 2024 Audience Choice program

TechCrunch Events

2 days ago

Attention, tech enthusiasts and startup supporters! The final countdown is here: Today is the last day to cast your vote for the TechCrunch Disrupt 2024 Audience Choice program. Voting closes…

Last day to vote for TC Disrupt 2024 Audience Choice program

Featured Article

Signal’s Meredith Whittaker on the Telegram security clash and the ‘edge lords’ at OpenAI

Among other things, Whittaker is concerned about the concentration of power in the five main social media platforms.

Mike Butcher

2 days ago

Signal’s Meredith Whittaker on the Telegram security clash and the ‘edge lords’ at OpenAI

Transportation

Lucid Motors slashes 400 jobs ahead of crucial SUV launch

Sean O'Kane

2 days ago

Lucid Motors is laying off about 400 employees, or roughly 6% of its workforce, as part of a restructuring ahead of the launch of its first electric SUV later this…

Lucid Motors slashes 400 jobs ahead of crucial SUV launch

Startups

Google invests $350 million in Indian e-commerce giant Flipkart

Manish Singh

2 days ago

Google is investing nearly $350 million in Flipkart, becoming the latest high-profile name to back the Walmart-owned Indian e-commerce startup. The Android-maker will also provide Flipkart with cloud offerings as…

Google invests $350 million in Indian e-commerce giant Flipkart

Enterprise

Jio Financial unit to buy $4.32B of telecom gear from Reliance Retail

Manish Singh

2 days ago

A Jio Financial unit plans to purchase customer premises equipment and telecom gear worth $4.32 billion from Reliance Retail.

Jio Financial unit to buy $4.32B of telecom gear from Reliance Retail

Apps

Foursquare just laid off 105 employees

Connie Loizos

2 days ago

Foursquare, the location-focused outfit that in 2020 merged with Factual, another location-focused outfit, is joining the parade of companies to make cuts to one of its biggest cost centers –…

Using memes, social media users have become red teams for half-baked AI features

Amanda Silberling

2 days ago

“Running with scissors is a cardio exercise that can increase your heart rate and require concentration and focus,” says Google’s new AI search feature. “Some say it can also improve…

Using memes, social media users have become red teams for half-baked AI features

Space

ESA prepares for the post-ISS era, selects The Exploration Company, Thales Alenia to develop cargo spacecraft

Aria Alamalhodaei

2 days ago

The European Space Agency selected two companies on Wednesday to advance designs of a cargo spacecraft that could establish the continent’s first sovereign access to space. The two awardees, major…

ESA prepares for the post-ISS era, selects The Exploration Company, Thales Alenia to develop cargo spacecraft

Startups

Expressable brings speech therapy into the home

Kyle Wiggers

2 days ago

Expressable is a platform that offers one-on-one virtual sessions with speech language pathologists.

Expressable brings speech therapy into the home

Startups

The biggest French startups in 2024 according to the French government

Anna Heim

2 days ago

The French Secretary of State for the Digital Economy as of this year, Marina Ferrari, revealed this year’s laureates during VivaTech week in Paris. According to its promoters, this fifth…

The biggest French startups in 2024 according to the French government

Hardware

Spotify to shut off Car Thing for good, leading users to demand refunds

Aisha Malik

2 days ago

Spotify is notifying customers who purchased its Car Thing product that the devices will stop working after December 9, 2024. The company discontinued the device back in July 2022, but…

Spotify to shut off Car Thing for good, leading users to demand refunds

Social

X should bring back stars, not hide ‘likes’

Sarah Perez

2 days ago

Elon Musk’s X is preparing to make “likes” private on the social network, in a change that could potentially confuse users over the difference between something they’ve favorited and something…

X should bring back stars, not hide ‘likes’

$6M fine for robocaller who used AI to clone Biden’s voice

Devin Coldewey

2 days ago

The FCC has proposed a $6 million fine for the scammer who used voice-cloning tech to impersonate President Biden in a series of illegal robocalls during a New Hampshire primary…

OpenAI releases GPT-4, a multimodal AI that it claims is state-of-the-art

More TechCrunch

Get the industry’s biggest tech news

TechCrunch Daily News

Startups Weekly

TechCrunch Fintech

TechCrunch Mobility

Tags