Featured Article

The Great Pretender

AI doesn’t know the answer, and it hasn’t learned how to care

Hands holding a mask of anonymity. Polygonal design of interconnected elements.
Image Credits: Ilya Lukichev / Getty Images

There is a good reason not to trust what today’s AI constructs tell you, and it has nothing to do with the fundamental nature of intelligence or humanity, with Wittgensteinian concepts of language representation, or even disinfo in the dataset. All that matters is that these systems do not distinguish between something that is correct and something that looks correct. Once you understand that the AI considers these things more or less interchangeable, everything makes a lot more sense.

Now, I don’t mean to short circuit any of the fascinating and wide-ranging discussions about this happening continually across every form of media and conversation. We have everyone from philosophers and linguists to engineers and hackers to bartenders and firefighters questioning and debating what “intelligence” and “language” truly are, and whether something like ChatGPT possesses them.

This is amazing! And I’ve learned a lot already as some of the smartest people in this space enjoy their moment in the sun, while from the mouths of comparative babes come fresh new perspectives.

But at the same time, it’s a lot to sort through over a beer or coffee when someone asks “what about all this GPT stuff, kind of scary how smart AI is getting, right?” Where do you start — with Aristotle, the mechanical Turk, the perceptron or “Attention is all you need”?

During one of these chats I hit on a simple approach that I’ve found helps people get why these systems can be both really cool and also totally untrustworthy, without detracting at all from their usefulness in some domains or from the amazing conversations being had around them. I thought I’d share it in case you find the perspective useful when talking about this with other curious, skeptical people who nevertheless don’t want to hear about vectors or matrices.

There are only three things to understand, which lead to a natural conclusion:

  1. These models are created by having them observe the relationships between words and sentences and so on in an enormous dataset of text, then build their own internal statistical map of how all these millions and millions of words and concepts are associated and correlated. No one has said, this is a noun, this is a verb, this is a recipe, this is a rhetorical device; but these are things that show up naturally in patterns of usage.
  2. These models are not specifically taught how to answer questions, in contrast to the familiar software companies like Google and Apple have been calling AI for the last decade. Those are basically Mad Libs with the blanks leading to APIs: Every question is either accounted for or produces a generic response. With large language models the question is just a series of words like any other.
  3. These models have a fundamental expressive quality of “confidence” in their responses. In a simple example of a cat recognition AI, it would go from 0, meaning completely sure that’s not a cat, to 100, meaning absolutely sure that’s a cat. You can tell it to say “yes, it’s a cat” if it’s at a confidence of 85, or 90, whatever produces your preferred response metric.
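The threshold idea in point 3 can be sketched in a few lines of toy Python. (Everything here, including the `describe` function and the numbers, is invented for illustration; no real model is this simple, but the decision rule works the same way.)

```python
# Toy sketch: a classifier emits a single confidence score between
# 0.0 and 1.0, and we pick a threshold that decides what it says.
def describe(confidence: float, threshold: float = 0.85) -> str:
    """Turn a raw confidence score into a canned response."""
    if confidence >= threshold:
        return "yes, it's a cat"
    return "no, that's not a cat"

print(describe(0.92))  # at or above the threshold: "yes, it's a cat"
print(describe(0.40))  # below the threshold: "no, that's not a cat"
```

Notice that nothing in this code knows what a cat is. Raising or lowering `threshold` just tunes how eagerly the system commits to an answer.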

So given what we know about how the model works, here’s the crucial question: What is it confident about? It doesn’t know what a cat or a question is, only statistical relationships found between data nodes in a training set. A minor tweak would have the cat detector equally confident the picture showed a cow, or the sky, or a still life painting. The model can’t be confident in its own “knowledge” because it has no way of actually evaluating the content of the data it has been trained on.

The AI is expressing how sure it is that its answer appears correct to the user.

This is true of the cat detector, and it is true of GPT-4 — the difference is a matter of the length and complexity of the output. The AI cannot distinguish between a right and wrong answer — it only can make a prediction of how likely a series of words is to be accepted as correct. That is why it must be considered the world’s most comprehensively informed bullshitter rather than an authority on any subject. It doesn’t even know it’s bullshitting you — it has been trained to produce a response that statistically resembles a correct answer, and it will say anything to improve that resemblance.

The AI doesn’t know the answer to any question, because it doesn’t understand the question. It doesn’t know what questions are. It doesn’t “know” anything! The answer follows the question because, extrapolating from its statistical analysis, that series of words is the most likely to follow the previous series of words. Whether those words refer to real people, places, events, etc. is not material — only that they resemble real ones.
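For readers who do want a peek under the hood, the “most likely next words” idea can be sketched with a toy next-word predictor. (The corpus and the `most_likely_next` helper are invented for illustration; real models learn far richer statistics from billions of words, but the principle — likely continuations, not understanding — is the same.)

```python
# Toy sketch: count which word most often follows each word in a
# tiny corpus, then "answer" by emitting the most likely follower.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Build a statistical map of word-to-next-word counts.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    """Return the word that most frequently followed `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(most_likely_next("the"))  # prints "cat" -- the most frequent follower
```

The predictor has no idea what a cat is, or that it is even talking about animals; “cat” just happens to follow “the” more often than anything else in its data.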

It’s the same reason AI can produce a Monet-like painting that isn’t a Monet — all that matters is it has all the characteristics that cause people to identify a piece of artwork as his. Today’s AI approximates factual responses the way it would approximate “Water Lilies.”

Now, I hasten to add that this isn’t an original or groundbreaking concept — it’s basically another way to explain the stochastic parrot, or the undersea octopus. Those problems were identified very early by very smart people and represent a great reason to read commentary on tech matters widely.

But in the context of today’s chatbot systems, I’ve just found that people intuitively get this approach: The models don’t understand facts or concepts, only relationships between words, and their responses are an “artist’s impression” of an answer. Their goal, when you get down to it, is to fill in the blank convincingly, not correctly. This is why their responses fundamentally cannot be trusted.

Of course sometimes, even a lot of the time, its answer is correct! And that isn’t an accident: For many questions, the answer that looks the most correct is the correct answer. That is what makes these models so powerful — and dangerous. There is so, so much you can extract from a systematic study of millions of words and documents. And unlike recreating “Water Lilies” exactly, there’s a flexibility to language that lets an approximation of a factual response also be factual — but also make a totally or partially invented response appear equally or more so. The only thing the AI cares about is that the answer scans right.

This leaves the door open to discussions around whether this is truly knowledge, what if anything the models “understand,” if they have achieved some form of intelligence, what intelligence even is and so on. Bring on the Wittgenstein!

Furthermore, it also leaves open the possibility of using these tools in situations where truth isn’t really a concern. If you want to generate five variants of an opening paragraph to get around writer’s block, an AI might be indispensable. If you want to make up a story about two endangered animals, or write a sonnet about Pokémon, go for it. As long as it is not crucial that the response reflects reality, a large language model is a willing and able partner — and not coincidentally, that’s where people seem to be having the most fun with it.

Where and when AI gets it wrong is very, very difficult to predict because the models are too large and opaque. Imagine a card catalog the size of a continent, organized and updated over a period of a hundred years by robots, from first principles that they came up with on the fly. You think you can just walk in and understand the system? It gives a right answer to a difficult question and a wrong answer to an easy one. Why? Right now that is one question that neither AI nor its creators can answer.

This may well change in the future, perhaps even the near future. Everything is moving so quickly and unpredictably that nothing is certain. But for the present this is a useful mental model to keep in mind: The AI wants you to believe it and will say anything to improve its chances.
