Uncomplicating AI
A short, sweet guide to why many have soured on the AI craze
Ads for generative AI are everywhere. Although there’s nothing all that “intelligent” about Large Language Models (LLMs) like ChatGPT, they’re being thrown into every website builder and wearable device. Despite that—or perhaps in service of it—no one bothers to explain in simple terms how LLMs work. The rather opaque Wikipedia definition cites a 2019 OpenAI blog post as its primary source:
“A large language model (LLM) is a computational model capable of language generation or other natural language processing tasks. As language models, LLMs acquire these abilities by learning statistical relationships from vast amounts of text during a self-supervised and semi-supervised training process.”
Putting aside the fact that OpenAI is an authority on developing LLMs in the same way that Wal-Mart is an authority on developing supply chains, I think I can offer a simpler explanation—
You know that prediction bar your phone uses to guess your next word? Have you noticed how it improves with use, learning your style and your preferred word choices? Imagine that kind of guessing, but done long-form and guided by basically every word on the internet rather than just your own indiscriminate use of the phrase “cool beans.”
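To make the analogy concrete, here’s a toy version of that kind of guessing. This is my own sketch, not how any actual keyboard or LLM is implemented (real models learn billions of parameters rather than counting raw word pairs), but the “guess the next word from what usually follows” instinct is the same:

```python
from collections import Counter, defaultdict
import random

# Toy next-word predictor: count which word follows which in a tiny "corpus",
# then guess by sampling in proportion to those counts.
corpus = "cool beans . that sounds cool . beans are cool beans".split()

follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Guess the next word in proportion to how often it has followed `word` before."""
    candidates = follows[word]
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

print(predict_next("cool"))  # most often "beans", because that's what this corpus taught it
```

Scale that counting up from one sentence to most of the written internet, swap the raw counts for a neural network, and you have the guts of the pitch.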
Imitation is the Sincerest Form of Plagiarism
You might be asking “But David, if it’s built from existing news articles, blog posts, and artwork, wouldn’t the creators of those original works be owed credit and/or compensation?” The artists behind a class-action lawsuit against AI art generation platforms like Stable Diffusion & Midjourney would be inclined to answer “yes.” So would the prolific authors who insist that ChatGPT was trained on their published works and intellectual property. Whether or not these specific cases succeed, we know with relative certainty that virtually all large-scale LLMs are trained on an enormous volume of ill-gotten intellectual property. How we choose to legislate around this truth has society-altering consequences.

That’s Personal
The information being used to train LLMs isn’t limited to what people choose to sell or broadcast to the wider world. Whether you look to Google and how Bard uses your private message history, or to OpenAI and how ChatGPT was manipulated into leaking personally identifiable information (PII), the writing is on the wall, or rather, the writing is in the dataset. The reality is, until we have legislative guardrails, you should operate under the assumption that all unencrypted virtual correspondence has a significant chance of winding up in some company’s training data. As with Microsoft’s new “Recall” feature, the security risk of this shifting paradigm is potentially catastrophic, both in how many people are likely to be affected and in the volume of private information being (not so) covertly siphoned away and stored without being properly scrubbed of PII.
Unlike in eras past, when companies like Google at least made an effort to protect PII, it may be genuinely impossible to bring existing models like ChatGPT into compliance with existing data privacy and transparency laws like the EU’s GDPR, and not just because of the PII in their training data. The GDPR doesn’t just require that users have access to their PII and control over whether it remains public. It requires that available PII be accurate, which is a problem when LLMs generate mostly or entirely non-factual answers 3–27% of the time and get some of the supporting facts in their responses wrong as much as 46% of the time, in a phenomenon broadly (and misleadingly) dubbed “hallucination.”
Hallucinations are Bullshit
As I’ve described before, I don’t like the term “hallucination” because it describes a perceptual failure in a thinking mind. LLMs do not think or perceive, which makes them incapable of operating outside of predetermined parameters. For that reason I prefer the terms “confabulation” and “bullshitting.” Regardless of what we call these non-factual outputs, though, they’re demonstrably impossible to prevent even in empirically consistent, formal modes of communication like code, to say nothing of the email memos and call summaries generated from what the linked paper calls the “normal world.” I would argue that normal language is both the thing potential LLM customers want most and the source of the technology’s worst meltdowns, as with Google’s generative search, which launched so catastrophically and memeably. To understand why, though, I need to answer one of the main questions most people have about AI—
What Do LLMs Do?
The simplest answer is that they ascribe labels—called tokens—to words. Each token is assigned a series of numerical values (called a vector) that encodes the contexts it tends to appear in and the context it lends to the text around it. I find it easier to wrap my head around this conceptual stuff in the context of why LLM development requires a market-crippling volume of Graphics Processing Units (GPUs). Just as we use numerical shorthand to give dimensions to a piece of furniture or coordinates to a location on a map, LLMs use a similar shorthand to orient tokens relative to one another in a sort of “idea space.”
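Here’s a deliberately tiny illustration of that shorthand. The words and numbers below are invented for the example; real models use vectors with thousands of dimensions, and doing that arithmetic billions of times over is where all those GPUs go:

```python
import math

# A made-up, three-dimensional "idea space." Real embeddings have thousands of
# dimensions and are learned from data, not written by hand.
idea_space = {
    "horse":   [0.9, 0.1, 0.0],
    "pony":    [0.8, 0.2, 0.1],
    "gallop":  [0.7, 0.1, 0.3],
    "invoice": [0.0, 0.9, 0.1],
}

def similarity(a: str, b: str) -> float:
    """Cosine similarity: close to 1.0 means two tokens point the same way in idea space."""
    va, vb = idea_space[a], idea_space[b]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    return dot / norm

print(round(similarity("horse", "pony"), 2))     # ~0.98, neighbors in idea space
print(round(similarity("horse", "invoice"), 2))  # ~0.11, barely related
```

That’s the coordinates-on-a-map analogy in code: relatedness is just closeness, measured with a great deal of multiplication.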
When you ask ChatGPT to write a song about horses, it combines data points for the words people use to describe horses with data points for lyrics: rhyming, verse/chorus structure, and so on. If you ask ChatGPT to “write a song about pool sharks swimming,” though, it seems to (generically) describe billiard-playing sharks and the seedy aquatic locales where they showcase their skills. No, really—
To a human reader, there are two possible interpretations of “pool sharks swimming”:
- Human experts at billiards, swimming in a nondescript body of water.
- Sharks swimming in a pool of some type.
This prompt and its variations consistently get low-quality responses from ChatGPT. I predicted they might, because in my professional experience, confabulations often result from a kind of ambiguity that no amount of data-quality improvement can really account for. Depth of meaning is a defining quality of creative human expression, and also the one most unintelligible to LLMs. In the case of “pool shark,” the slang exists precisely because it plays on the linguistic closeness between the game and the swimming venue. But by introducing “swimming,” which applies to pools and to sharks separately but not to “pool sharks” collectively, ChatGPT has three words and an overabundance of conflicting correlations between them.
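To see what that conflict looks like, here’s a contrived two-dimensional version of the same toy idea space. Every word, axis, and number is invented purely for illustration; no real model is anywhere near this simple:

```python
import math

# Axis 1 ~ "billiards-ness", axis 2 ~ "water-ness" -- toy values, chosen by hand.
idea_space = {
    "billiards": [1.0, 0.0],
    "ocean":     [0.0, 1.0],
    "pool":      [0.7, 0.5],  # the game, but also the swimming venue
    "shark":     [0.6, 0.6],  # the hustler and the fish pull about equally
    "swimming":  [0.2, 0.9],  # almost entirely about water
}

def blend(words):
    """Average a phrase's vectors -- a crude stand-in for how a prompt's tokens combine."""
    return [sum(idea_space[w][i] for w in words) / len(words) for i in range(2)]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (math.hypot(*a) * math.hypot(*b))

pool_shark = blend(["pool", "shark"])
full_prompt = blend(["pool", "shark", "swimming"])

print(round(cosine(pool_shark, idea_space["billiards"]), 2))   # ~0.76 -- "pool shark" leans billiards
print(round(cosine(full_prompt, idea_space["billiards"]), 2))  # ~0.60 -- adding "swimming"...
print(round(cosine(full_prompt, idea_space["ocean"]), 2))      # ~0.80 -- ...drags the phrase toward water
```

The real tug-of-war plays out across thousands of dimensions, but the shape of the problem is the same: the prompt points in two directions at once, and the model splits the difference into billiard-playing sharks.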
Outrunning Sunsets
Bear with me—
For the vast majority of human history, outrunning the sunset was impossible. Now you have to caveat that impossibility with a latitude and a vehicle. So why hasn’t anyone bothered to put the math to the test? Consider the answers to two questions:
- What do we gain by trying?
- What is the cost?
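For the record, the “latitude and a vehicle” caveat isn’t hand-waving. Here’s a rough calculation using round figures I’m assuming (Earth’s circumference of about 40,075 km, a jet cruising around 900 km/h):

```python
import math

# How fast does the sunset move west, and where could a jet keep pace with it?
EARTH_CIRCUMFERENCE_KM = 40_075   # approximate equatorial circumference
JET_SPEED_KMH = 900               # rough commercial cruising speed

def sunset_speed_kmh(latitude_deg: float) -> float:
    """Westward speed of the day/night line at a given latitude."""
    return (EARTH_CIRCUMFERENCE_KM / 24) * math.cos(math.radians(latitude_deg))

print(round(sunset_speed_kmh(0)))   # ~1670 km/h at the equator -- nothing commercial keeps up
print(round(sunset_speed_kmh(60)))  # ~835 km/h -- slower than our jet

# Latitude at which the jet exactly keeps pace with the sunset:
break_even = math.degrees(math.acos(JET_SPEED_KMH / (EARTH_CIRCUMFERENCE_KM / 24)))
print(round(break_even, 1))         # ~57.4 degrees -- fly west above this and the sun stops setting on you
```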
Bragging rights aside, the cost-benefit analysis of this aviation example tilts heavily toward “not worth it.” In the same vein, whether or not researchers can design solutions to the confabulation problem is immaterial. At the time of writing, there’s no charted course for large LLM creators to reverse the technology’s abject unprofitability, to say nothing of the data safety and environmental concerns.
To be clear, I’m not saying that generative models are without their uses. I just don’t see a logical argument that those niche use cases outweigh the profound financial and ethical toll we’re all being told will be worth it in the long run. No matter how much OpenAI CEO Sam Altman insists otherwise, LLM development does nothing to further the creation of Artificial General Intelligence (AGI). As far as I can tell, the most profitable use of these tools continues to be fraud.
Generative AI, like any other technology, is a tool. If that tool proves most useful for hitting nails, no amount of insistence that it's a pen or a screwdriver will make it anything but a hammer.