
Why Every "AI" Conversation Feels Like Nonsense

Another “AI” Post?

I really didn’t want to write about artificial intelligence (AI) again so soon, and I promise this will be the last one for a little while, but I came away from my last two posts with a very unsettled feeling. It wasn’t the subject matter itself that bothered me, but the way I ended up writing about it.

In writing those posts, I made an intentional decision to be consistent with the vernacular use of the term “AI” when discussing a specific set of popular tools for generating text. Surely, you’ve heard of tools like ChatGPT, Claude, and Gemini. I did this because using more precise language is (1) awkward, (2) inconsistent with the terminology used in the podcast that prompted my posts, and (3) so unfamiliar to a layperson that accurate language comes off as pedantic and unapproachable. In the interest of making the posts simpler and more focused on the issues raised by the way these tools are being used, I ended up just accepting the common language used to talk about the tools.

But the truth is that I know better than to do that, and it bothered me. ChatGPT ≠ AI, even if it’s very common to talk and write about it that way. As time passed, I felt worse and worse about the possibility that by accepting that language, I gave the impression that I accept the conceptual framing that these tools are AI. I do not. In this post, I intend to make an addendum/correction to clarify my position and add some context.

A still image from the movie The Princess Bride showing Inigo Montoya looking at Vizzini. A caption in meme font reads: You keep using that word. I do not think it means what you think it means.

What’s Wrong with the Term “Artificial Intelligence”?

People have been confused about what “artificial intelligence” means for as long as they’ve used the term. AI, like all of computer science, is very young. Most sources seem to point to the 1955 proposal for the Dartmouth Workshop as its origin. Right from the start, the term seems to have been chosen in part to encompass a broad, uncoordinated field of academic inquiry.

By the time I was a computer science undergraduate (and, later, graduate student responsible for TAing undergrads) in the late 90s, little had changed. The textbook I used, Artificial Intelligence: A Modern Approach (AIMA) by Russell and Norvig, didn’t even attempt to offer its own definition of the term in its introduction. Instead, it surveyed other introductory textbooks and classified their definitions into four groups:

  • Systems that think like humans
  • Systems that act like humans
  • Systems that think rationally
  • Systems that act rationally

The authors describe the advantages and drawbacks of each definition and point out later chapters that relate to them, but they never return to the definitional question itself. I don’t think this is an accident. If you take AIMA as an accurate but incomplete outline, AI consists of a vast range of technologies including search algorithms, logic and proof systems, planners, knowledge representation schemes, adaptive systems, language processors, image processors, robotic control systems, and more.

Machine Learning Takes Over

So, what happened to all this variety? The very short answer is that one approach really started to bear fruit in a way nothing else had before. Since the publication of the first edition of AIMA, researchers produced a host of important results in the field of machine learning (ML) that led to its successful application across many domains. These successes attracted more attention and interest, and over the course of a generation, ML became the center of gravity of the entire field of AI.

If you think back to the technologies that were attracting investment and interest about 10 years ago, many (perhaps most) were being driven by advances in ML. One example is image processing—also called computer vision (CV)—which rapidly progressed to make object and facial recognition systems widely available. As someone who worked on CV back in the bad old days of the 90s, I can tell you these used to be Hard Problems. Another familiar example is the ominous “algorithm” that drives social media feeds, which largely refers to recommendation systems based on learning models. Netflix’s movie suggestions, Amazon’s product recommendations, and even your smartphone’s autocorrect all rely on ML techniques that predate ChatGPT.

Herein lies the first problem with the way people talk about AI today: the term originally encompassed an expansive collection of disciplines and techniques outside of machine learning, even though much of that diversity has since withered away. Today, I imagine most technologists don’t even consider things like informed search or propositional logic models to be AI any more.

The Emergence of Large Language Models

In the midst of this ML boom, something unexpected happened. One branch of the AI family tree advanced dramatically when ML techniques were applied to it: natural language processing (NLP). Going back to my copy of AIMA, I find that the chapter on NLP describes an incredibly tedious formalization of the structure and interpretation of human language. But bringing ML tools to bear on this domain obviated the need for formal analysis of language almost entirely. In fact, one of the breakthrough approaches is literally called the “bag-of-words model”.

The first edition of AIMA open to the NLP chapter. Visible on these pages are a complicated parse chart for a simple sentence (I feel it), a table labelling the chart with parsing procedures and grammatical rules, and a parse tree for a slightly more complex sentence.
A sample of what the NLP chapter of AIMA looked like before everything became a bag of words.
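
For contrast, here is a minimal sketch of the bag-of-words idea, written in Python with a naive whitespace tokenizer (the helper name bag_of_words and the toy sentences are my own illustration, not anything from AIMA):

    from collections import Counter

    def bag_of_words(text):
        # A bag-of-words representation keeps only word counts; word order
        # and grammatical structure are discarded entirely.
        tokens = text.lower().split()  # naive whitespace tokenization
        return Counter(tokens)

    # "I feel it" and "It I feel" collapse to the same representation,
    # which is exactly the structural information a parse chart captures.
    print(bag_of_words("I feel it"))                               # Counter({'i': 1, 'feel': 1, 'it': 1})
    print(bag_of_words("I feel it") == bag_of_words("It I feel"))  # True

Crude as it is, this style of representation turned out to be effective enough that the formal, grammar-first approach fell out of favor.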

What’s more, ML-based language systems demonstrated emergent behavior, meaning they do things that aren’t clearly explained by the behavior of the components from which they are built. Even though the early large networks trained on language data contained no explicit reasoning functionality, they seemed to exhibit reasoning anyway. This was the dawn of the large language model (LLM), the basis of all of the major chatbot products and the core technology behind the most talked-about products under the colloquial AI umbrella today.

Here’s the second problem: people often use the term “AI” when they really mean this specific set of products and technologies, to the exclusion of everything else happening in the field of ML. When someone says “AI is revolutionizing healthcare,” they might be referring to diagnostic imaging systems, drug discovery algorithms, or robotic surgery assistance, or they could be talking about a system that processes insurance claim letters or transcribes and catalogs a provider’s notes. That uncertainty makes such claims nearly impossible to evaluate.

The Generativity Divide

There’s another important term to consider: “generative AI.” It describes tools that produce content, like LLM chatbots and image generation tools like Midjourney, as opposed to other ML technologies, like image processors, recommendation engines, and robot control systems. Often, replacing the overbroad “AI” in casual use with “generative AI” captures the right distinction.

And that’s an important distinction to draw! One unfortunate result of current “AI” discourse is that the failings of generative tools, such as their tendency to bullshit, become associated with non-generative ML technologies. Analyzing mammograms to diagnose breast cancer earlier is an extraordinarily promising ML application. Helping meteorologists create better forecasts is another. But they get unfairly tainted by association with chatbots when we lump them all together under the “AI” umbrella.

Consider another example: ML-powered traffic optimization that adjusts signal timing based on real-time conditions to reduce congestion. Such systems don’t generate content and don’t lie to their users. But when the public hears “the city is using AI to manage traffic,” they naturally imagine the same unreliable systems that invent bogus sources to cite, despite the vast underlying differences in the technologies involved.

That said, we can’t simply call generative AI risky and other AI safe. “Generative AI” is a classification based on a technology’s use, not its structure. And while most critiques of AI, such as its impact on education, are specific to its use as a content generator, others are not. All learning models, generative and otherwise, require energy and data to train, and there are valid concerns about where that data comes from and whether it contains (and thus perpetuates) undesirable bias.

The Business Case for Vague Language

Why does this all have to be so confusing? The short answer is that the companies developing LLMs and other generative tools are intentionally using imprecise language. It would be easy to blame this on investors, marketing departments, or clueless journalists, but that ignores the ways technical leadership—people who should know better—have introduced and perpetuated this sloppy way of talking about these products.

One possible reason for this relates to another term floating around: artificial general intelligence (AGI). This is also a poorly defined concept, but researchers who favor building it generally mean systems with some level of consciousness, if not independent agency. For better or worse, many of the people involved in the current AI boom don’t just want to create AGI; they believe they are already doing so. Putting aside questions of both feasibility and desirability, this may explain some of the laxity in the language used. AGI proponents may be intentionally using ambiguous, overgeneralized terminology because they don’t want to get bogged down in the specifics of the way the technology works now. If you keep your audience confused about the difference between what’s currently accurate and what’s speculative, they are more likely to swallow predictions about the future without objection.

But I think that’s only part of what’s happening. Another motivation may be to clear the way for future pivots to newer, more promising approaches. Nobody really understands what allows LLMs to exhibit the emergent behaviors we observe, and nobody knows how much longer we’ll continue to see useful emergent behavior from them. By maintaining a broad, hand-wavey association with the vague concept of “AI” rather than more specific technologies like LLMs, it’s easier for these companies to jump on other, unrelated technologies if the next breakthrough occurs elsewhere.

Speaking Clearly in a Messy World

That makes it all the more important for those of us who do not stand to gain from this confusion to resist it. Writing clearly about these topics is challenging. It’s like walking a tightrope with inaccuracy on one side and verbosity on the other. But acquiescing to simplistic and vague language serves the interests of the AI promoters, not the community of users (much less the larger world!).

From now on, I’m committing to being more intentional about my language choices when discussing these technologies. When I mean large language models, I’ll say LLMs. When I’m writing about generative tools specifically, I’ll use “generative AI.” When I’m talking about machine learning more generally, I’ll be explicit about that, too. It might make my writing a bit more cumbersome, but this is a case where I think precise language and clear thinking go hand in hand. And anyone thinking about this field needs to be very clear about the real capabilities, limitations, and implications of these tools.

The stakes are too high for sloppy language. How we talk about these technologies shapes how we think about them, how we regulate them, and how we integrate them into our lives and work. And those are all things that we have to get right.


Hijacking Mercy with the Language of Power

Looking up from below at the hundreds of block-shaped weathered steel monuments suspended from the ceiling of the National Memorial for Peace and Justice
photo by Ron Cogswell: The National Memorial for Peace and Justice, Montgomery, Alabama (2019)

I first encountered Bryan Stevenson and his work nine years ago. Google, my employer at the time, invited him to speak about his nonprofit, the Equal Justice Initiative (EJI), which provides legal representation to people who have been subjected to unjust treatment during criminal proceedings or incarceration. His talk, which is posted on YouTube, moved me profoundly, and I went on to read his book, Just Mercy.

A core idea motivating Bryan Stevenson’s work, laid out both in that talk and in Just Mercy, is that “Each of us is more than the worst thing we’ve ever done.” In his context, that “worst thing” could be quite terrible. While its work includes representing the falsely accused, EJI often represents people who truly have committed heinous crimes.

It takes a special kind of moral clarity to represent and seek justice for a murderer who has been abused in prison. After reading Just Mercy, I found myself much more attuned to the many ways our society discards those deemed unworthy of fair treatment. We can’t call justice a foundational social value if it is contingent. If there are things we can do to have the protection of the law withdrawn from us, then that protection isn’t really meaningful. Everyone whose work touches criminal investigation, trial, or punishment should be held to the highest standards because of, not despite, their impact on those who may have done horrible things.

Lately, however, I’m troubled to hear this language of mercy and second chances voiced in some unexpected places.

When venture capital firm Andreessen Horowitz (a16z) hired a man freshly acquitted after choking Jordan Neely to death on the New York subway, they told their investors, “We don’t judge people for the worst moment in their life.” Notably, nobody disputes the fact that a16z’s new investment associate killed a man on the floor of a subway car. He was acquitted only because a jury did not deem that act to be a crime.

Similarly, when a 25-year-old staff member of the new “Department” of Government Efficiency resigned over vicious tweets endorsing racism and eugenics, the Vice President of the United States dismissed it as “stupid social media activity” that shouldn’t “ruin a kid’s life.” The staffer was promptly reinstated into his role as a “special government employee.”

These echoes of Stevenson’s words might sound familiar, but they deserve careful scrutiny. Have a16z or the current administration ever invoked mercy as a broader goal? Not so far as I can tell. Beyond a handful of podcast episodes posted five years ago, Andreessen Horowitz’s engagement with criminal justice seems limited to investing in drones and surveillance technology. Their founding partner made such large personal donations to the Las Vegas police that he felt the need to write a petulant response when investigated by journalists. Regardless of how worthwhile it may be to buy drones and cappuccino machines for the police, I think it’s fair to say that these investments do nothing to advance the cause of building a merciful society. As for the administration, its tough-on-crime rhetoric speaks for itself.

So what’s really happening here? One obvious answer is that hiring a killer and refusing to accept the resignation of an unapologetic bigot are ways to sanction their behavior. We like to believe we maintain social norms against killing someone when it could be avoided and against racist public statements. Rejecting these norms requires some affirmative signal, and that’s what a16z and the administration are providing.

This reveals the first crucial distinction: Stevenson calls on society to recognize the rights of those who do wrong. In contrast, these recent cases are declarations that certain acts are not wrong at all, and even more troublingly, they suggest that those who still maintain longstanding social norms are themselves the wrongdoers.

But there is a second important difference, one hidden by rhetorical sleight of hand. When EJI takes a case, they fight for fundamental civil rights: the right to a fair trial, to representation, to be sentenced fairly, and to humane punishment. By contrast, in these other cases, the talk of “ruined lives” is a misdirection disguising what’s really happening: the loss of positions of extraordinary influence and privilege.

When the Vice President claims a racist’s life has been ruined, he’s talking about someone losing a powerful role in the federal government, treating an immense privilege as though it were a basic right. And if we take a16z at their word, they seem to be claiming that it would be unfair not to elevate a killer to the investing team at one of Silicon Valley’s biggest venture firms. They’ll have to forgive my skepticism that you or I would get very far with that approach if we applied for such a coveted role.

I’m reminded of the 2018 Brett Kavanaugh Supreme Court confirmation hearings, when supporters claimed that sexual assault allegations against him were intended to “ruin his life.” Obviously, they didn’t, but even if they had prevented his appointment to the Supreme Court, was that his right? Would his life have been ruined had he instead returned to his previous role as a federal judge on the DC Circuit Court of Appeals? I’d wager many people would gladly accept that kind of “ruined” life.

But of course, this is nothing more than a flimsy, bad faith attempt to cloak approval for violence and eliminationism in lofty language. If you spend your time, like Bryan Stevenson, ensuring that the poor and marginalized are afforded basic rights and dignities, then I believe in your commitment to mercy. If, on the other hand, you spend your time granting vast privileges to people who harm the poor and marginalized, then you’re not showing mercy—you’re showing your hand.