The Intriguing Cognitive Puzzle of Fabulist AIs
In which we investigate the tendency of large language models to convincingly make s**t up and what it might mean for the nature of cognition.
In my previous post, I mused about how computer-mediated communication could change once computers are able to converse in natural language as fluently and effortlessly as human beings.
But there is something important I didn’t mention in that post that is worth expanding on: while the imitation game was considered the ultimate test when Alan Turing proposed it, it is starting to feel inadequately vague as we inch closer to machines passing it.
One important issue that is emerging today is the difference between propriety of language and embodied awareness of the world (or, as one would say, “common sense”). I mentioned before that interacting with an LLM feels like talking to a person with the propriety of language of a college professor and the understanding of the world of a 5-year-old. This just does not happen with humans. We do not have words to describe a person who behaves like this, mostly because propriety of language builds on top of a sophisticated understanding of the world; humans don’t appear to be able to have one without the other.
One word I have seen used at $day_job for this state, and which I find particularly evocative, is “fabulist”: the AIs make s**t up (just like 5-year-olds do), but they do so with a propriety of language that makes it feel believable. The experience can be jarring.
This feels remarkable: simply by learning to predict how to complete sentences from which words were randomly removed, LLMs have essentially distilled the “statistical mechanics” of words and can use it to speak the language very well.
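To make that fill-in-the-blank objective concrete, here is a minimal sketch using the Hugging Face transformers library; the model name and the example sentence are just illustrative assumptions, not anything specific to this post. A masked-language model is asked to guess a word that was removed, and it ranks candidates purely from the statistics of how words co-occur.

```python
# A minimal sketch of the "guess the removed word" objective, using the
# Hugging Face transformers library. The model and sentence below are
# illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Ask the model to fill in a word that was "randomly removed".
for candidate in fill_mask("The capital of France is [MASK]."):
    # Each candidate comes with a probability derived from word
    # co-occurrence statistics, not from any grounded knowledge.
    print(f"{candidate['token_str']:>10}  p={candidate['score']:.3f}")
```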
But when interacting with one of these models, it doesn’t take long to sense that it knows very little about the things the words are representing… but not nothing! It knows something about the words, how they combine, and in what context, enough to use the language well. But what is that something, exactly?
When we interact with a 5-year-old, we recognize that they know something about the world, but maybe not enough to explain it coherently or make full inference chains. They are not operating randomly, yet a lot of what is happening feels random, and their propriety of language seems to reflect that state. When we interact with LLMs, they feel similarly somewhat random in the way they come up with thoughts, but with a structured and curated layer on top of that randomness.
There are lots of questions that these considerations trigger in me.
First off, how can we cure fabulism in LLMs, or at least mitigate it? Letting them loose on the world in their current fabulist state would be the “memetic pandemic” equivalent of an automated virus generator. As Ade insightfully suggested during a conversation, coupling one with ad-targeting systems that seek out conspiracy-minded people would basically turn it into QAnon-as-a-service. The amount of societal harm this could generate is astounding.
Second, are LLMs learning latent properties of the entities behind the words while learning their statistical mechanics? And if so, what are they? Can we extract them? Can we tweak them? Can we augment them with established knowledge representation systems (such as knowledge graphs)? Or, alternatively, can we use such statistical mechanics of language to augment knowledge graphs and make them more nuanced and useful?
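As a toy illustration of the knowledge-graph direction, here is a sketch of what “augmenting” generation with established facts could look like: curated triples are injected into the prompt before the statistical machinery takes over. Every entity, triple, and function name below is hypothetical and exists only to make the idea tangible.

```python
# Hypothetical sketch: ground a language-model prompt with facts pulled
# from a tiny knowledge graph before generation. All entities, triples,
# and function names here are made up for illustration.
KNOWLEDGE_GRAPH = {
    ("Marie Curie", "field"): "physics and chemistry",
    ("Marie Curie", "nobel_prizes"): "two (1903, 1911)",
}

def grounded_prompt(question: str, entity: str) -> str:
    # Collect every fact we have about the entity and prepend it to the
    # question, so the model completes text anchored in curated
    # knowledge rather than pure word statistics.
    facts = [
        f"- {e} {relation.replace('_', ' ')}: {value}"
        for (e, relation), value in KNOWLEDGE_GRAPH.items()
        if e == entity
    ]
    return "Known facts:\n" + "\n".join(facts) + f"\n\nQuestion: {question}\nAnswer:"

print(grounded_prompt("How many Nobel prizes did Marie Curie win?", "Marie Curie"))
```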
There is a lot here that feels both promising and dangerous, but I can’t help finding myself intrigued by how the evolution of artificial cognition is starting to uncover truly alien states of mind that we have never encountered before.