18 Comments
Alex Tolley

I 100% disagree with your premise. LLMs are guessing at the correct response in both correct and incorrect answers, but that is not the same as humans making experimental errors. It is more like human BS, or making illogical guesses at an answer. When we make errors in experiments, it is for a number of reasons, but guessing is not one. I don't even think it is similar to making correlation errors, or even extrapolation errors, as we can quickly check on LLM output, whereas extrapolations, almost by definition, have no data in the extrapolation regime to rely on.

Human BS is based on a lack of facts and poor logic. Politicians seem to be particularly prone to this, though IRL probably no more so than most people; they are just outspoken in public for all to see. Groupthink can lead in the same direction, as can following the thoughts of cult leaders. LLMs using statistical word and sentence prediction work similarly, IMO. However, when we do science or math, we don't do this. Experiments are designed with controls. Conclusions are drawn based on results. Discussions of the results, if speculative, are accepted as being beyond the data and are to be taken with skepticism. Math requires attention to the correct manipulation of symbols, not guessing at answers like young children.

Therefore, to be better, LLMs need to have different architectures. We can reduce hallucinations and improve accuracy, but I fear that unless we can add true understanding, their hallucinations may be as difficult to control as human dreaming, where we accept strange situations and actions that we would not IRL. This may require *ahem* consciousness, of some [limited?] sort. Enough to be able to work through possible guesses and strip out bad answers, and to work through logical chains without the same problems afflicting philosophers. [How does philosophy, with its attention to logic, arrive at different answers to the same question?] The mixture-of-experts models being tried are one way to improve accuracy, but I think they are just a palliative, rather like random forests of decision trees. Understanding a mechanism to think through problems and how to solve them requires a different approach, and possibly an architecture that mimics the thought processes of human experts. When LLMs provide bogus citations, it implies that these are decorative and not actually used to extract information. Even the ones that do provide real citations do not always make the correct determination of what the source says.

In some ways, Doug Lenat's Cyc was a theoretically better way to create an AI, but it fell apart under the weight of its many data objects. My interpretation is that AIs must have the ability to store knowledge, algorithms for extracting and using this knowledge from sources and new data, and then apply the LLMs in the role of an interface to convey the answers. Having a BS machine, operating more like Kahneman's System 1 (fast) thinking, is not the way to go if we want AIs to be more than glib answering machines.

Andrew Condon

What we have here with respect to the LLMs is something like an incommensurability problem - we, humans, do terrible violence to the facts by compressing things into narratives and the algorithms do terrible violence (is it really anything like as bad as humans? I’m not sure) by compressing everything into token prediction.

But humans for sure do the hallucination thing too all the time! And they’re mostly pretty unaware of it or play it down when it happens.

Will Lowe

“When an AI comes up with something false, it’s doing exactly the same thing that it does when it gives you a correct answer!”

Quite so. But as I think others have noted, one way or another, this is more about truth (with a small 't') than about AI. Since small 't' truth is a semantic concept, it relates claims to things, and so it's very much part of the design that there won't be anything, save perhaps lack of internal consistency, that true claims always have and false claims always don't (and even consistency only serves to pick out what the claim claims).

Now, obviously, LLMs have read essentially everything anyone has publicly claimed but have still not met (m)any things, so fingers are going to multiply until they do. But as far as I can see the problem we create by insisting on reading them as making claims that are true is not really that; it's that not having met (m)any things means their claims about their claims are much less reliable than their claims. They are, in short, uncalibrated, and all the tiring obsequiousness and correctability and tendency to apologise is a harness strapped on to stop this lack of calibration getting them into too much trouble with users.

This seems like a genuinely hard problem. Happily or not, it's also a problem with people, so if we figure out ways to deal with it for one lot, we ought to make some progress on the other.

Oliver

Fascinating! I've thought a lot about this too (I was at an LLM research lab in 2022), and done enough Buddhist insight meditation to see how some of my mental thought processes unfold, and I actually think the way humans and LLMs do language isn't all that different. The difference is with humans, we culturally transmit prompt-engineering tricks that we put in our chain-of-thought, so that the output has more rigour.

The simplest example of hallucination in LLMs is asking for book recommendations. Currently they model the distribution of likely book names, and (un)helpfully invent new book names that sound like they'd be mentioned on the internet. But this is a bad model of the data - the true distribution is a discrete set of actual books (but with a power law distribution over the elements of that set - most books are barely read).
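To make that concrete, here is a toy Python sketch (the titles and popularity weights are invented placeholders, not real data): if recommendations are sampled from a closed catalogue with power-law weights, an invented title simply can't appear, whereas free-form generation over plausible-sounding strings can produce one.

```python
import random

# Hypothetical closed catalogue of real titles (placeholders, not real data),
# with a rough power-law over popularity: most books are barely read.
catalogue = ["Title A", "Title B", "Title C", "Title D", "Title E"]
weights = [1 / (rank + 1) ** 1.2 for rank in range(len(catalogue))]

def recommend(k=3):
    # Sampling is constrained to the discrete set of actual books,
    # so a made-up title cannot be emitted.
    return random.choices(catalogue, weights=weights, k=k)

print(recommend())
# An unconstrained LLM instead samples over all plausible-sounding strings,
# which is why it can produce a title that was never in the set.
```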

In theory once they're big enough, trained long enough, and have the right inductive biases, they'll stop hallucinating book names. But it's hard to mathematically express ideas like "books, humans and historical events must be memorised from the training set or prompt, but interpolate everything else" to the training process, and there's long/expensive feedback loops if you mess up.

So I think what'll happen instead is we'll develop a mish-mash of RLHF and prompt-engineering to encode all of the cultural wisdom we teach our kids. (I disagree with the "[humans] don’t need a lot of hand-holding" point - it takes years to get kids to write coherent essays!) Things like "don't make stuff up", "say when you don't know", etc. We already have to do this somewhat because raw LLMs sound like bad reddit comments; you need to do a lot of work to distort the LLM to upweight the parts of the internet you like.

And the simplest example of humans hallucinating (back to my 'insight meditation makes me think humans are like LLMs' point) is generating arguments that don't pass the smell test. Ask someone what their life goals are, and their plans for achieving that goal ... and they'll start generating tokens. Run some basic consistency / rigour checks on that token sequence ... and you'll often find the plan vaguely grammatically resembles a plan with steps and such, but is a terrible model of reality! Often it'll have implicit assumptions you could falsify with a Google search. I wish I were making this up! Our culture fails to transmit the "if you play out this plan in your mind, how does it go?" prompt-engineering trick (what rationalists call "inner simulator"). This isn't that different from checking whether a book name is real; we're just applying a different set of coherence checks to constrain the set of valid token-sequences (like a type system).

gregory byshenk

Some random thoughts in response...

"When people produce misleading or incorrect conclusions by applying an imperfect-but-useful methodology to an incomplete-but-informative dataset, they are, indeed, doing exactly the same things that they do when they happen to come up with robust results."

To use slightly different terminology than yours... yes, when people are using a particular model that fails to generate robust results, they are indeed doing the same thing as when the model succeeds (assuming that they are actually "doing science" - that is, actually attempting to produce useful models rather than producing something that sounds like science for some other end). And the failure of a model suggests modification or replacement, what one might call "incorporating new ideas outside our existing paradigms" - or even new models/paradigms.

But an important part of this is that "doing science" (in the broadest sense - and leaving aside all manner of complications) is about modelling the world, something that we assume (with at least some good reason) to be consistent and coherent. Thus, when a model fails we reasonably assume that the model is what is wrong and we seek a new one. This is something that we have done (in various ways, long before we thought about science) as long as there have been humans.

But LLMs are *not* modelling the world. They are modelling a vast collection of data that - importantly - is *known* to be inconsistent and incoherent. Thus, it is not clear even what would be a criterion for rejecting a model, or even what it would mean to have a model "fail" or "succeed".

[This may be why LLMs can sometimes do well in certain specific domains, ones in which what they are modelling is itself a well-defined coherent (at least mostly) system. And also why "AI" systems can be very good at finding patterns in specific datasets.]

Joe Jordan

I agree with the point you open the article with. For years I have argued, when I am feeling cheeky, that the difference between a creationist and an evolutionary biologist is just that the creationist is wrong, not that they are being unscientific.

The point about expanding datasets I am less sure about though. The output of an LLM is a path through the high-dimensional space that the token vectors live in. We humans have a method to draw new samples from this space and learn about its contours and ridges ("damn, that was stupid, I won't do that again"). But AI doesn't (in most cases). A lot of people making a lot of mistakes for a long time is how we know about the world, and in particular status competition gives us an incentive to try to make the most productive mistakes. What is the corresponding mechanism for an LLM?

TW

Technology democratizes access to something formerly reserved to an elite.

AIs may just democratize access to really smart, well-read people. Or "people."

Andrew Kanaber

The truism about AI hallucinations is mostly trying to stop people being misled by the name. Hallucinating humans are in an altered state of consciousness with distinctive effects and usually a distinct biological cause. Whereas, as you say, AI hallucinations are normal operation.

Andrew Kanaber

i.e., it's to avoid this conversation:

manager: write a bugfix ASAP to stop the model eating shrooms

techie: I'm afraid it's just like that all the time

Marginal Gains

Interesting post. We also set a very high bar for the machines, and I commented about it earlier today on the same topic: https://tinyurl.com/mpr6czuh

John Mutt Harding

"Progress in AI may be more dependent on progress in assembling training data than on faster chips or better algorithms". Has not all of internet already been used as training data? And more and more of new data on the internet will be AI-generated. So AI will feed on itself - or become incestious and degenerate.

Kaleberg

This is why I think LLMs might be useful when trained on mid-sized, specialized, relatively dense, problem-oriented datasets. We've seen this approach work in the sciences, in high-entropy metallurgy and protein structure prediction. These methods work well enough to be useful, but their answers have to be verified. They are not magic, whatever their backers claim.

It helps to remember how LLMs work. There was a lot of work on analyzing text and trying to infer meaning from textual context - that is, figuring out things about words not by actually understanding what they meant but by the way they are used. What they discovered is that it was possible to map words into a high-dimensional space in a way that tended to preserve analogies. For example, the vector direction and distance in that space between boy and man would be roughly the same as between girl and woman. Given this, one could do a certain sort of spatial reasoning that let one categorize and generate text.
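A toy illustration of that analogy-preserving property (the vectors below are made-up three-dimensional stand-ins, not real embeddings): the boy-to-man offset and the girl-to-woman offset come out essentially parallel.

```python
import numpy as np

# Invented 3-d "embeddings", chosen only to illustrate the geometry.
emb = {
    "boy":   np.array([0.1, 0.2, 0.1]),
    "man":   np.array([0.6, 0.2, 0.1]),
    "girl":  np.array([0.1, 0.8, 0.1]),
    "woman": np.array([0.6, 0.8, 0.1]),
}

# "Boy is to man as girl is to woman" appears as two nearly parallel offsets.
offset_1 = emb["man"] - emb["boy"]
offset_2 = emb["woman"] - emb["girl"]

cosine = offset_1 @ offset_2 / (np.linalg.norm(offset_1) * np.linalg.norm(offset_2))
print(round(float(cosine), 3))  # close to 1.0: the offsets point the same way
```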

One of the profound results in mathematics in the early 20th century was that doing mathematics involved working with symbols, but, in the end, the meaning of those symbols would lie in what mathematicians call a model. Any sufficiently complex, regular system of symbolic manipulation would contain statements that could not be settled within the system and could be taken as true or false as one chose. Different choices for truth or falsehood would map the symbols to different models. For example, much of geometry as taught in high school could refer to plane geometry, spherical geometry or hyperbolic geometry. The difference would be in the accepted version of Euclid's 5th postulate.

A mathematician could build a machine to mechanically prove a theorem, but determining the truth of the theorem and in which models it is true requires something else. Mathematicians say that when an android proves a theorem, nothing happens. The corollary is that when a human proves a theorem something does.

LLMs perform a form of mechanical reasoning, reasoning about symbols in a higher dimensional space. The entire field is based on the belief that the mapping of words into that space reflects their meaning. There is no reason to believe this is true, though that spatial mapping clearly captures some aspects of their semantics. An LLM producing a response based on that kind of mechanical manipulation will often be useful, possibly because language itself is useful.

A big limitation of LLMs, however, is that there is no model. There is nothing to tie the symbols to the real world. A lot of the more recent work with LLMs involves trying to tie the token sequences to real-world constraints. So we get step-by-step models that can produce better answers by using smaller integration steps. We get models that validate against themselves or against external semantic models. There is something useful going on, but CEOs and investors are fundamentally stupid, and they're paid well to be stupid, so they let their imaginations roam free. The rest of us pay the price for their omissions.
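As a rough sketch of the "validate against themselves" idea mentioned above (the sampler here is a toy stand-in for a real model call, not any particular API): draw several independent answers and keep the one the model converges on, treating disagreement as a warning that the symbols aren't tied down to anything.

```python
import random
from collections import Counter
from typing import Callable

def self_consistent_answer(sample: Callable[[str], str], prompt: str, n: int = 5) -> str:
    # Sample several independent answers and return the most common one;
    # wide disagreement across samples is a cheap signal of an unreliable answer.
    answers = [sample(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Toy sampler standing in for a stochastic model call.
toy_sampler = lambda prompt: random.choice(["Paris", "Paris", "Paris", "Lyon"])
print(self_consistent_answer(toy_sampler, "What is the capital of France?"))
```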

John Mutt Harding

Thanks for this useful explanation.

TW

Using all of it may be the issue. I suspect using everything in one domain - math (or cooking, or physics, or whatever) - is going to be LLM 2.0.

Indy Neogy

Interesting post. I feel like at the beginning you're in line with the OpenAI theory of scaling, but by the end you've drifted towards a position I feel is stronger, which is that humans contain multiple conceptual schemes (limited, but with the ability to learn more) and are able to select from them (imperfectly) according to context to do (imperfectly again) error checking on their extrapolations.

(This might be a short explanation for what I often refer to even more shorthand-ly as "understanding the meaning of things.")

I'd argue that the current LLM architectures (as we know them - they could have more in the "commercial privacy closet") are not well set up for this overall (although we can see some progress in doing it specifically for maths), and there's some substantial model wrangling required to do it. (substantial = $bns in coding/training/time)

John Quiggin

There's a clear line of descent from the ancestral combination of stepwise regression and discriminant analysis, through machine learning and neural networks, to LLMs.

People who are aware of this (mostly economists) use "data mining" as a pejorative. Those who aren't use it favorably.

Sam Tobin-Hochstadt

I think it's correct that the true things we learned in social psychology via garbage statistics on undergrad experiments were produced the same way as the wrong things. But that shows that the correct lesson to draw is something else -- that the method was garbage in the first place and needed wholesale replacement. I don't have an application to AI handy from that observation, but maybe someone will come up with one.

Chris Deliso

‘This is very wholesome stuff, every word a sermon in itself.’ (F.O'B)

-And not completely without application for the writing of detective fiction.

Thanks, Dan.
