An unofficial motto of this ’stack is, of course, “share the pain”. When there is an irritating, possibly wrong but persistently troubling idea buzzing round my head, I sometimes find temporary release by inflicting it on you lot.
Today’s ration of suffering comes from something I started wondering with respect to a common truism about AI hallucinations:
“When an AI comes up with something false, it’s doing exactly the same thing that it does when it gives you a correct answer!”
This is not wrong! (I’ve said something similar myself on a number of occasions, believing I was being clever on most of them). But the thing that bugs me about it is … isn’t this also true of a lot of human inquiry?
For example, think about experimental psychology, or behavioural economics or nutrition or psephology, or some other area of science with a hell of a lot of spurious or otherwise problematic results. What’s the difference between the good and the bad? Definitely not anything systematically methodological. When people produce misleading or incorrect conclusions by applying an imperfect-but-useful methodology to an incomplete-but-informative dataset, they are, indeed, doing exactly the same things that they do when they happen to come up with robust results. If the reproducibility crisis had hit a few years later, we’d be talking about it as a problem of hallucinations.
I think what is going on here is that text-based LLMs have a structural inclination to really, really rub our noses in the fact that extrapolation from a dataset is difficult. The magical-seeming property that they have comes from their ability to interpolate from their training set and create valid sentences out of tokens. But the thing we find difficult to get our heads around is that the space of tokens is funny-shaped, complicated and multidimensional, and it’s not easy to understand when the input string gets interpreted as an instruction to go outside the space.
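To make that concrete in a deliberately crude way, here is a toy backoff n-gram model in Python - nothing like a real transformer, and the three-sentence corpus, the prompts and the `continue_from` helper are all invented for this sketch. The only point it is meant to illustrate is that the sampling machinery is exactly the same whether the prompt lies inside the training data or outside it:

```python
# A toy sketch, not a formalisation of the argument: a backoff n-gram model
# trained on three made-up sentences. Corpus, prompts and function names are
# all invented for illustration.
import random
from collections import Counter, defaultdict

corpus = [
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of spain is madrid",
]

bigrams = defaultdict(Counter)   # last word      -> counts of next word
trigrams = defaultdict(Counter)  # last two words -> counts of next word
for sentence in corpus:
    w = sentence.split()
    for a, b in zip(w, w[1:]):
        bigrams[a][b] += 1
    for a, b, c in zip(w, w[1:], w[2:]):
        trigrams[(a, b)][c] += 1

def continue_from(prompt, max_words=5, seed=0):
    """Sample a continuation, backing off from trigram to bigram context.
    The code path is identical whether or not the prompt was ever seen."""
    rng = random.Random(seed)
    words = prompt.lower().split()
    for _ in range(max_words):
        dist = trigrams.get(tuple(words[-2:])) or bigrams.get(words[-1])
        if not dist:
            break
        words.append(rng.choices(list(dist), weights=list(dist.values()))[0])
    return " ".join(words)

print(continue_from("the capital of france"))
# -> "the capital of france is paris" (in the data: looks like knowledge)

print(continue_from("the capital of germany is"))
# -> "the capital of germany is rome" (or paris, or madrid, depending on the
#    seed): the unseen context backs off to the same word statistics and
#    yields a fluent, confident-looking, false answer
```

The in-distribution prompt completes correctly; the out-of-distribution one falls back to the same statistics and produces a fluent, confident and quite possibly false capital - same code path, different epistemic status.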
Which makes me think that progress in AI may be more dependent on progress in assembling training data than on faster chips or better algorithms. The way that human beings deal with the problem of extrapolation is to gather more data in a targeted way – if you can bring a case within the scope of the dataset, it’s no longer a problem of extrapolation.
But human beings are really good at incorporating new events and data points; unlike LLMs, they don’t need a lot of hand-holding, massaging and processing to get the new data into a format that’s consistent with their existing conceptual scheme. Although we are by no means perfect at incorporating new ideas outside our existing paradigms, we’re much better at it than currently existing neural networks.
And I think it’s this ability to switch contexts and conceptual schemes (what I generally refer to as “accounting systems”, the often unconscious decisions we make about what part of the variety of the world we want to be able to represent) which is the big problem of “AGI”. The ability to process information in a useful and intelligent way shouldn’t be measured in bits per second, nor should datasets be measured in terabytes. The important thing is the number of genuinely independent conceptual schemes that you have available; diversity is strength.
I 100% disagree with your premise. LLMs are guessing at the correct response in both correct and incorrect answers, but that is not the same as humans making experimental errors. It is more like human BS, or making illogical guesses at an answer. When we make errors in experiments, it is for a number of reasons, but guessing is not one of them. I don't even think it is similar to making correlation errors, or even extrapolation errors: we can quickly check LLM output, whereas extrapolations, almost by definition, have no data in the extrapolation regime to rely on.
Human BS is based on a lack of facts and poor logic. Politicians seem particularly prone to this, though probably, IRL, no more prone than most people; they are just outspoken in public for all to see. Groupthink can lead in the same direction, as can following the thoughts of cult leaders. LLMs using statistical word and sentence prediction work similarly, IMO. However, when we do science or math, we don't do this. Experiments are designed with controls. Conclusions are drawn from results. Discussions of the results, if speculative, are acknowledged as going beyond the data and are to be taken with skepticism. Math requires attention to the correct manipulation of symbols, not guessing at answers like young children.
Therefore, to be better, LLMs need different architectures. We can reduce hallucinations and improve accuracy, but I fear that unless we can add true understanding, their hallucinations may be as difficult to control as human dreaming, where we accept strange situations and actions that we would not IRL. This may require *ahem* consciousness, of some [limited?] sort. Enough to be able to work through possible guesses and strip out bad answers, and to work through logical chains without the same problems afflicting philosophers. [How does philosophy, with its attention to logic, arrive at different answers to the same question?] The mixture-of-experts models being tried are one way to improve accuracy, but I think just a palliative, rather like random forest decision trees. A mechanism that can actually think through problems and how to solve them requires a different approach, and possibly an architecture that mimics the thought processes of human experts. When LLMs provide bogus citations, it implies that these are decorative and not actually used to extract information. Even those that do provide real citations do not always make the correct determination of what the source says.
In some ways, Doug Lenat's Cyc was a theoretically better way to create an AI, but it fell apart under the weight of its many data objects. My interpretation is that AIs must have the ability to store knowledge, algorithms for extracting and using this knowledge from sources and new data, and then apply the LLMs in the role of an interface to convey the answers. Having a BS machine, operating more like Kahneman's System 1 (fast) thinking, is not the way to go if we want AIs to be more than glib answering machines.
What we have here with respect to the LLMs is something like an incommensurability problem - we humans do terrible violence to the facts by compressing things into narratives, and the algorithms do terrible violence (is it really anything like as bad as what humans do? I’m not sure) by compressing everything into token prediction.
But humans for sure do the hallucination thing too all the time! And they’re mostly pretty unaware of it or play it down when it happens.