Generally good, but worth pointing out that the "hallucination" issue in LLMs isn't closely related to the general problem of under/overfitting. The system is trained to find statistically likely follow-up utterances rather than to produce statements which are true, unlike reinforcement systems such as AlphaGo, which are trained directly on the ground truth of victory or defeat. The things it says sometimes coincide with truth, or truthy statements, because of their preponderance in the training set, but there is very likely a hard ceiling on anything we might call accuracy absent some components, which we currently have no idea how to build or integrate, corresponding to the _rest_ of a mind.
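A minimal sketch of the objective being described, with a toy vocabulary and made-up numbers (nothing here comes from any particular model): pre-training scores how probable the observed next token was, and contains no term that checks whether the resulting statement is true. A reward-trained system like AlphaGo, by contrast, gets its signal from whether the game was actually won.

```python
import math

def cross_entropy(predicted_probs, observed_index):
    """Negative log-likelihood of the token that actually followed in the corpus."""
    return -math.log(predicted_probs[observed_index])

# Toy distribution over the next token after "The capital of France is ..."
vocab = ["Paris", "Lyon", "Berlin"]
predicted = [0.7, 0.2, 0.1]

# The loss depends only on what the training text happened to say next.
# If the corpus sentence said "Lyon", the model is nudged toward "Lyon";
# no part of the objective consults an external notion of truth.
print(cross_entropy(predicted, vocab.index("Paris")))  # ~0.36
print(cross_entropy(predicted, vocab.index("Lyon")))   # ~1.61
```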
Right, it's more akin to AlphaGo generating an alien way of playing the game, except without the constraints of any rules.
My opinion on AGI and surpassing human-level intelligence is not important. However, the AI boosters' claims of massive sudden increases in GDP are extraordinary and need extraordinarily robust justification, not mere hand-waving. Sure, assume AI cracks fusion power, say. It'll still be two or three decades before we get a few pilot plants built, and several more decades before there's meaningful economic impact.
Yes, jobs dealing with information, particularly where mistakes don't result in explosions or collapses, are at risk. But the decay process will be slow. We may be in for a repeat of the Long Depression of 1873 to 1899-ish, back to back with the Great Depression of the 1930s.
(https://en.wikipedia.org/wiki/Long_Depression)
OK, can't resist. Why does no one talk about Moravec's Paradox any more? (https://en.wikipedia.org/wiki/Moravec%27s_paradox) Doing things that humans find "skilled" does not impress me. Make me a sandwich. (https://xkcd.com/149/)
or drive me a car ;-)
fold my laundry, wash my dishes.
Then I'll be impressed. Stochastic parrots are not impressive.
Thank you for Moravec; the Paradox is what I am constantly citing, but I still haven't got it into the discourse.
I first encountered it while doing AI for the military in the 80s, when AI was expert systems. That experience inoculated me against AI credulity.
AlphaGo works because the problem space is perfectly defined and limited.
From the Nature paper written by the AlphaGo Zero (AGZ) researchers: "Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules."
The problem with general intelligence AI is that there are no game rules.
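To make the contrast concrete, a toy sketch (my illustration, not the paper's code): within fixed game rules the training signal is machine-checkable once a game ends, while "is this sentence true?" has no analogous oracle to query.

```python
def game_reward(final_winner: str, player: str) -> float:
    """Fully determined by the rules once a game has finished: +1 for a win, -1 for a loss."""
    return 1.0 if final_winner == player else -1.0

def truth_reward(sentence: str) -> float:
    """No such function exists for open-ended language; that is the missing 'game rule'."""
    raise NotImplementedError("there are no game rules here")

print(game_reward("black", "black"))  # 1.0
```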
As commenter Aardvark noted on John Scalzi's review of a new phone,
It’s not like LLMs work OK sometimes and “hallucinate” sometimes. They are never not “hallucinating.” It’s just that when they’re on solid ground with their training data, and haven’t been given a perverse prompt, the hallucinations look like what we expect from reality.
Julia Carrie Wong,
it’s just a word generating machine generating words the only meaning is that which you read into it and imo that should be none
when you examine a text written by a human you can find layers and layers of meaning and intentionality, the complexity of the human consciousness, an opportunity for one mind to commune with another outside the bounds of time and space. when you examine AI text you drown in a teaspoon of nothing.
The history of expert systems illustrates what is likely to happen. I used to have a book describing a pilot study of an expert system diagnostic tool in one of the better-known hospitals in the NHS in England. It massively outperformed the hospital specialists, but was abandoned after fierce but quiet back-room opposition from those same doctors.
It is as likely today, as then, that the unions--ahem: professional associations--will be able to resist very effectively.
Can you remember any other details about this book that you used to have, such as its title?
No, sorry. I'm fairly sure it was a hardback, and a description of various facets of software engineering, but that's all.
Oh OK, I was just wondering in case there could have been any signal degradation along the path from any actual events to the above comment.
Because I had never heard of these robot consultants (diagnosticians?), and the idea that their competence had got exaggerated along the way, and the failure of a system wrongly blamed on militant doctors, is quite plausible, given for example the ongoing attempt to replace doctors with decidedly inexpert 'physician associates' (previously 'assistants'): https://www.ft.com/content/5a533507-f11d-42b2-b67e-e10c0d7c9fb8 . In this case too, doctors' objections are portrayed as 'toxic' protectionism; see top and tail of this Beeb version: https://www.bbc.co.uk/news/articles/c2dly5ldrxjo .
Great article.
I'm increasingly uncomfortable with the concept of hallucination, though. It implies that the model is doing something different when it creates sequences of tokens that are "right" compared to when they are "wrong". But in reality, the process is identical. And it's us dichotomising them into right and wrong.
Even if we could fix this, who decides what is right and what is wrong?
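A toy sketch of that point (assuming nothing about any particular model): generation is the same sampling step whether the emitted token later turns out to be "right" or "wrong"; correctness is a judgement applied afterwards, not a different code path inside the model.

```python
import math
import random

def sample_next_token(logits_by_token):
    """Softmax over scores, then draw one token; there is no 'truth check' branch."""
    exps = {tok: math.exp(score) for tok, score in logits_by_token.items()}
    total = sum(exps.values())
    r, cumulative = random.random(), 0.0
    for tok, e in exps.items():
        cumulative += e / total
        if r <= cumulative:
            return tok
    return tok  # guard against floating-point rounding

# Identical mechanism whether the continuation is factual or not.
print(sample_next_token({"1912": 2.0, "1921": 1.5, "1066": -1.0}))
```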
Great post. The complexity point is very important. It's the 'word processors will save paper' point. That didn't happen - whereas plumbing really did save trips to the river and refrigeration really did reduce food spoilage. So much of the digital plumbing being put into companies is creating more complexity, not less. Typing every thought / task / outcome into a data form does not help the goal to be more effectively advanced. Post-it notes have contributed far more to problem solving than all the painful digital project management / people surveillance tools that now proliferate.
Also, re: hallucinations. I'm not sure it's just a fitting problem? If that were the case, then hallucinations would be sensitive to levels of training data, but I'm not sure there's any correlation there. Isn't the problem that the beast doesn't know what it doesn't know? Of course, many humans share this problem, but we can reduce this noise / bias with collaborative thinking tools (scientific method, dialectic method, understanding of fallacies / cognitive biases etc). So in a really well-structured environment, the hallucinations can be outed through back-and-forth discussion. This does not exist in an AI algorithm. It may be that training them on one another can produce crowd-sourced precision (like humans). But surely we reach a point of reductio ad absurdum here. At what point does the energy cost of creating groups of ultra-intelligent, arguing AIs fall below that of getting a group of well-facilitated humans to work through a problem (with assistance on many tiresome tasks from the machines)?
AI-skeptical commentary I take seriously:
1. Here is the output of $FRONTIER_MODEL when I ask question X/try to do task X. These are the dimensions on which it is wrong. The median human response freely accessible to people is better than this. It is unlikely to get better because there is no source of training data publicly available, and no business model incentivizes creating said data.
2. There is no 2.
The most glaring & fundamental version of the 'agile sprint down a blind alley' problem in this context seems to me to be the focus on 'how soon can we start replacing the less intelligent white-collar humans?' rather than 'are we on the road to producing something that can identify, isolate & explain novel solutions to our difficult problems?'
Turing has a lot to answer for here, but that was a very long time ago & I think it has more to do with the resources for such research having mostly been in the hands of a bunch of spivs embedded in a fast-buck-worshipping culture. Maybe the Chinese will come up with something more interesting.
The contemporary transformer architecture cannot learn anything new by itself, because the training and inference phases are strictly separated. That, and the limited context length, make me think that AGI/ASI is not possible now.
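A minimal sketch of that separation, using a toy PyTorch module rather than a real LLM (standard inference path with gradients disabled is assumed): nothing the model "reads" at inference time changes its weights.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)
model.eval()

before = model.weight.detach().clone()
with torch.no_grad():                 # standard inference: no gradient tracking
    _ = model(torch.randn(1, 8))      # "reading" new input...
after = model.weight.detach().clone()

print(torch.equal(before, after))     # True: the weights learned nothing from it
```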
DeepSeek showed us that the transformer architecture works better than we thought, resource-wise. That will let everybody slap transformers onto everything. You could finally talk with your toaster, explain which kind of toast you desire today, and then spend 20+ minutes persuading "it" to finally energize the heating coil and actually make the toast for you.
Some jobs will be lost, some new jobs gained. As you mention, this AI is great in "analogue" cases where a small error in the AI's output is irrelevant or translates into only a small, tolerable error in real-world action. In cases where even a small error leads to catastrophic problems, transformers are not so useful. I am undecided whether programming belongs to the first or the second category.
The nice thing about writing for an audience on both sides of the Atlantic is that you have a "license" to use whichever spellings you please.
A goal line so ill-defined that the race is over when one or more runners declare victory deserves skepticism. That's AGI in a nutshell.
Yeah, technology replacing a human, as per Altman, is a bar so low we passed it centuries, if not millennia, ago. If only we could agree on what intelligence actually is. Displaying elaborate search results from prompts, while impressive, does not quite cut it.
On both the booster and skeptical sides, I think being clearer about what you mean by AGI is necessary. It's clear that for many definitions of AGI from 15 years ago, it has already been reached, and that ChatGPT is far more intelligent at most tasks than most people. On the other hand, there are lots of things people do that AIs don't, so claims about the redundancy of humans are far overblown. But this is exactly what we should expect for any technology that replaces some human work -- combines and steam engines don't work exactly like peasants with scythes or horses. So it's necessary to be clear about what the questions are, and what kinds of answers one is looking for, in a way that the discussion of AGI rarely is.
I don't think I can agree with that - the "G" has to stand for "general" surely, so if there are loads of things that human beings can do which the AI can't, it's not AGI?
This is what I mean about needing actual definitions. One possible definition is that it should be indistinguishable from a person, like the replicants in Blade Runner. Obviously ChatGPT isn't anything like that. A different definition is something like the Turing test. Modern LLMs definitely pass that. Both of those are plausibly "general", so you need to know what you're saying before having a discussion.
You are just rephrasing Tesler's Theorem ("AI is whatever hasn't been done yet"). This is a favourite of AI boosters and obviously has some foundation. The trouble is that these boosters have pushed it so far they've landed on "AGI is whatever has already been done". I have yet to see an adequate rebuttal to Chollet's objections, for example.
>I think that AI, like every other information technology, will end up creating complexity as well as processing it, that the robots will get in each other’s way just like we do
This is a good point. We already have general intelligence. The issue is coordinating it.
I sometimes think LLM/AI will be useful in the way that PowerPoint is useful, but not like electric power is useful.
Thoughts (mostly not particularly contrary):
1) The key thing about AGI is that while people on the inside of certain firms will often say "we've seen things you wouldn't believe", right now we're still on "keep scaling, and AGI will emerge" and... that's not philosophically impossible, but there are enough conceptual questions about that approach that I'd like to see some things (as opposed to being briefed opaquely) before I buy into any of the more exciting timelines for AGI.
2) Likewise, absent AGI, it's hard to see how the economic benefits kick in quickly - there's lots of potential, but just like previous waves of ICT, getting the impact into the productivity figures requires rearranging how things are done - and that can be slow. I want to note as well that commentators usually imagine it's slow "because people are Luddites", but most of the time it's actually because "economic incentives create inertia or prioritise the short term over the longer term."
3) Along similar lines, I'm less optimistic about how long it is going to take to iron out the hallucinations problem. First: there's some wishful thinking about hallucinations being single incidents, when it's becoming clear that the road to the kind of "thinking power" we want is "Chain of Thought", and a chain means multiple incidents and a multiplication of incidence. You can see this outside of LLMs, in ML more broadly, in a number of the self-driving car papers. Which of course points to a solution - but effectively injecting Deming into the LLM process is another level up in complexity over bringing it into ML image recognition/LIDAR processing etc. Obviously this comment is asking for someone to publish a breakthrough next week and make it look stupid, but as of today there's enough there to suggest this isn't easy.
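A back-of-the-envelope version of that multiplication worry, with illustrative numbers rather than measured error rates: if each step of a reasoning chain is independently right with probability p, the chain is right end to end with probability roughly p**n.

```python
# Assumed per-step reliabilities and chain lengths (illustrative only).
for p in (0.99, 0.95, 0.90):
    for n in (5, 10, 20):
        print(f"p={p:.2f}, n={n:2d} -> end-to-end success ≈ {p**n:.2f}")
```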