stochastic collywobbles
the computer is your colleague, that's why it's so annoying
NEW BOOK ALERT: Long-term readers will recall that I recommended Gill Kernick’s “Catastrophe and Systemic Change: Learning from the Grenfell Tower Fire and other disasters” a few years ago, saying that it was “an extremely good and important book which deals with questions of accountability, complex systems and disastrous failure in exactly the sensitive and perceptive way that I was scared I wouldn’t”.
There is now a second edition available for pre-order, which is greatly revised from the first. The conclusion of the Grenfell Inquiry has made a lot of new information available, but more importantly, has made it much less legally dangerous to be specific about companies and building products which contributed to the disaster. I was lucky enough to blag a pre-publication copy and think anyone who is interested in the general subject matter of “Why don’t we learn?” and “How can we change?” will really enjoy the new edition. If you’re engaged with YIMBY-adjacent topics in construction regulation, I would say it’s absolutely required reading.
Anyway, on with the show. I will try not to let the length of this one get out of control, because it’s one of my favourite subjects – the potential for AI to replace middle managers. I am going to do people like Matthew Prince and Jack Dorsey the courtesy of taking them seriously and assuming that they mean “actually replacing middle managers” here, rather than something anodyne like “increasing the productivity of middle managers so that each manager can supervise a larger unit”. Because if all they meant was “I have changed the capital/labour ratio in a company I run” then come on, that’s a bit of a waste of our attention.
Here are a few statements I regard as axiomatic:
1. Actually running a business is more difficult than buying and selling shares on a liquid market.
2. Picking stocks to outperform the market is something which a neural network can do; we know this because human brains do it.
3. However, most human beings, including most human beings who are paid to do so, can’t.
To me, this implies, straight off the bat, that trying to beat the equity market is an at-least AGI level task, and that trying to manage an organisation is potentially super-AGI level. (I’ve made a similar argument in the past, suggesting that a computer which was really capable of handling planning applications would have to be so smart that we would long since have handed over the whole of government to it). This seems to me at least to be either a refutation of the idea that “AI can replace 20% of my managers”, or a massive self-own on the part of people implicitly admitting that their business is so simple, uncompetitive and overstaffed that current generations of Claude can handle it.
[MASSIVE RABBIT HOLE WHICH I WILL POINT TO BUT NOT GO DOWN. Of course, part of the problem is that stock-picking is an adversarial process in which people compete with one another; doubling everyone’s cognitive capacity in the market would just make it more efficient without changing the proportion of people capable of generating outperformance. But so is managing a business, in most sectors.]
Anyway, I have about 500 words left, so I will sketch out why I think that LLMs, specifically, will have a hard time even solving the easier problem of stock picking. I start from the observation that there are a lot of very successful computer systems which beat the stock market using machine learning; they just don’t look very much like large language models.
In fact, quant systems get their edge over a market dominated by human beings precisely because they don’t reproduce human reasoning; they have rules and stick to them. There are two ways to underperform the stock market – by hanging on stubbornly to losers, or by getting the collywobbles and exiting from a good trade too quickly. (The proverb is that “amateurs go broke by taking big losses, professionals go broke by taking small profits”).
So imagine that we have an LLM with appropriate training and a system prompt that says “you are a really good stock picker. Pick stocks that will go up, hold on to them and when they are about to go down, sell them. Don’t make mistakes.” And it’s picked a good stock, that’s gone up a lot, but has now started to go down a bit. Will it get the collywobbles?
Yes! Because “getting the collywobbles in a good trade” is exactly the same activity as “not hanging on to a loser”, except that you’re wrong. Symmetrically, “hanging on stubbornly to a loser” is the same thing as “not getting shaken out of a good trade”, except you’re wrong. Stock market proverbs are massively contradictory; let your winners ride, but pigs get slaughtered. Cut your losses, but in the long term it’s a weighing machine. The trend is your friend, but be contrarian. Etc etc.
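To make the symmetry concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the prices, the 10% threshold and the function name `drawdown_exit` are assumptions of mine, not a description of how any real quant system works. It applies one fixed "sell after a 10% fall from the peak" rule to two price paths that are identical up to the moment of decision.

```python
# Hypothetical illustration: the prices and the 10% threshold are invented.

def drawdown_exit(prices, max_drawdown=0.10):
    """Return the index at which a simple trailing-stop rule sells, or None."""
    peak = prices[0]
    for i, price in enumerate(prices):
        peak = max(peak, price)
        # Sell once the price has fallen max_drawdown below its running peak.
        if price <= peak * (1 - max_drawdown):
            return i
    return None

shared = [100, 120, 150, 130]       # gone up a lot, now down a bit
winner = shared + [145, 170, 200]   # the dip turned out to be noise
loser = shared + [110, 90, 60]      # the dip was the start of a slide

for name, path in [("winner", winner), ("loser", loser)]:
    i = drawdown_exit(path)
    # The label depends only on prices the rule could not see when it sold.
    verdict = "collywobbles" if path[-1] > path[i] else "cut a loser"
    print(f"{name}: sold at {path[i]}, ended at {path[-1]} -> {verdict}")
```

The rule sells at the same point and the same price on both paths, because at that moment the two paths are indistinguishable. Whether the sale counts as sensible loss-cutting or as the collywobbles is decided entirely by what happens afterwards.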
The trick is to understand what situation you’re in, and to apply the right one of those rules. And, of course, coping with a high-dimensional environment in which different rules are applicable is one thing that neural nets can handle very well.
It’s just that the neural nets which handle it well are the ones which have been specifically trained to do so, like the actual quant models which people actually use. A neural net which has been pre-trained on a massive corpus of human language by a transformer network is just going to get the collywobbles and double down like a human analyst, because that’s what happens in its training data.
The only way to stop it from doing so, or to bring it up to the standard of the best human analysts, would be to create a great big tagged dataset of good and bad trades along with the reasoning that led to them, so as to get better pre-training weights. And then you’d take on the huge additional task of updating the tagged dataset, so that it didn’t go stale as the market and the economy changed. (Stale knowledge and old habits are a huge problem for human stockpickers). But why would you do that, rather than just directly training the neural net to pick stocks?
And this matters, because when you shift focus back from stockpicking to general management, there’s no equivalent of directly training the model on the data. And it’s not even clear that “the data” has a stable referent – management is very much a game of “horses for courses”, in which the curse of dimensionality is very great indeed.
So you’re left with a model that has to be trained on the corpus of human decisions, and which will therefore be likely to reproduce human mistakes. Unless you can hire an army of offshore workers to tag management decisions as “good” and “bad”, I suppose. But that sounds expensive.

So, I agree up to here:
"So you’re left with a model that has to be trained on the corpus of human decisions, and which will therefore be likely to reproduce human mistakes."
The conclusion you draw from this is that there would be no point to such a machine. But that seems under-supported. If the machine which reproduces human mistakes is cheaper to run than the human worker, why wouldn't a business prefer to make that switch?
(I suspect that the machine would in fact have a different distribution of mistakes than a human, and perhaps those mistakes would be so much more expensive than human mistakes [because, e.g., they seem so bizarre that nothing's set up to contain their effects] that even a very cheap machine would be a false economy. But I do think you need a step like that in the argument.)
It's too trite to be universally true, but when I'm feeling lazy I assert that all hard problems are matters of judgement rather than of following the correct rules.