21 Comments
Sam Tobin-Hochstadt's avatar

I think you are misunderstanding what's going on with the tokenmaxxing firms. It's a combination of the following things:

1. Having a token leaderboard was a really bad idea.

2. Firms had not budgeted for the cost of each programmer using 10% of their salary in tokens, even if that makes each one 2x as productive. This will require reorganization rather than just telling everyone to use the tools.

3. One of the great temptations of these tools is to do way too much, because it's really cheap in terms of your time. That's true in the number of projects, in over-engineering, in running 100 unnecessary experiments, etc. And so some employees have to be capped to prevent them wasting company money.

Dan Davies's avatar

I thought that was my understanding of what happened - but the reason that firms went token-happy was that they thought the bottleneck was Luddite employees and that the exchange rate between tokens and human judgment would be favourable enough to make it worth their while. They didn't just fail to budget out of forgetfulness.

Doug Clow's avatar

I'm more cautious about drawing the general lesson here. My hunch is this is the whole system reorganising now the distinction between "the marginal cost of AI tokens is very low" and "AI tokens are free" is becoming salient. We saw this with the dotcom crash and later industry shakeups when the marginal price of data storage and transmission collapsed - but not to zero.

Doug Clow's avatar

I also get the sense (from the edges rather than being deep in frontier model building) that getting AIs to think sensibly about how they spend tokens is at (what is likely to prove) a very early stage, and I doubt we are out of road in terms of letting AI users control more deftly how many tokens their prompts consume.

Dan Davies's avatar

Two or three people have now made this interpretation so it is clearly my fault, but I don't think I was saying that everything's going to collapse or that token costs can't be brought under control. What I think is important is that the maximalist predictions depend on tokens being too cheap to meter - if they are a resource to be allocated to different objectives, then you need something which can decide on the importance of those objectives (which might not be impossible to automate but which I would say is a fair couple of generations away).

Doug Clow's avatar

Yes - I think we agree that tokens (as an arbitrary unit of compute) are not zero cost and never will be, and that thus decisions will always need to be made about how to allocate that resource. And thus the question is whether AI can get good - or at least, good enough - to make those decisions either as well as humans, or less well but more scalably, in the worse-is-better kind of way. (My hunch is we'll get the latter first.) I think we also agree that we are not at that point in AI development yet. But a couple of generations away in frontier AI is not that long.

Dan Davies's avatar

I think they need to be not only "almost as good but scalable" but "almost as good, scalable and not unreasonably expensive themselves"

The Intelligence Arb's avatar

Hey Dan, would love to chat about tokenomics and any thoughts you have on how compute can be refined efficiently into useful work.

We are working on building a model and would love your expertise.

Dan Davies's avatar

Always happy to chat but I'm not sure I have much more to add!

KJZ's avatar

"which seems to confirm the general picture that everyone had of the economics of the big AI firms" Not quite "everyone" – Zitron himself has denied this many times. As recently as this March he listed "They’re profitable on inference" as an example of a "myth". Which is why it's funny that he is the source of this new data.

Crapotkin's avatar

"The race is over and John Henry won against the steam hammer."

Is that a permanent state of affairs, though? Surely the natural next step is to reallocate research from God-in-a-box to boring efficiency improvements.

Matt's avatar

All well and good (really, some excellent points). But you're falling into the usual American trap of looking down.

US AI models are like your cars and trucks. Big. Expensive. "Better".

A few decades back I read a great book about international marketing. It talked about how in the US, a large country of many ethnicities, religions and cultures, marketing tended to the lowest common denominator - bigger, faster, cheaper. In smaller, older nations marketing can play to humour, tradition, private jokes only the locals understand, etc.

I'm currently employed to make AI useful to my employers. I'm not sure how well that's going, but one thing I've found is that with the right approach, the non-US models are just as applicable to solving many issues. But the really important part isn't whether they're 5% better or worse at some benchmark, or even that they're open source so anyone can run them on whatever infrastructure they have available. It's that they are far, far more resource efficient - even doing a passable job on a $300 phone rather than a $5000 GPU (if you can find one) - and if I can't be bothered with that, I can buy them online at a cost that can be over 25x lower.

That means even if they're just a bit dumb, I can use them 25 times more often on all the dumb problems I'm trying to solve. And, in all honesty, the hard problems are typically too hard for the best, most expensive models I can buy anyway.

This is a cultural difference. When you sanction China and don't let it access the best GPUs (which are made 90 miles away in Taiwan), do you think the Chinese say "ok, back to the subsistence agriculture then", or do they, like the British in WW2, say "I've got no money, no time and no choice - I have to find a better way"

This is why the predicted death of AI is premature. Most of us aren't American, and bigger, faster, expensive, better isn't actually a good marketing technique, or a way to build a resilient technology.

Dan Davies's avatar

I think we're talking at cross purposes here because I'm not predicting the death of AI, far from it. What I think is dead is the idea that AI will replace all (or even most, or even "macroeconomically significant numbers of") professional and managerial class employees. I don't think anyone has said that small models running on phones could do that, but a lot of people said it about tokens from the big frontier models

Dave Peticolas's avatar

I think the John Henry analogy might be a bit flawed then. If companies are telling people to use AI sensibly, that's more like John Henry and his bosses realizing that John Henry should be driving the steam hammer...

Matt's avatar

I'm sorry if I mischaracterised your post.

But you're doing it again!

"I don't think anyone has said small models running on phones...". Oh yes they are!

1123581321's avatar

Excellent piece; just want to comment on this one bit that caught my eye because it often goes unchallenged ("Musk said so therefore there's something there" being the unstated back-of-the-mind assumption):

"[...] quite possibly orbital data centres [...]"

No. These are "possible" in the sense that one can be made and sent up there (so's a Tesla), but they don't make any bloody sense even if the launch costs fall to 0. Where do I start!

- There's nowhere to dump the waste heat

- There's nasty radiation that needs to be kept away from the sensitive bits

- Things fail and need to be maintained and replaced - now we need to send humans in spacesuits to do an equivalent of delicate plumbing. The gloves tend to be... awkward.

- Data transmission is a bit of a problem.

That's just off the top of my head. There are problems nobody has even thought about because they will only show up once these things are being sent into orbit.

Les Barclays's avatar

"Maybe it isn’t airlines I should be thinking of; maybe it’s something like nuclear fusion, where there’s a gap between what’s scientifically conceivable and in a sense possible, and what’s economically viable, given the amount of capital investment needed and the long term profitability of the equilibrium."

I haven't likened it to a specific industry but rather an oligopoly where the few dominant players make most of the revenue, and everyone else competes away thin margins. Even measuring ROI on tokens is hard as it's such an arbitrary unit (covered all of this in my post titled 'What is the Return on Tokens?'). I think we'll face an 'ROI reckoning' and corporates will adjust accordingly, it's kind of happening already - see this FT article: https://www.ft.com/content/1d37cc08-e0aa-45a4-a45d-4ad282529314?syn-25a6b1a6=1

In terms of reducing token spend, we're going to see a rise in open-weight models (particularly Chinese ones) and we'll hear about buzzwords like model routing which is a fancy way of saying LLMs will route your prompt to an applicable model depending on what your query/task is. Also, as open-source rises in popularity, the US national security apparatus will sandbag and/or outright ban access to Chinese open-weight models - this is in light of z.ai's GLM 5.2 being on par with Claude Opus 4.8, GPT 5.5 & Gemini 3.1 Flash despite it being open-source. Banning open-source will be a huge blunder if it happens.
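As a rough illustration of what "model routing" means in practice, here is a minimal sketch. Everything in it is hypothetical: the model names, the prices, the capability tiers and the keyword heuristic are invented for the example, and real routers typically use a trained classifier rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                       # hypothetical label, not a real product
    cost_per_million_tokens: float  # hypothetical price
    capability: int                 # 1 = basic, 2 = mid, 3 = frontier


# Hypothetical catalogue, invented for illustration only.
MODELS = [
    Model("small-open-weight", 0.2, 1),
    Model("mid-tier", 2.0, 2),
    Model("frontier", 5.0, 3),
]


def required_capability(prompt: str) -> int:
    """Crude keyword heuristic that guesses how hard a task is."""
    text = prompt.lower()
    if any(word in text for word in ("prove", "architecture", "multi-step")):
        return 3
    if any(word in text for word in ("refactor", "debug", "analyse")):
        return 2
    return 1


def route(prompt: str, models: list[Model] = MODELS) -> Model:
    """Pick the cheapest model whose capability tier covers the task."""
    need = required_capability(prompt)
    eligible = [m for m in models if m.capability >= need]
    return min(eligible, key=lambda m: m.cost_per_million_tokens)


if __name__ == "__main__":
    for prompt in ("Summarise this email", "Debug this function"):
        print(prompt, "->", route(prompt).name)
```

The design choice worth noticing is that the router itself has to be cheap: if deciding where to send a prompt costs as much as answering it, the saving disappears.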

skybrian's avatar

It seems too early to declare victory for John Henry. Companies using tokens in wasteful ways should stop doing that, but after someone works out how to use them more sensibly, we don’t know how many people will be needed. More sensible approaches might scale well and become commonly used.

Perhaps the pace of change will be slower due to needing more time to figure out how best to use it?

Maybe what’s failed is a certain kind of magical thinking that assumes that if you give employees AI, within weeks they will work out for themselves how to use it effectively? It might be more difficult than that.

Edwin Roorda's avatar

"But maintaining a cartel is a tricky business, and looking at the personalities involved, I am not sure they are up to it."

Wonderful comment on the competitive dynamics of the driving personalities and the associated organization behavior.

It's not cooperation, but dominance on display.

mike harper's avatar

What would happen to AI if some smart geek invented desktop AI? Brad Delong has written about his. There is a huge amount of computing power lying unused in people's desks, pockets and purses.

Philip Koop's avatar

"If you need a human being in the loop to decide on the allocation of AI tokens, then all those predictions of mass redundancy are gone."

The pithiest formulation I've seen, and the essence of the matter. I can't remember whether you've commented on Narayanan and Kapoor's "decide-execute-deliver" sandwich (https://substack.com/@aisnakeoil/p-201537309). One limitation of pinning hopes on coding agents is that *coding* (the middle of the sandwich) takes up only a minority of a developer's time. If you could somehow reduce the marginal costs of coding agents, including supervision, to zero, well that would be *valuable*, but it wouldn't double productivity. But if you could apply AI to the other parts of the sandwich, then it would be nonsensical to focus on the effects on the *software* industry; AI would have the maximalist's effect on all aspects of human endeavor.
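The arithmetic behind "it wouldn't double productivity" is the same as Amdahl's law. A minimal sketch follows, assuming a developer's time splits into a coding share and everything else, and that only the coding share is sped up. The 30% figure is a hypothetical chosen for illustration, not a measured number.

```python
def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Amdahl-style speedup when only the coding share of the work gets faster."""
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0:
        raise ValueError("coding_speedup must be positive")
    time_after = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / time_after


def speedup_limit(coding_share: float) -> float:
    """Upper bound on the speedup if the cost of coding fell all the way to zero."""
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_share == 1.0:
        return float("inf")
    return 1.0 / (1.0 - coding_share)


if __name__ == "__main__":
    share = 0.3  # hypothetical: coding is 30% of a developer's time
    print(f"10x faster coding: {overall_speedup(share, 10):.2f}x overall")
    print(f"free coding:       {speedup_limit(share):.2f}x overall")
```

Under this model the limit only reaches 2x when coding is at least half of the job, which is why a minority share cannot double productivity however cheap the agents get.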