11 Comments
Gerben Wierda

Small remark: tokens are generally shorter than words, not longer. The shortest tokens are indeed single characters (and punctuation marks). The longest can be complete words (though they might also be parts of even longer words). The problem with the page numbers is probably that there is little statistical association between token sequences and page numbers, so the errors are to be expected (the model doesn't understand the text, nor what a page number is).

LLMs can be very useful, but on NotebookLM: someone took a post of mine and turned it into a video podcast using NotebookLM. It did not go very well, but to those who had not read the post it will have been very convincing. (https://ea.rna.nl/2025/10/27/ai-generated-podcast-ai-slopcast/)

John Quiggin

Minor nitpick: I misread the head and sub-head as suggesting that it was using the term "adversarial context" that would make the organization dumber, perhaps because it was new-fangled management-speak.

Ziggy

That's something that staff functions (i.e., legal and audit) are useful for. They usually don't have a dog in anybody's fight, and know something about the business. IOW, they are perfect spies for top management--assuming that top management takes them seriously enough to ensure that they're always in the loop. I sometimes think that the main benefit of bank supervision is that it empowers staff against line.

Jeff

I wonder if we can view tech industry adoption of AI through this lens.

Mark

Ah, thanks for the suggestion on how to replace the old version of the EBA Interactive Single Rulebook in my life!

Dan Davies

It does work a treat - to the extent that I'm actually going to ask the EBA if they can make a convenient download of the entire Q&A archives. I had been worried about it hallucinating legislation but I have so far had no problems with that; it does hallucinate page numbers though.

Interesting thing is that I thought it was going to be a game changer but it kind of isn't? It is a lot more convenient and it does mean that I can respond super quick to email questions. But it hasn't really sped up my workflow at all. I think a lot of the problem is something I put on social media the other day - I am not a server rack and cannot immediately reallocate little chunks of 5 minutes of time to other productive uses.

Doug Clow

Yes, agree there’s a lot of interesting stuff in that space. There’s the phenomenon that subunits of the same company can (in theory) operate in a cooperative context rather than an adversarial one, which improves information transfer.

And in the between-firm adversarial context, there is some stuff in game theory/competition/auction theory, but I think there’s rich scope for a cybernetic analysis.

Reminds me of the newbie quant phenomenon of finding a strategy that tests perfectly… until you actively trade in the market and it turns out in the adversarial environment your lunch gets eaten.

The Backseat Policy Critic

“It’s part of the central problem of management cybernetics – making sure that information arrives where it can play a part in decisions, in time to be useful and in a form where it can be accepted as input by the decision maker.”

Perfect timing with this post as I have been rereading Roger Cirillo’s seminal thesis on Operation Market Garden (arguably one of the best pieces of Second World War research around). It’s very long and goes into a lot of detail on military issues, but if you’re looking for the ultimate case study on organisational dysfunction, clashing cultures and the failure of central command systems, I have genuinely yet to find anything better: https://dspace.lib.cranfield.ac.uk/server/api/core/bitstreams/6ac6738e-a2a5-4485-930e-23bfc7d82235/content

Charlie Tangora

There was an interesting video this week relating to AI summarization: a Hacker News commenter asked Claude to go through the full text of the first four Harry Potter books and list all the spells cast, which it did quite a good job at. Another person noted that lists of all the spells in Harry Potter are readily available online and probably part of the model’s training data; so he inserted three short passages introducing new magic spells and repeated the test. None of the models he tried (including Claude, Gemini, and GPT) found the added spells.

The conclusion is that the models are not necessarily working from provided text, even when they appear to be, if the text provided was common in the training data. Instead, they’re probably being pushed into a part of the latent space containing lots of legal summaries or Harry Potter analysis and producing output that resembles existing work online.

Banged Noumena

I think that, for efficiency, NotebookLM converts the various formats you put into it into plain text, which it then processes. It gives inaccurate results because it is looking at a thing that doesn't have pages. I don't think it's a token-related problem.

Dan Davies

Yes, I spent a while finding that out the hard way - "extract a list of all the urls from this document" is not something it can do, because it uploads in a format which doesn't include the URLs.