Discussion about this post

Gerben Wierda:

Small remark: tokens are generally shorter than words, not longer. The shortest tokens are indeed single characters (and punctuation marks); the longest can be complete words (though they might also be parts of even longer words). The problem with the page numbers is probably that there is little statistical association between token sequences and page numbers, so it is to be expected (the model doesn't understand the text, but it also doesn't understand what a page number is).
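The point about token lengths can be sketched with a toy greedy longest-match subword tokenizer over a made-up vocabulary (this is an illustration only, not any real model's tokenization scheme): frequent words survive as single tokens, while rare strings such as digit sequences fall apart into short pieces.

```python
# Hypothetical vocabulary for illustration: some whole words, some subword
# pieces, and single characters as a fallback.
VOCAB = {"token", "tokens", "page", "number", "1", "4", "7"}

def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first (greedy longest match).
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # Not in the vocabulary: fall back to a single-character token.
            pieces.append(word[i])
            i += 1
    return pieces

print(tokenize("tokens"))      # a frequent word stays one token
print(tokenize("147"))         # digits split into separate tokens
print(tokenize("pagenumber"))  # a longer word splits into word-length pieces
```

This is why page numbers are hard for a model: "147" carries no statistical tie to the surrounding token sequences, whereas common words do.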

While LLMs can be very useful, on NotebookLM: someone took a post of mine and turned it into a video podcast using NotebookLM. It did not go well, but for those who did not read the post it will have been very convincing. (https://ea.rna.nl/2025/10/27/ai-generated-podcast-ai-slopcast/)

John Quiggin:

Minor nitpick: I misread the head and sub-head as suggesting that it was the use of the term "adversarial context" that would make the organization dumber, perhaps because it sounds like new-fangled management-speak.
