Writing
- Chapter 1: Tokenization — Where Language Becomes Discrete
  Why tokenization is the first information bottleneck in LLMs, shaping compression, context length, multilingual behavior, and symbolic performance.
- Chapter 0: The Design Space of Language Models
  A working model for understanding LLMs as compressed, lossy, differentiable databases of language.