Looped language model training cannot control hidden-state norm growth because RMSNorm normalizes scale away before the loss sees it. A paper posted today on arXiv identifies this readout blind spot, ...
When a standard large language model (LLM) is confronted with a problem, it tries to solve it by matching it to similar information it has seen before, and then give an answer based on those past ...