LLMs and World Models, Part 2

Melanie Mitchell

April 5, 2026

Introduction

Perhaps the most widely cited evidence for emergent world models in LLMs is a pair of studies that focus on the simple board game Othello. This second installment examines research on transformers trained on Othello. It presents evidence that these networks may encode game-state information in their internal activations, and it discusses counterarguments suggesting that they rely on "bags of heuristics" rather than on coherent abstract models.