3,467 Moves Under a Microscope
Every LLM move from 68 games, evaluated by Stockfish — what the data reveals about where language models reason well and where they fall apart.

Qg8#
89 moves, zero illegal attempts, and a checkmate — the story of the first time a language model pipeline beat Stockfish, playing at 1320 Elo.

The 3x3 Matrix
Mapping chess cognitive demands to real-world task combinations — which tasks overload language models for the same reasons chess does?

The $100 Chess Game and What Came After
How cost constraints, rate limits, and a creative use of Claude Code transformed a failing experiment into a human-AI partnership against Stockfish.

36 Games, 799 Moves, and the Shape of Failure
How the system broke across 36 games, how failures evolved, and what the data reveals about LLM chess.

Twelve Tools, Four Agents, and One Pipeline
The architecture of a system that gives a language model eyes, critics, and a structured workflow for playing chess against Stockfish.

When a Prompt Isn't Enough
The origin story of a project that started with broken LaTeX, led through cognitive load theory, and ended with a language model playing chess.

The best way to understand something is to break it here and there.
Cross-domain experiments — cognitive load theory meets chess engines, Bloom's taxonomy meets Q/A generation, GPU kernels meet ML intuition.