The Science of Reading

Here's something that sounds wrong but is well established: we don't read words by recognizing their shapes. For decades, typographers and reading enthusiasts believed in the "word shape" model: the idea that we recognize words as whole visual patterns, by the silhouette their ascenders, descenders, and neutral characters form. This turns out to be almost entirely wrong, and the actual mechanism is stranger and more interesting.

Three Models

The history of word recognition science is a story of three competing models, each replacing the last as evidence accumulated.¹

Word shape (the oldest model, from Cattell in 1886): we recognize words as complete visual units, like recognizing a face. The "bouma shape" — the envelope around the word's outline — is the primary signal. This seemed supported by the Word Superiority Effect (we recognize letters faster within words than in isolation) and by the fact that lowercase text reads faster than uppercase.

Serial letter recognition (Gough, 1972): we read letter-by-letter, left to right, like looking up a word in a dictionary. This explained why shorter words are recognized faster, but it couldn't explain the Word Superiority Effect at all — if we process letters serially, a letter in the third position should take three times as long to recognize as a letter alone.

Parallel letter recognition (the current model): we process all the letters in a word simultaneously, and the letter information activates word candidates in our mental lexicon. Seeing "W-O-R-K" activates all words with W in position 1, O in position 2, etc., and the word with the most activation wins. This is the model most psychologists now accept.
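The activation idea is easy to sketch in code. What follows is a toy illustration, not a model from the literature: the six-word `LEXICON`, the one-point-per-matching-letter scoring, and the same-length restriction are all assumptions made to keep it short.

```python
from collections import Counter

# Hypothetical toy lexicon; a real mental lexicon holds tens of thousands of words.
LEXICON = ["work", "word", "fork", "wore", "cork", "walk"]

def activate(letters, lexicon=LEXICON):
    """Give each candidate one unit of activation per letter that matches in position."""
    scores = Counter()
    for word in lexicon:
        if len(word) != len(letters):
            continue  # simplification: only same-length candidates compete
        scores[word] = sum(1 for seen, stored in zip(letters, word) if seen == stored)
    return scores

def recognize(letters, lexicon=LEXICON):
    """Return the most activated candidate, or None if nothing competes."""
    ranked = activate(letters.lower(), lexicon).most_common(1)
    return ranked[0][0] if ranked else None

print(activate("work"))
print(recognize("WORK"))
```

All four letter positions contribute at once, so no position has to wait for the one before it. That is the difference from the serial model.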

The Moving Window

The most compelling evidence comes from eye-tracking studies using the "moving window" paradigm. Researchers replace all text outside a certain radius of the reader's fixation point with the letter x, then measure how reading speed changes as the window grows.¹
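To make the manipulation concrete, here is a minimal sketch of the masking step. The function name `moving_window`, its parameters, and the choice to leave spaces intact by default are my assumptions; a real experiment updates the display from live eye-tracker data on every fixation.

```python
def moving_window(text, fixation, right, left=0, mask="x", keep_spaces=True):
    """Mask every character outside [fixation - left, fixation + right].

    `fixation` is the index of the fixated character. Spaces are kept by
    default (an assumption) so that word boundaries stay visible.
    """
    out = []
    for i, ch in enumerate(text):
        visible = fixation - left <= i <= fixation + right
        if visible or (keep_spaces and ch == " "):
            out.append(ch)
        else:
            out.append(mask)
    return "".join(out)

# Fixating the "r" of "read" with 3 letters visible to its right.
print(moving_window("we read by letters", fixation=3, right=3))
# → xx read xx xxxxxxx
```

Setting `keep_spaces=False` masks the spaces as well, which removes word-length information along with the letters.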

With just 3 letters visible past the fixation, readers manage 207 words per minute. With 9 letters: 308 wpm. With 15 letters: 340 wpm, the same as unrestricted reading. The gains taper off rather than growing linearly: the first six extra letters add about 100 wpm, the next six only about 30. Our effective perceptual span during reading is about 15 letters, and it is lopsided: the information we use lies to the right of fixation. If you mask the word to the left that was just fixated, reading speed doesn't drop at all.
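A quick arithmetic check on the three reported speeds shows how much each extra visible letter is worth in each interval. The numbers are the ones quoted above; the helper name `gain_per_letter` is mine.

```python
# Letters visible right of fixation -> words per minute, as quoted above.
SPEEDS = {3: 207, 9: 308, 15: 340}

def gain_per_letter(speeds):
    """Return (from_window, to_window, wpm gained per extra letter) for each interval."""
    points = sorted(speeds.items())
    return [
        (w0, w1, (s1 - s0) / (w1 - w0))
        for (w0, s0), (w1, s1) in zip(points, points[1:])
    ]

for w0, w1, gain in gain_per_letter(SPEEDS):
    print(f"{w0:>2} -> {w1:>2} letters: {gain:.1f} wpm per extra letter")

full = SPEEDS[15]
for window, wpm in sorted(SPEEDS.items()):
    print(f"{window:>2} letters: {wpm / full:.0%} of unrestricted speed")
```

Each extra letter is worth roughly 17 wpm between 3 and 9 letters and roughly 5 wpm between 9 and 15, so a 9-letter window already delivers about 91% of full reading speed.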

Here's the part that kills the word shape model: reading speed is the same whether the window reveals whole words or just the equivalent number of letters. A 9-letter window gives the same speed as a window of two complete words, which average 9.6 letters. Word shape information adds nothing beyond what letter information already provides.

What We Actually Do When We Read

Eye movement data reveal that reading is nothing like the smooth left-to-right scan we imagine. Our eyes make discrete jumps called saccades, typically 7 to 9 letters long and lasting 20-35 milliseconds. Between saccades, we fixate on a word for about 200-250 milliseconds. During a saccade, we're functionally blind. About 10-15% of saccades go backwards (regressions to re-read something), and most readers are completely unaware this happens.¹

Fixations aren't random. They never land between words. They usually land just left of the middle of a word. Short function words are frequently skipped entirely. During each fixation, we're doing three things simultaneously: recognizing the fixated word, gathering preliminary letter information about the next word, and using word-length information out to 15 letters to plan where the next saccade should land.

The boundary study paradigm clinched the case for letter-based processing. Researchers change a word during the saccade (when the reader is blind) and measure how much the pre-saccade preview helped. Words that share letters with the target help more than words that share only word shape. And here's the decisive finding: when the preview is the same word but in ALL CAPS — different visual form, same letters — reading speed is identical to the control condition. We're abstracting letter identities across saccades, not visual shapes.
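The two kinds of similarity being contrasted here can be written down directly. This is a rough sketch under my own assumptions: the ascender and descender letter sets, the `A`/`D`/`N` encoding, and the preview strings `chart` and `obmuk` are invented for illustration and are not stimuli from the studies.

```python
# Assumed letter classes for a typical lowercase alphabet.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

def bouma(word):
    """Encode a word's outline: A = ascender height, D = descender, N = neutral."""
    shape = []
    for c in word:
        if c.isupper() or c in ASCENDERS:
            shape.append("A")  # capitals all reach ascender height
        elif c in DESCENDERS:
            shape.append("D")
        else:
            shape.append("N")
    return "".join(shape)

def shared_letters(preview, target):
    """Count positions where preview and target hold the same letter, ignoring case."""
    return sum(a == b for a, b in zip(preview.lower(), target.lower()))

target = "chest"
for preview in ["chest", "CHEST", "chart", "obmuk"]:  # invented previews
    same_shape = bouma(preview) == bouma(target)
    print(preview, shared_letters(preview, target), same_shape)
```

The all-caps preview shares all five letters with the target but none of its outline, while `obmuk` shares the outline and no letters. The studies above found that a preview of the first kind helps as much as the identical word, and that shared letters help more than shared shape.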

Why This Matters

The word shape myth persists because it's intuitive. We feel like we recognize words as wholes. And the evidence that seemed to support it — lowercase reads faster than uppercase, consistent-shape misspellings are missed more often — all turned out to have simpler explanations. Lowercase reads faster because we practice it more; misspelling detection is driven by letter similarity, not word shape similarity, as Paap, Newsome, and Noel showed in 1984.

This has practical implications for typography and design, but what I find most interesting is what it reveals about the gap between subjective experience and cognitive reality. Reading feels holistic. It's actually a massively parallel letter-processing system that runs so fast we experience it as effortless. The smoothness of the experience is the product of a complex machinery we have no introspective access to — much like how consciousness itself might be a smooth surface over a deeply non-obvious substrate.

¹ Kevin Larson, "The Science of Word Recognition."
