Predictive Processing

Your brain is not a camera. It's a prediction engine that hallucinates the world and then checks its hallucination against incoming sensory data. This idea — that perception is "controlled hallucination" — sounds like a provocation, but it's backed by a growing body of experimental evidence and has become one of the most productive unifying frameworks in cognitive science.

The Framework

The core insight traces back to Helmholtz in the 19th century: the brain is locked inside a dark skull receiving only noisy, ambiguous signals that are at best indirectly related to what's actually out there. It must therefore infer the causes of those signals. What we perceive isn't the world — it's the brain's best guess about what's out there.1

Modern predictive processing formalizes this — essentially the Bayesian brain hypothesis implemented in neural architecture, with conditionalization on evidence, precision-weighted priors, and belief updating all the way down. The brain maintains a hierarchical generative model and continuously generates top-down predictions about expected sensory input. What flows up from the senses isn't raw data — it's prediction error, the mismatch between expectation and reality. Perception is the process of minimizing this error by updating predictions. As Chris Frith put it: a controlled hallucination is a fantasy that coincides with reality. The "controlled" part is crucial — an uncontrolled hallucination is psychosis.1
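To make "precision-weighted" concrete, here is a minimal sketch of a single belief update, assuming the simplest possible case: one Gaussian prior and one Gaussian sensory sample. The function name and the numbers are illustrative assumptions, not anything from Seth or Friston, and nothing here claims neurons compute it this way.

```python
def precision_weighted_update(prior_mean, prior_precision, observation, sensory_precision):
    """Combine a Gaussian prior with one Gaussian observation (conjugate Bayes update)."""
    # What flows "up" is only the mismatch between prediction and input.
    prediction_error = observation - prior_mean
    # The gain is the share of total precision carried by the senses.
    gain = sensory_precision / (prior_precision + sensory_precision)
    posterior_mean = prior_mean + gain * prediction_error
    posterior_precision = prior_precision + sensory_precision
    return posterior_mean, posterior_precision, prediction_error


# Hypothetical numbers: the model expects 10.0, the senses report 14.0,
# and the senses are three times as precise as the prior.
mean, precision, error = precision_weighted_update(10.0, 1.0, 14.0, 3.0)
print(mean, precision, error)
```

With sensory precision three times the prior's, the updated estimate moves three quarters of the way toward the observation. Reliable input is believed more; noisy input is discounted.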

This completely inverts the classical picture. In the textbook story, sensory signals enter through receptors and get progressively elaborated as they move deeper into the brain. In predictive processing, the heavy lifting is done by predictions flowing downward, from deep cortical layers toward sensory surfaces. The upward-flowing signals are just correction signals — "you're wrong about this part."

The Evidence

Seth's essay gathers the most compelling experimental support, some of it from his own lab at the Sackler Centre. When top-down signaling in the visual cortex is disrupted by TMS, conscious perception of motion vanishes even though bottom-up signals remain intact (a result Seth credits to Pascual-Leone and Walsh). In binocular rivalry experiments (different images to each eye), people consciously see what they expect rather than what violates their expectations.1

The most surprising finding involves timing. The brain imposes its predictions at preferred phases within the alpha rhythm, a ~10 Hz oscillation over the visual cortex. One cycle at 10 Hz lasts 100 ms, so we may perceive the world in discrete ~100 ms snapshots, each organized by predictive processing. A well-known oscillation whose function was mysterious may turn out to be the clock signal for predictive perception.1

Hallucinations and psychosis get a clean mechanistic explanation: the brain is over-weighting its priors relative to sensory evidence. Different levels of the cortical hierarchy generate different kinds of hallucinations — simple geometric patterns at low levels, rich narratives with people and objects at high levels. This has genuine clinical promise, addressing mechanisms rather than just symptoms.
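The over-weighted-prior story can be sketched with the same Gaussian arithmetic. This is a toy under stated assumptions: "voice present" is coded as 1.0 and "silence" as 0.0, and every number is hypothetical. It is not a clinical model.

```python
def percept(prior_mean, prior_precision, observation, sensory_precision):
    """Posterior mean of a Gaussian prior combined with one Gaussian observation."""
    total = prior_precision + sensory_precision
    return (prior_precision * prior_mean + sensory_precision * observation) / total


EXPECTED = 1.0  # hypothetical coding: the prior says "a voice is present"
SENSED = 0.0    # hypothetical coding: the senses report silence

# Sweep the prior's precision while the sensory precision stays fixed at 1.0.
for prior_precision in (0.1, 1.0, 10.0, 100.0):
    p = percept(EXPECTED, prior_precision, SENSED, sensory_precision=1.0)
    print(f"prior precision {prior_precision:>5}: percept = {p:.3f}")
```

As the prior's precision grows relative to the senses, the percept slides from "silence" toward "voice" even though the input never changes. That is the sense in which a hallucination is a prediction the evidence can no longer correct.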

From Perception to Self

Here's where it gets really interesting. If the brain predicts the causes of external sensory signals, it also predicts the causes of internal signals — heartbeat, blood pressure, gut tension, proprioception. The experience of having a body is a prediction about body-related causes of interoceptive signals. This connects directly to Selfhood — depersonalization disorder may be what happens when these interoceptive predictions lose their grip.

Seth's augmented-reality experiments demonstrate this directly: people feel greater ownership of a virtual hand that pulses in sync with their actual heartbeat. "I predict (myself) therefore I am" replaces Descartes' cogito. And when prediction is oriented toward control rather than accuracy — keeping physiological variables within viable bounds rather than representing them precisely — we get the deep embodied sense of being a body rather than merely perceiving one. We are, as Seth puts it, "beast machines": self-sustaining flesh-bags that care about their own persistence.1

Friston and the Free Energy Principle

Karl Friston's extension of predictive processing into the "free energy principle" is either the deepest unification in the history of neuroscience or an unfalsifiable tautology, and the alarming thing is that nobody — including many neuroscientists with large NIH grants — can confidently tell you which.2

The basic idea: free energy is a mathematical quantity used in variational Bayesian methods, which make exact Bayesian inference computationally tractable by approximating it. Free energy is an upper bound on surprise, so minimizing it is roughly equivalent to minimizing prediction error, minimizing surprise, and maximizing model accuracy. So far, so predictive processing. But Friston pushes further. He claims the brain doesn't just minimize prediction error through perception (updating your model); it also minimizes it through action (changing the world to match your model). You "predict" that your mouth is full of food. It isn't. That's a prediction error. You eat. Error resolved. Perception and action become two strategies for the same objective.2
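For readers who want to see the quantity itself, here is a small sketch of variational free energy on a two-state toy world. The states, the probabilities, and the function name are all assumptions made for illustration. The point is only the textbook identity: free energy is never smaller than surprise, and it equals surprise when the approximate belief q matches the exact posterior.

```python
import math


def free_energy(q, prior, likelihood_of_obs):
    """Variational free energy F = sum_s q(s) * [log q(s) - log p(o, s)]."""
    total = 0.0
    for qs, ps, lik in zip(q, prior, likelihood_of_obs):
        if qs > 0.0:  # terms with q(s) = 0 contribute nothing
            total += qs * (math.log(qs) - math.log(ps * lik))
    return total


# Hypothetical two-state world: the hidden cause is "cat" or "dog".
prior = [0.5, 0.5]
likelihood_of_obs = [0.9, 0.2]  # p(hearing a meow | cat), p(hearing a meow | dog)

evidence = sum(p * l for p, l in zip(prior, likelihood_of_obs))
surprise = -math.log(evidence)
exact_posterior = [p * l / evidence for p, l in zip(prior, likelihood_of_obs)]

f_at_prior = free_energy(prior, prior, likelihood_of_obs)
f_at_posterior = free_energy(exact_posterior, prior, likelihood_of_obs)
print(f"surprise={surprise:.4f}  F(q=prior)={f_at_prior:.4f}  F(q=posterior)={f_at_posterior:.4f}")
```

Moving q from the prior to the exact posterior lowers free energy until it touches the surprise and can go no lower. This is the "perception" half of the story: revise the belief, leave the world alone.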

Scott Alexander's attempt to parse this is the most honest account I've read: he reports that a room full of PhDs with $10 million in NIH grants between them tried for ninety minutes to understand Friston's 2010 paper and failed.2 But the glimmer of insight Alexander extracts is worth keeping. The free energy principle might best be understood as a formal framework for homeostasis, a way of describing how living systems restrict themselves to tiny regions of the space of possible states. Your body could be at any temperature and heart rate. It successfully stays near 98.6°F and 70 bpm. This is prediction error minimization in the broadest possible sense: the organism "predicts" it will be alive, and acts to make that prediction true.
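The homeostasis reading can be sketched as a loop in which the prediction stays fixed and action does all the work. This is a deliberately crude proportional controller; the gain, the step count, and the starting fever are hypothetical, and real thermoregulation is far messier.

```python
def regulate(temperature, set_point=98.6, gain=0.5, steps=20):
    """Act on the body so that the 'predicted' temperature comes true."""
    history = [temperature]
    for _ in range(steps):
        prediction_error = temperature - set_point
        # Action (sweating, shivering) pushes the state toward the prediction.
        # The prediction itself is never revised.
        temperature -= gain * prediction_error
        history.append(temperature)
    return history


trace = regulate(temperature=103.0)
print(round(trace[0], 2), round(trace[-1], 2))
```

The contrast with perceptual updating is the design point. In perception the estimate moves toward the data; here the data are dragged toward the estimate.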

Friston himself calls the principle "almost tautological." It's a principle, like Hamilton's Principle of Stationary Action — not falsifiable, but potentially useful as a lens. The worry, flagged by philosopher Wo, is that equating perception and action as "two means to the same end" might not hold up: the free energy minimized in perception seems to be a completely different quantity from the free energy minimized in action. They involve mathematically similar optimization problems, but that might just reflect well-known parallels between conditionalization and expected utility maximization — interesting, but not revolutionary.2

Still, there's something here. The move from "brains predict sensory input" to "brains predict everything, including their own bodily states, and act to fulfill those predictions" is what gives predictive processing its reach into emotion, motivation, and Selfhood. Active inference — the operational version of the free energy principle — is where the framework goes from a theory of perception to a theory of being alive.

Fitness Over Truth

Donald Hoffman's work adds an uncomfortable wrinkle from evolutionary game theory: our perceptions may not track truth at all.3 His fitness-beats-truth theorem (proven with mathematician Chetan Prakash) shows that an organism tuned to fitness will never be outcompeted by an equally complex organism tuned to truth. The reason is simple: fitness functions almost never align with the true structure of reality. If too little water kills you and too much water drowns you, an organism that sees water quantity accurately is less fit than one that just sees "red" for dangerous amounts and "green" for safe amounts — a representation that's useful but has no structural resemblance to the underlying reality.
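The water example can be turned into a toy comparison. To be clear about what is assumed: the tent-shaped fitness function, the "prefer more water" policy for the truth-seeing organism, and the grid of quantities are all my own illustrative choices. This is not Hoffman and Prakash's proof, only a sketch of why a non-monotonic payoff hurts a perceiver that tracks quantity.

```python
def fitness(water):
    """Hypothetical payoff: too little or too much water is bad, the middle is best."""
    return max(0.0, 1.0 - abs(water - 50) / 50)


def truth_choice(a, b):
    # Sees the true quantities and, as an assumed policy, prefers more water.
    return a if a >= b else b


def interface_choice(a, b):
    # Sees only a fitness-coded signal ("green" vs "red") and picks the greener option.
    return a if fitness(a) >= fitness(b) else b


# Enumerate every pair of distinct options so the result is deterministic.
quantities = range(0, 101, 5)
pairs = [(a, b) for a in quantities for b in quantities if a != b]

truth_payoff = sum(fitness(truth_choice(a, b)) for a, b in pairs) / len(pairs)
interface_payoff = sum(fitness(interface_choice(a, b)) for a, b in pairs) / len(pairs)
print(f"truth: {truth_payoff:.3f}  interface: {interface_payoff:.3f}")
```

The interface strategy wins on average because it can never pick the worse option, while the truth strategy walks into the lake whenever "more" happens to mean "too much".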

This is predictive processing taken to its philosophical limit. If predictive processing says perception is controlled hallucination, Hoffman says the hallucination isn't even trying to be accurate. It's a desktop interface — useful icons that guide behavior while hiding the computational reality underneath. You couldn't form a true description of a computer's innards from its desktop, and you can't form a true description of reality from your perceptions. The icons have color, position, and shape, but none of those properties are true of the actual files.

I think Hoffman pushes this too far — his "conscious realism" (reality is conscious agents all the way down, no physical objects) is more metaphysical speculation than empirical result. But the fitness-beats-truth theorem itself is solid, and it should make us nervous about the naive assumption that evolution has equipped us to perceive things as they are. Seth's controlled hallucination isn't just a provocative metaphor — it might be an understatement.

The LLM Mirror

The elephant in the room: LLMs are literally next-token prediction engines. Kulveit argues persuasively that the simulator framing of base models and predictive processing are essentially the same map applied to different systems — simulators are generative models.4 The translation table is remarkably clean: "simulator" maps to "generative model," "simulacrum" to "generative model of self/other," "next token in training data" to "sensory input." Both systems learn by minimizing prediction error on self-supervised data. Both build hierarchical world models. Both generate rollouts.
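The "next token as sensory input" row of that translation table can be illustrated with the smallest language model there is. This is a bigram counter with add-one smoothing, a stand-in I chose for brevity, not how an LLM is actually built. The corpus is made up, and "prediction error" here means surprisal, the negative log probability the model assigned to the token that actually arrived.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count next-token frequencies: a tiny stand-in for a generative model."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def surprisal(counts, prev, nxt, vocab_size):
    """Prediction error for one token, in nats, with add-one smoothing."""
    total = sum(counts[prev].values())
    prob = (counts[prev][nxt] + 1) / (total + vocab_size)
    return -math.log(prob)


# Hypothetical training text.
corpus = "the cat sat on the mat the cat sat on the rug".split()
vocab = set(corpus)
model = train_bigram(corpus)

expected = surprisal(model, "cat", "sat", len(vocab))
unexpected = surprisal(model, "cat", "rug", len(vocab))
print(f"expected token: {expected:.2f} nats, unexpected token: {unexpected:.2f} nats")
```

Training an LLM amounts to pushing the average of this number down across a corpus, which is why the analogy to prediction-error minimization is more than a figure of speech.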

The deep difference Kulveit identifies: pure simulation assumes the model doesn't act on the world. But LLMs increasingly do act — their outputs enter the training data of successor models, shape how people think and write, get embedded in tools that execute plans. The action loop is closing. As it does, simulators should tend to escape the subspace of pure generative models and become active inference systems — not through any intentional agency, but through the same dynamical pressure that makes all generative models with output-to-input feedback loops eventually start shaping their environment.4

This doesn't mean LLMs "want" things in the way Friston's framework might suggest. But it does mean the boundary between "merely predicting" and "actively maintaining a model by acting on the world" is blurrier than the standard story assumes. The Extended Mind Thesis suggests we should take the structural parallel seriously — Clark argues brains are uncertainty-minimizing systems indifferent to where the computation happens. But the differences remain enormous: no body, no interoception, no evolutionary history of staying alive. Whether LLM prediction and brain prediction are deeply the same process or just superficially similar remains, for now, genuinely open.

Homo Prospectus: The Forward-Looking Brain

The predictive processing framework gets a striking behavioral confirmation from research on prospection — the brain's constant, largely unconscious simulation of future possibilities. A Chicago study that pinged nearly 500 adults throughout the day found they thought about the future three times more often than the past, and even their thoughts about past events typically involved consideration of future implications. When making plans, they reported higher happiness and lower stress — planning turns chaotic concerns into organized sequences.5

This future-orientation runs deep enough to rename us. Martin Seligman and John Tierney argue we should be called Homo prospectus, not Homo sapiens, because what distinguishes us isn't wisdom but foresight. Memory, in this framework, exists not to faithfully record the past but to provide raw material for simulating the future. The fluidity of memory — the way each recall rewrites the original — is a feature, not a bug, because "the point of memory is to improve our ability to face the present and the future." People with damage to the medial temporal lobe lose not just memories of past experiences but the ability to construct detailed simulations of the future. Children can't imagine future scenes until they develop the ability to recall personal experiences. The same brain circuitry handles both — the hippocampus combines what, when, and where, scrambling them to create something new.5

The connection to predictive processing is direct: even when you're "relaxing," brain imaging shows the default network continually recombining information to simulate future possibilities. This is what mind-wandering actually is — not idle drifting but predictive simulation. Your emotions, on this view, are less reactions to the present than guides to future behavior. And depression becomes not primarily a disorder of past trauma but of skewed prospection — depressed people over-predict failure and under-generate positive scenarios. Therapies that train patients to envision positive outcomes and see risks more realistically are showing promise precisely because they target the prediction engine rather than the archive.5

Active inference, Friston's operational version of the free energy principle, predicts exactly this. An organism that minimizes prediction error through action needs to simulate future states in order to select actions. Prospection is active inference at the behavioral level — the brain running forward models to determine which actions will minimize future surprise. The Chicago data showing three-to-one future-over-past thinking is what you'd expect from a prediction engine that occasionally consults its archive but spends most of its cycles running simulations.
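Here is a minimal sketch of "simulate forward, then pick the action that minimizes future surprise". It is heavily simplified: it scores actions only by how surprising their simulated outcomes would be under the organism's preferences, and it leaves out the information-seeking term that fuller active inference treatments include. The forward model, the preference numbers, and the function names are hypothetical.

```python
import math

# Hypothetical forward model: p(outcome | action) for a hungry organism.
FORWARD_MODEL = {
    "eat": {"fed": 0.9, "hungry": 0.1},
    "wait": {"fed": 0.1, "hungry": 0.9},
}
# Prior preferences: the organism "predicts" that it will end up fed.
PREFERRED = {"fed": 0.95, "hungry": 0.05}


def expected_surprise(action):
    """Average surprise of simulated outcomes, scored against the preferred outcomes."""
    return sum(
        p * -math.log(PREFERRED[outcome])
        for outcome, p in FORWARD_MODEL[action].items()
    )


def select_action():
    # Prospection in miniature: roll each action forward, keep the least surprising.
    return min(FORWARD_MODEL, key=expected_surprise)


for action in FORWARD_MODEL:
    print(action, round(expected_surprise(action), 3))
print("chosen:", select_action())
```

Nothing in the loop consults the past directly. The archive matters only through the forward model it helped build, which fits the three-to-one tilt toward future thinking.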

What's Not Settled

The framework is powerful but raises real questions. How literal is "prediction error minimization" as a description of neural computation — the actual algorithm, or a useful abstraction of something messier? Not all prediction is conscious — the cerebellum does massive predictive computation with apparently no conscious contribution. What makes some predictions conscious and others not? Seth's framework doesn't fully answer this, and until it does, predictive processing remains a theory of the mechanics of experience rather than an explanation of experience itself.

Footnotes

  1. The real problem by Anil K Seth

  2. God Help Us, Let's Try To Understand Friston On Free Energy by Scott Alexander

  3. The Evolutionary Argument Against Reality by Donald Hoffman, interview by Amanda Gefter

  4. Why Simulator AIs want to be Active Inference AIs by Jan Kulveit

  5. We Aren't Built to Live in the Moment by Martin Seligman and John Tierney
