
Decision-Theoretic Paradoxes

The great paradoxes of decision theory aren't puzzles waiting for solutions. They're stress tests that expose foundational cracks in how we think about probability, rationality, and what it means to make a choice. Each one takes a seemingly obvious principle — "bet according to expected value," "update on evidence," "never accept a sure loss" — and constructs a scenario where following that principle leads somewhere absurd.

The St. Petersburg Paradox

A coin is tossed repeatedly until it lands tails. The pot starts at $2 and doubles with each heads. You win whatever's in the pot when tails appears. What's a fair entry fee?

Expected value says infinity. Each possible outcome contributes exactly $1 to the expected value (probability 1/2^n times payout 2^n = 1, for all n), and you're summing infinitely many ones. So rational-agent theory, naively applied, says you should mortgage your house to play this game. But nobody would actually pay more than about $20.[1]
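To make the divergence concrete, here is a minimal sketch that sums the first few terms exactly. The function name and the idea of capping the game at a maximum number of tosses are illustrative assumptions, not part of the original setup.

```python
from fractions import Fraction


def truncated_expected_value(max_tosses: int) -> Fraction:
    """Expected payout if the game is cut off after max_tosses tosses.

    Tails on toss n has probability 1/2**n and pays 2**n dollars.
    Sequences that would run past the cap are assumed to pay nothing.
    """
    terms = (Fraction(1, 2**n) * 2**n for n in range(1, max_tosses + 1))
    return sum(terms, Fraction(0))


for cap in (10, 100, 1000):
    # Every term equals 1, so the truncated value equals the cap itself.
    print(cap, truncated_expected_value(cap))
```

Raising the cap raises the value one dollar at a time, which is all "expected value says infinity" amounts to here.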

The standard resolution, Daniel Bernoulli's diminishing marginal utility, is unsatisfying. Yes, your 10-billionth dollar matters less than your first, but for any unbounded concave utility function you can construct a Super St. Petersburg game whose payoffs grow fast enough to restore the divergence. Bernoulli's fix patches one game while leaving the underlying problem intact.
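A rough sketch of both halves of that argument, assuming logarithmic utility u(x) = ln(x) and ignoring the player's starting wealth (both simplifications made here for illustration). The Super game's payout of exp(2^n) dollars is one hypothetical choice that defeats log utility; the function names are likewise made up.

```python
import math
from fractions import Fraction


def expected_log_utility(max_tosses: int) -> float:
    """Truncated expected utility of the standard game under u(x) = ln(x)."""
    return sum(math.log(2**n) / 2**n for n in range(1, max_tosses + 1))


def super_game_expected_log_utility(max_tosses: int) -> Fraction:
    """Same utility, but tails on toss n is assumed to pay exp(2**n) dollars.

    ln(exp(2**n)) = 2**n, so every term is (1/2**n) * 2**n = 1.
    The utility is computed symbolically to avoid overflowing exp().
    """
    terms = (Fraction(2**n, 2**n) for n in range(1, max_tosses + 1))
    return sum(terms, Fraction(0))


# The standard game converges: expected utility approaches 2 * ln(2),
# which corresponds to a sure payment of about 4 dollars.
certainty_equivalent = math.exp(expected_log_utility(200))
print(round(certainty_equivalent, 6))

# The Super game diverges again, one unit of utility per extra toss.
print(super_game_expected_log_utility(50))
```

The same trick works against any utility function that grows without bound: pick payouts whose utility doubles with each toss.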

The real resolution, as Johannes Koelman argues following Hugo Steinhaus's 1949 paper, comes from statistical physics: the ensemble average (expected value) is the wrong kind of average.[1] What matters is the time average: what happens when you play repeatedly. The average gain per game over the first 2^n games grows as n+2, which is unbounded but only logarithmic in the number of games played. The enormous payoffs that drive the expectation to infinity turn up too rarely to lift the running average any faster. A payout of about a million dollars (2^20) shows up roughly once in a million games, and even then it adds only about a dollar per game to your running average.
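The n+2 figure can be checked with a deterministic stand-in for a long run of games: the sequence 2, 4, 2, 8, 2, 4, 2, 16, ..., in which every payout appears with its long-run frequency. This is assumed here to match the construction the text attributes to Steinhaus; the function names are hypothetical.

```python
def steinhaus_payout(k: int) -> int:
    """Payout of the k-th game (k >= 1) in the sequence 2, 4, 2, 8, 2, 4, 2, 16, ..."""
    # Number of trailing zero bits of k: 0 for odd k, 1 for k = 2, 6, 10, ...
    trailing_zeros = (k & -k).bit_length() - 1
    return 2 ** (trailing_zeros + 1)


def average_gain(num_games: int) -> float:
    """Running average payout per game over the first num_games games."""
    total = sum(steinhaus_payout(k) for k in range(1, num_games + 1))
    return total / num_games


for n in (1, 5, 10, 20):
    # Over the first 2**n games the average comes out to exactly n + 2.
    print(n, average_gain(2**n))
```

Doubling the number of games adds one dollar to the average, so a million games (n = 20) still averages only 22 dollars per game.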

The technical term is non-ergodicity. Ergodicity is the assumption that time averages equal ensemble averages — that what happens to one player over many games equals what happens to many players in one game. Most familiar gambles are ergodic, which is why expected value usually works. St. Petersburg breaks ergodicity. The expected value is formally correct but practically meaningless because no finite agent will ever play enough games for the large-n terms to matter. The divergence is real in the mathematical sense and irrelevant in every practical sense.

This connects to a broader theme in Bayesian Epistemology: the choice of which average to compute isn't a mathematical question but a modeling question. If you're a casino running the game a million times, the ensemble average is relevant. If you're one person playing once, the time average (or more precisely, the growth rate of your wealth) is what matters. The "paradox" dissolves once you realize that "expected value" is a specific tool appropriate for specific situations, not a universal law.

Sleeping Beauty

You're put to sleep. A fair coin is flipped. If heads, you're woken on Monday. If tails, you're woken on Monday, memory-wiped, and woken again on Tuesday. Each time you wake, you're asked: what's the probability the coin landed heads?

Halfers say 1/2 — the coin is fair, you learned nothing new, so the probability shouldn't change. Thirders say 1/3 — if Beauty were a betting agent, she'd encounter twice as many tails-wakings as heads-wakings, so she should bet as if the probability is 1/3.

Chris Leong's analysis argues persuasively that both answers are "valid" under different extensions of standard probability theory, and that the real question is which extension you're using.[2] Standard probability theory doesn't have a concept of "now": it doesn't natively handle situations where you're asked the same question multiple times and might not know which asking you're in. The thirder answer requires extending probability to be "consciousness-state centered," counting each waking as a separate epistemic position. The halfer answer requires normalizing across multiple queries of the same person. Neither extension is objectively correct; they're different tools for a situation classical probability wasn't built for.
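The two counting conventions can be put side by side in a few lines. This is only a sketch of the bookkeeping, with hypothetical names, and it does not settle which convention is the right one.

```python
from fractions import Fraction

# A fair coin, and the wakings each outcome produces.
COIN = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
WAKINGS = {"heads": ["Monday"], "tails": ["Monday", "Tuesday"]}


def per_experiment_credence() -> Fraction:
    """Halfer-style count: each run of the experiment is weighted once."""
    return COIN["heads"]


def per_waking_credence() -> Fraction:
    """Thirder-style count: each waking is weighted once."""
    expected_heads_wakings = COIN["heads"] * len(WAKINGS["heads"])
    expected_total_wakings = sum(COIN[c] * len(WAKINGS[c]) for c in COIN)
    return expected_heads_wakings / expected_total_wakings


print(per_experiment_credence())  # 1/2
print(per_waking_credence())      # 1/3
```

The arithmetic is uncontroversial in both functions; the disagreement is over which denominator the question "what's the probability of heads?" is asking about.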

Radford Neal's approach — condition on all available information, including the random sequence of experiences between waking and being interviewed — technically works but produces probabilities that depend on how much randomness you pre-commit to observing. If you pre-commit to answering only on seeing a specific experience sequence, you get something near 1/3. If you pre-commit to answering on every sequence, you get 1/2. The fact that your "probability" for heads changes based on an arbitrary pre-commitment should raise red flags. It means you're not computing what you thought you were computing.

The deepest insight from the Sleeping Beauty debate is methodological: when a probability problem generates persistent disagreement among competent reasoners, the problem is almost always that the question is ambiguous, not that one side is making a mathematical error. "What is the probability of heads?" sounds like a crisp question. It's not. It's two different questions wearing one sentence.

Dutch Books and Coherence

The Dutch Book argument is the traditional justification for probabilism: if your degrees of belief don't satisfy the axioms of probability, a cunning bookie can construct a set of bets that guarantees you a net loss.[3] Since you don't want guaranteed losses, you should have coherent credences. QED.
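A minimal worked case, assuming the agent treats credence times stake as a fair price for a bet and buys one bet on A and one on not-A. The credences of 3/5 each and the function name are made-up illustrations.

```python
from fractions import Fraction


def dutch_book_payoffs(credence_a: Fraction, credence_not_a: Fraction, stake: int = 1) -> dict:
    """Agent's net payoff in each world after buying a bet on A and a bet on not-A.

    Each bet pays `stake` if it wins and nothing otherwise, and is bought at
    credence * stake. Exactly one of the two bets wins in every world.
    """
    total_price = (credence_a + credence_not_a) * stake
    net = stake - total_price
    return {"A true": net, "A false": net}


# Incoherent: credences in A and not-A add up to 6/5, so the agent overpays.
print(dutch_book_payoffs(Fraction(3, 5), Fraction(3, 5)))

# Coherent: credences add up to 1, and the same pair of bets breaks even.
print(dutch_book_payoffs(Fraction(3, 5), Fraction(2, 5)))
```

With the incoherent credences the agent is down 1/5 of the stake whether or not A is true, which is the guaranteed loss the argument trades on.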

The argument is less airtight than it appears. As the Stanford Encyclopedia's extensive analysis makes clear, the gap between "your beliefs violate the axioms" and "you actually lose money" is wide.[3] You might never encounter a cunning bookie. You might refuse to bet. Even if forced to bet, the bookie might reverse the direction and guarantee you a win (the so-called Czech Book argument). Being incoherent doesn't mean losing; it means being exploitable in principle, and the step from exploitable-in-principle to irrational-in-practice requires more argument than is usually provided.

Depragmatized versions of the Dutch Book argument try to avoid these issues by casting coherence as a consistency constraint rather than a practical one. The idea is that incoherent beliefs are inconsistent in the same way that contradictory beliefs are inconsistent — not that they'll cost you money, but that they amount to giving conflicting evaluations to the same option. Armendt calls this "divided-mind inconsistency." But even this doesn't quite work: you can violate coherence by having less-than-full confidence in a logical truth (rational if you haven't proven it yet), and the consistency analogy between full beliefs and partial beliefs is shakier than it looks.

What the Dutch Book argument really establishes is something more modest: in highly constrained situations (forced betting, competitive opponents, linear utility), having coherent credences is prudent. This is useful but much weaker than "all rational agents must have probabilistic beliefs." The probabilistic framework is a tool, not a metaphysical fact about the structure of rational minds. It's the right tool for many situations. It's a mediocre tool for Sleeping Beauty and a meaningless tool for St. Petersburg. Knowing which situations it fits — this is Calibration And Measurement at the meta-level — is more important than knowing the axioms themselves.

Functional Decision Theory and Its Discontents

The Newcomb problem and its cousins have inspired increasingly exotic decision theories, most notably Yudkowsky and Soares's Functional Decision Theory (FDT). Where Causal Decision Theory asks "what would happen if I chose A?", FDT asks "what would happen if the right answer were A?" That is, it evaluates actions by supposing that all agents running the same decision algorithm would choose the same way.[4]

FDT recommends one-boxing in Newcomb's Problem, cooperating in the Prisoner's Dilemma with a twin, and refusing to pay blackmailers. The appeal is clear: FDT agents end up richer than CDT agents in these scenarios. Wolfgang Schwarz's referee report, which he published after the paper was rejected from an academic journal, offers the sharpest critique of why the theory nonetheless has deep problems.[4]
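The "end up richer" claim is easy to check for Newcomb's Problem. The sketch below assumes the textbook prize amounts ($1,000,000 in the opaque box, $1,000 in the transparent one) and a predictor that is right with a fixed probability; none of these figures come from the text above, and the code compares average winnings only, not the decision theories themselves.

```python
from fractions import Fraction

OPAQUE_PRIZE = 1_000_000   # assumed: placed in the opaque box if one-boxing was predicted
TRANSPARENT_PRIZE = 1_000  # assumed: always present in the transparent box


def expected_winnings(accuracy: Fraction) -> dict:
    """Average winnings of committed one-boxers and two-boxers.

    `accuracy` is the assumed probability that the predictor foresees
    the agent's actual choice.
    """
    one_box = accuracy * OPAQUE_PRIZE
    two_box = (1 - accuracy) * OPAQUE_PRIZE + TRANSPARENT_PRIZE
    return {"one-box": one_box, "two-box": two_box}


print(expected_winnings(Fraction(99, 100)))
```

Under these assumptions one-boxers come out ahead whenever the predictor's accuracy exceeds 50.05 percent, the point at which the two averages are equal.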

The first issue is that FDT's recommendations can seem genuinely insane in specific cases. You're being blackmailed for $1: the cost of paying is trivial, the cost of refusing is ruinous. FDT says don't pay, because if you were the kind of agent who doesn't pay, you probably wouldn't be blackmailed in the first place. But you are being blackmailed. Not being blackmailed isn't on the menu. Schwarz's verdict: "I say you'd be insane not to pay the $1."[4]

The deeper technical problem is that FDT requires reasoning about counterpossible suppositions: what would be the case if a mathematical function that actually outputs A instead output B. We have no good models for this kind of reasoning. Under the standard rules of mathematical logic, anything follows from a false mathematical supposition, which would make FDT trivially incoherent. And FDT's evaluation of acts can depend on consequences in other decision problems faced by other agents running the same algorithm, leading to bizarre results like the Procreation case: FDT says you should have children even though it will make your life miserable, because if FDT recommended against procreation, your father (who also used FDT) might not have had you.[4]

Arif Ahmed's "Betting on the Past" thought experiment cuts from a different angle and may be even more troubling for non-EDT theories.[5] You're offered two bets on a proposition P, where P is "the past state of the world was such as to cause you to take Bet 2." Bet 1 strictly dominates: it pays better in every state of the world. But the moment you choose Bet 1, you learn P is false (the past caused you to take Bet 1), and you've lost. If you'd taken the dominated Bet 2, you'd have won.
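The structure is easier to see with numbers attached. The payoffs below (Bet 1 pays 10 if P and -1 otherwise; Bet 2 pays 1 if P and -10 otherwise) are illustrative assumptions chosen so that Bet 1 dominates; the text above does not specify them, and the function names are hypothetical.

```python
# PAYOFF[bet][whether P is true]; the figures are assumed for illustration.
PAYOFF = {
    "Bet 1": {True: 10, False: -1},
    "Bet 2": {True: 1, False: -10},
}


def dominates(a: str, b: str) -> bool:
    """True if bet `a` pays strictly more than bet `b` whether or not P holds."""
    return all(PAYOFF[a][state] > PAYOFF[b][state] for state in (True, False))


def realized_payoff(bet: str) -> int:
    """Payoff actually received, given that P is true exactly when Bet 2 is taken."""
    p_is_true = bet == "Bet 2"
    return PAYOFF[bet][p_is_true]


print(dominates("Bet 1", "Bet 2"))  # True
print(realized_payoff("Bet 1"))     # -1
print(realized_payoff("Bet 2"))     # 1
```

Dominance compares the bets column by column, but the choice itself selects the column, so the dominated bet is the only one that ever pays out.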

This is essentially a Newcomb problem where the "predictor" is the causal history of the universe, which requires no supernatural powers, just determinism. Ahmed argues this shows that even Logical Decision Theories that handle standard Newcomb cases may fail when the correlation between your choice and the world-state is purely physical rather than algorithmic. The implication is radical: maybe simple Bayesian conditioning on your own actions (Evidential Decision Theory) was right all along, and the elaborate machinery of causal and logical decision theories was solving the wrong problem.[5]

The Pattern

What unites these paradoxes is that each one reveals a hidden assumption in our formalization of rationality. St. Petersburg shows that expected value assumes ergodicity. Sleeping Beauty shows that probability assumes a unique epistemic position. Dutch Books show that coherence assumes a particular relationship between belief and action. When those assumptions hold, the formalisms work beautifully. When they don't, the formalisms produce paradoxes — not because reality is paradoxical, but because the map has edges.

The rationalist response to these edges is the same as the response to any Maps And Territories problem: notice the edge, don't pretend it isn't there, and build a better map. This is genuinely hard. Three centuries of brilliant mathematicians haven't fully resolved St. Petersburg. But the direction of progress is clear: away from universal formal principles and toward a more nuanced understanding of which tools work where.

Footnotes

  1. "Statistical Physics Attacks St. Petersburg: Paradox Resolved" by Johannes Koelman

  2. "Sleeping Beauty Not Resolved" by Chris Leong

  3. "Dutch Book Arguments" by Stanford Encyclopedia of Philosophy

  4. "Review of Yudkowsky & Soares's Functional Decision Theory" by Wolfgang Schwarz

  5. "'Betting on the Past' by Arif Ahmed" by Johannes Treutlein
