Goodnight Wiki / Formal Semantics as Interpretation

Formal Semantics as Interpretation

There's a beautiful structural parallel between natural languages and programming languages that most people in either field don't fully appreciate. When a linguist says a sentence has "meaning," they're describing a mapping from syntactic structure to some semantic domain — truth values, possible worlds, whatever. When a computer scientist says a program has "semantics," they're describing a mapping from abstract syntax to behavior — values, effects, computations. The striking fact is that these two mappings can be described using exactly the same mathematical machinery.1

The Core Idea

The starting point is simple: both natural and programming languages have an infinite number of well-formed expressions, so you can't list them all. Instead, you define a finite set of building blocks (words, or constructors), explain how to compose them, and provide systematic rules for translating any composition into either form (how it looks or sounds) or meaning (what it does or is true of).

A linguist draws this as:

Sound ← Syntax → Meaning

A computer scientist draws:

Concrete syntax ← Abstract syntax → Semantics

The left arrow is a pretty printer (linguists call it "phenogrammatics"); the right arrow is an evaluator or compiler. Both are interpreters — functions from abstract expressions to concrete representations. A pretty printer interprets the expression as a string; an evaluator interprets it as a truth value. The expression itself is abstract, parametric in what interpretation you choose to give it.

Chung-chieh Shan, Oleg Kiselyov, and Jacques Carette showed that this parallel is not just a metaphor — it's a formal identity. You can represent an expression in either language as a polymorphic program that invokes constructors of an abstract data type. Instantiate the type as strings and you get pronunciation. Instantiate it as truth values and you get meaning. The same abstract expression, two interpretations. In Haskell, these are different type class instances. In OCaml, they're different modules for the same signature.

Quantifier Scope as Continuation-Passing

The payoff comes when you hit scope ambiguity — the classic problem in natural language semantics. "Someone saw everyone" is ambiguous: does "someone" take wide scope (there's a specific person who saw everyone) or narrow scope (for each person, there's someone who saw them)? These readings correspond to two different abstract expressions that happen to pretty-print identically but evaluate differently.

The mechanism that handles this is continuation-passing style (CPS) — a standard technique in programming language theory. In the natural language interpretation, "everyone" and "someone" are higher-order functions that take predicates as arguments: everyone f = f John && f Mary. Computer scientists recognize this immediately as CPS. Logicians recognize a representation theorem. Linguists call it "quantifier raising." They're all describing the same mathematical structure.

This isn't a coincidence. Both natural and programming languages face the same fundamental problem: compositionality with scope. When you nest quantifiers, or nest variable bindings, or nest any kind of scope-introducing construct, you need a way to determine which scope takes priority. CPS is one of the cleanest solutions, and it was discovered independently in multiple fields because the problem it solves is universal.

Why This Matters

The practical implication connects to statistical vs. symbolic linguistics. The statistical approach to natural language (LLMs, neural machine translation) handles scope ambiguity implicitly — the model picks the more probable reading. The formal approach handles it explicitly — each reading corresponds to a different derivation. The tagless interpreter framework shows that these aren't as different as they seem. Both are interpreters of the same abstract structure; they just produce different outputs (probabilities vs. truth values).

The deeper lesson is about the relationship between language design philosophy and natural language. Chomsky's context-free grammars, invented to describe human language, turned out to describe programming languages better. But the interpretive framework flows the other direction: techniques developed for programming language semantics (CPS, partial evaluation, type-theoretic grammars) turn out to illuminate natural language in ways that purely linguistic theories missed. The arrow between fields runs both ways, and the best work lives at the intersection.

Footnotes

  1. Translations by Chung-chieh Shan and Oleg Kiselyov — source

Open in stacked reader →