Reflective Towers of Interpreters

Imagine a language where you can change the meaning of any program — and the meaning of the meaning, and the meaning of the meaning of the meaning — as the program is running. Each level of meaning has its own REPL, its own program counter, and each level can inspect and modify the level below it. This is a reflective tower of interpreters, and it is both one of the deepest ideas in programming language theory and one of the least known outside of a small circle of researchers.¹

The Infinite Stack

Brian Cantwell Smith introduced reflective towers in the early 1980s as a semantic model of reflection. The setup: a user program (level 0) is interpreted by a meta-level evaluator (level 1), which is itself interpreted by a meta-meta-level evaluator (level 2), and so on, potentially to infinity. Each level can reify its program into data at the level above (turning running code into an inspectable object) and reflect data from above back into running code below.¹

Smith's analogy is evocative: "It is as if we were creating a magic kingdom, where from a cake you could automatically get a recipe, and from a recipe you could automatically get a cake." Reification gives you the recipe from the cake; reflection bakes the cake from the recipe. The levels are causally connected — changing the recipe at level 1 changes how the cake behaves at level 0.

In Black, a concrete reflective tower by Kenichi Asai, this plays out practically. When a user-level program hits an error — say, calling a number as a function — control passes up to the meta level, where you get a REPL to inspect and modify the situation. You can fix a closure's environment, redefine what "application of a number" means, or install a tracing evaluator that instruments all variable accesses. The meta-level evaluator is defined piece-wise, so modifications are surgical: change base-apply to handle numbers, and suddenly (2 3 4) returns 24 instead of an error.¹

Collapsing the Tower

The obvious problem with towers of interpreters is performance: each level of interpretation adds overhead, and the overhead compounds exponentially. If you're running Python in a JavaScript VM on an emulated CPU on an ARM processor, the accumulated interpretive overhead is brutal.

Amin and Rompf's solution is to collapse the tower — transform an arbitrary stack of interpreters into a single-pass compiler that eliminates all interpretive overhead. The key insight comes from high-performance program generators: stage polymorphism. An evaluator is written so that it can act either as an interpreter (executing code directly) or as a compiler (generating code for later execution), depending on a runtime parameter. When you wire up a tower of stage-polymorphic interpreters, only the topmost layer generates code; the intermediate layers pass staging commands through without executing them. The result is a one-pass compiler from the top language to the base language, regardless of how many intermediate interpreters sit in between.²

This is more than an optimization trick. It reveals something fundamental about the relationship between interpretation and compilation. The first Futamura projection — the well-known result that staging an interpreter yields a compiler — is a special case. Collapsing towers generalizes this to arbitrary depths, including towers where each level modifies the semantics of the level below. A CPS-style interpreter becomes a CPS-converter. A tracing interpreter becomes a tracing compiler. Interpreters, when staged, become program transformers.²

Why This Matters

Reflective towers feel esoteric, but they clarify something practical: what does it mean for a language to be fully reflective? Most "reflective" languages offer a grab-bag of specific capabilities — inspecting classes at runtime, eval-ing strings as code, intercepting method dispatch. Reflective towers provide a uniform model: every aspect of the language's semantics is accessible as data at the meta level, modifiable, and reflectable back into running code. Everything else is a special case of this.

The connection to Forth is worth noting. Forth's IMMEDIATE words and CREATE/DOES> achieve a limited form of the same thing — they let you modify the compiler from within the language being compiled. The difference is that Forth's meta-level is ad hoc (you're directly manipulating a code array), while reflective towers provide a principled, compositional framework. Similarly, Lisp's macro system operates at compile time on syntactic structures, but it doesn't give you access to the evaluator's environment or continuation — it's reflection without the tower.

The collapse result also connects to compiler correctness. If we can prove that collapsing a tower preserves semantics, then we have a principled way to derive verified compilers from interpreters. Write an interpreter (which is easy to get right), then collapse it into a compiler (which is hard to get right but is now derived automatically). This is essentially what Amin and Rompf demonstrate for their Pink and Purple languages, and it suggests a research program: instead of writing and verifying compilers directly, write and verify interpreters and derive compilers from them.

The philosophical dimension is that reflective towers make explicit something that's usually hidden: the infinite regress of evaluation. When you run a program, something evaluates it. What evaluates the evaluator? Something else. What evaluates that? In practice, we stop at hardware and wave our hands. Reflective towers face this regress honestly and tame it with a default interpreter that kicks in wherever the tower hasn't been explicitly modified — a lazy, on-demand realization of the infinite stack. It's the programming language equivalent of turtles all the way down, except you can actually inspect each turtle and tell it to do something different.

Reflective Towers of Interpreters by Nada Amin — source ↩ ↩² ↩³
Collapsing Towers of Interpreters by Nada Amin and Tiark Rompf — source ↩ ↩²

Linked from

Programming Languages Overview
The convergence of compiler correctness and proof assistants suggests a future where verified compilation is derived from verified interpretation through reflective towers — write the interpreter, collapse it into a compiler, and the verification fol…

Open in stacked reader →

Reflective Towers of Interpreters

The Infinite Stack

Collapsing the Tower

Why This Matters

Footnotes

Linked from