Log-Structured Data

Here's a heretical thought: the database as we know it might be inside-out. For decades, the canonical architecture has been stateless application servers talking to a mutable database — a giant pile of shared mutable state that we've been taught to treat as the ground truth. But there's an older, simpler abstraction hiding inside every serious database: the log.

Martin Kleppmann noticed something interesting about database replication.[1] When a leader node replicates a write to its followers, it typically doesn't send the original UPDATE statement, which is imperative, destructive, and order-dependent. Instead, it sends a logical log: an immutable record of what changed, when, and from what to what. The update "change customer 123's quantity of product 999 from 1 to 3" becomes the fact "at time T, customer 123 changed their quantity of product 999 from 1 to 3." The imperative statement destroys information; the log entry preserves it. Even if the customer later removes the item, the fact that this change occurred remains true forever.
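To make the contrast concrete, here is a minimal sketch of the same cart change expressed both ways. The `QuantityChanged` record, its field names, and the timestamps are illustrative assumptions, not the format any particular database uses for its replication log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class QuantityChanged:
    """One immutable fact in the log (hypothetical record shape)."""
    at: datetime
    customer_id: int
    product_id: int
    old_quantity: int
    new_quantity: int


# Mutable state: the imperative update overwrites the previous value.
cart = {(123, 999): 1}
cart[(123, 999)] = 3  # the old quantity, 1, is no longer recoverable

# Log: the same change is appended as a fact and never edited afterwards.
log = [
    QuantityChanged(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
                    customer_id=123, product_id=999,
                    old_quantity=1, new_quantity=3),
    # Removing the item later is another fact, not an erasure of the first.
    QuantityChanged(datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc),
                    customer_id=123, product_id=999,
                    old_quantity=3, new_quantity=0),
]


def current_quantity(entries, customer_id, product_id):
    """Derive the current state by replaying the facts in order."""
    quantity = 0
    for entry in entries:
        if (entry.customer_id, entry.product_id) == (customer_id, product_id):
            quantity = entry.new_quantity
    return quantity
```

Replaying the log yields the current quantity, while the earlier entry still records that the quantity was once changed from 1 to 3.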

This isn't a minor implementation detail. It's a fundamental shift in how we think about data.

Everything Is Derived

Kleppmann's key insight is that secondary indexes, caches, and materialized views are all the same thing: derived data. A secondary index is a redundant data structure that restructures the base table for faster lookups along a different dimension. A cache is the result of transforming database records through some business logic and storing the result for faster reads. A materialized view is a query result that's been precomputed and written to disk. In every case, the derived data can be rebuilt from scratch using the original data plus a transformation function.
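The "original data plus a transformation function" framing can be sketched in a few lines. The `users` table and the two builder functions below are hypothetical; the point is only that each derived structure is a pure function of the base data and can be thrown away and recomputed.

```python
from collections import defaultdict

# Base data: primary key -> row (hypothetical table).
users = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Grace", "city": "New York"},
    3: {"name": "Alan", "city": "London"},
}


def build_city_index(table):
    """Secondary index: city -> list of primary keys, in key order."""
    index = defaultdict(list)
    for pk, row in sorted(table.items()):
        index[row["city"]].append(pk)
    return dict(index)


def build_greeting_cache(table):
    """Cache: the result of some 'business logic' precomputed per row."""
    return {pk: f"Hello, {row['name']}!" for pk, row in table.items()}


city_index = build_city_index(users)
greeting_cache = build_greeting_cache(users)

# Losing a derived structure loses no information: rebuild it from the base.
del city_index
city_index = build_city_index(users)
```

A materialized view fits the same shape: substitute a function that runs the query over the base table and stores the result.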

Databases handle indexes beautifully. You write one line of SQL and the database builds the index, maintains it transactionally, and (in PostgreSQL, for example) even lets you CREATE INDEX CONCURRENTLY while writes keep flowing. But application-level caching? A complete mess: brittle invalidation logic, race conditions when two processes update the database in one order and the cache in another, and cold-start nightmares when memcached reboots and every request becomes a cache miss.

Why should these two be so different? They're both derived data. The difference is that the database manages indexes as first-class citizens of its internal architecture, while caches are managed by ad-hoc application code that has to reason about consistency manually. Kleppmann's proposal: treat the transaction log as the source of truth, and derive everything — indexes, caches, materialized views — as continuous transformations of that log.

The Log as First-Class Citizen

Apache Kafka embodies this idea. It's an append-only, distributed, durable commit log that can handle millions of writes per second. You write events to topics, and consumers process those events to build whatever derived views they need. The log is the database; everything else is a cache.

This architecture inverts the usual relationship between the database and the application. In the traditional model, you write to the database and then try to keep your caches consistent. In the log-centric model, you append facts to the log, and downstream processors — materialized views, search indexes, analytics aggregations — consume the stream and maintain themselves. Cache invalidation becomes a non-problem because the cache is just a consumer that rebuilds itself from the log.
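As a rough in-memory sketch of this model (not Kafka's API; `EventLog`, `Consumer`, and the event shape are all assumptions made for illustration), each derived view tracks its own offset into the log and applies events in order. A cold start is just a replay from offset zero.

```python
class EventLog:
    """Append-only list of events; the position in the list is the offset."""

    def __init__(self):
        self._entries = []

    def append(self, event):
        self._entries.append(event)
        return len(self._entries) - 1

    def read_from(self, offset):
        return self._entries[offset:]


class Consumer:
    """Base class: remembers how far it has read and applies each event once."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def catch_up(self):
        for event in self.log.read_from(self.offset):
            self.apply(event)
            self.offset += 1

    def apply(self, event):
        raise NotImplementedError


class ProfileCache(Consumer):
    """Derived view 1: latest display name per user."""

    def __init__(self, log):
        super().__init__(log)
        self.profiles = {}

    def apply(self, event):
        if event["type"] == "profile_set":
            self.profiles[event["user"]] = event["name"]


class NameSearchIndex(Consumer):
    """Derived view 2: lowercased name -> set of users with that name."""

    def __init__(self, log):
        super().__init__(log)
        self.by_name = {}

    def apply(self, event):
        if event["type"] == "profile_set":
            for users in self.by_name.values():
                users.discard(event["user"])  # drop any stale name
            self.by_name.setdefault(event["name"].lower(), set()).add(event["user"])


log = EventLog()
log.append({"type": "profile_set", "user": "u1", "name": "Ada"})
log.append({"type": "profile_set", "user": "u2", "name": "Grace"})

cache, index = ProfileCache(log), NameSearchIndex(log)
cache.catch_up()
index.catch_up()

log.append({"type": "profile_set", "user": "u1", "name": "Ada L."})
cache.catch_up()
index.catch_up()

# Cold start: a brand-new cache replays the log and reaches the same state.
rebuilt = ProfileCache(log)
rebuilt.catch_up()
```

Neither view is ever invalidated by application code. Each one only moves forward through the log, which is why a restarted consumer converges on the same state as one that never went down.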

The idea has deep roots. Event sourcing, where you store a sequence of domain events rather than current state, has been advocated by the DDD (domain-driven design) community for years. Accounting has worked this way forever: the ledger is an append-only log of transactions, and the balance sheet is a derived view. Even version control systems like Git are log-structured: the repository is a DAG of immutable commits, and the working directory is a materialized view.

Discord's Migration: A Case Study

Discord's journey with message storage illustrates both the power and the practical challenges of data architecture at scale.[2] They started with MongoDB, migrated to Cassandra for scalability, and eventually moved to ScyllaDB, with each transition driven by specific pain points.

Their Cassandra cluster grew from 12 nodes storing billions of messages to 177 nodes storing trillions. The problem wasn't the data model but the operational reality: hot partitions (a popular channel flooding a single node), compaction backlogs, JVM garbage collection pauses, and cascading latency. The team spent weekends performing "gossip dances" — taking nodes out of rotation one by one to let them compact without traffic load.

The architectural insight that saved them wasn't just switching databases. They introduced "data services" written in Rust — intermediary services sitting between the API and the database cluster. The killer feature was request coalescing: when multiple users request the same data simultaneously, only one database query fires, and all requesters subscribe to its result. Combined with consistent hash-based routing (all requests for the same channel go to the same service instance), this dramatically reduced database load.
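Request coalescing is easy to sketch. Discord's data services are written in Rust; the version below uses Python's asyncio purely for illustration, and `CoalescingLoader`, `fake_db_fetch`, and the channel name are hypothetical. The idea is to keep one in-flight task per key and let every concurrent caller await that same task.

```python
import asyncio


class CoalescingLoader:
    """Runs at most one fetch per key at a time; concurrent callers share it."""

    def __init__(self, fetch):
        self._fetch = fetch      # async function: key -> value
        self._in_flight = {}     # key -> asyncio.Task currently running

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First requester for this key: start the real query.
            task = asyncio.create_task(self._fetch(key))
            self._in_flight[key] = task
            # Forget the task once it finishes so later requests fetch afresh.
            task.add_done_callback(lambda _t: self._in_flight.pop(key, None))
        # shield() keeps one cancelled waiter from cancelling the shared query.
        return await asyncio.shield(task)


async def demo():
    calls = []

    async def fake_db_fetch(channel_id):
        calls.append(channel_id)   # record how often the "database" is hit
        await asyncio.sleep(0.01)  # simulated query latency
        return f"messages for {channel_id}"

    loader = CoalescingLoader(fake_db_fetch)
    results = await asyncio.gather(*(loader.get("general") for _ in range(100)))
    return results, calls


if __name__ == "__main__":
    results, calls = asyncio.run(demo())
    print(len(results), "responses from", len(calls), "database query")
```

Coalescing only helps if requests for the same key reach the same loader, which is the job of the channel-based routing described above.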

This is the log-structured worldview applied to reads: instead of each request independently hitting the database, you funnel them through a layer that deduplicates and aggregates. The analogy is loose, since a coalesced result lives only as long as the in-flight query, but the data service plays a role similar to a short-lived materialized view, kept current by the stream of incoming requests.

Monotonicity and Eventual Consistency

The log-structured approach connects to a deeper theoretical question: what operations are safe to perform on replicated data without coordination? The answer turns out to hinge on monotonicity: a function is monotone if growing its input (with respect to some ordering) can never shrink its output, so a conclusion drawn from partial information is never invalidated by information that arrives later.[3]
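A small illustration of the distinction, using sets ordered by inclusion and booleans ordered False < True. The function names and the vote example are made up for this sketch.

```python
def has_quorum(votes, needed=2):
    """Monotone: once True, adding more votes can never make it False."""
    return len(votes) >= needed


def nobody_voted(votes):
    """Non-monotone: True on the empty set, but flips to False as input grows."""
    return len(votes) == 0


partial = {"alice", "bob"}    # what one replica has seen so far
full = partial | {"carol"}    # a superset that arrives later

# Safe without coordination: an early True stays True on any superset.
assert has_quorum(partial) and has_quorum(full)

# Unsafe without coordination: an early True can be contradicted later.
assert nobody_voted(set()) and not nobody_voted(partial)
```

A replica can act on `has_quorum` as soon as it returns True, whereas acting on `nobody_voted` requires knowing that no more input is coming, which is exactly what coordination provides.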

CRDTs (Conflict-free Replicated Data Types) exploit this property. In a state-based CRDT, every update moves the state upward in some partial order, and merging two replicas takes their least upper bound, which guarantees that replicas converge without coordination once they have seen the same updates. The simplest example is a grow-only counter: each replica tracks how many increments it has performed, and merging replicas just takes the componentwise maximum. The counter can only go up, so once any replica observes that a threshold has been crossed, no later merge can undo that observation.
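The grow-only counter described above fits in a few lines. This is a minimal state-based sketch; the class and method names are illustrative rather than taken from any particular CRDT library.

```python
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments performed by that replica

    def increment(self, n=1):
        if n < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        """Componentwise maximum: commutative, associative, and idempotent."""
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)       # replica a sees two events
b.increment(3)       # replica b sees three, concurrently

a.merge(b)           # merges can happen in any order...
b.merge(a)
a.merge(b)           # ...and repeating one changes nothing

assert a.value() == b.value() == 5
```

Because each replica only ever writes to its own slot and merge takes the maximum per slot, no sequence of merges can lower any replica's view of the total.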

The Monotonicity Types work at Northeastern tries to bring this reasoning into the type system itself. If you could prove at compile time that a function is monotone, you'd know it's safe to use in a coordination-free distributed computation. This is ambitious — monotonicity is a relational property involving multiple applications of the same function, which traditional type systems aren't designed to verify. But the underlying idea is sound: the log-structured worldview works because append-only data is inherently monotone. You can only add facts, never retract them. (Retraction is modelled as adding a new fact that cancels an old one.)

Tradeoffs

The log-structured approach isn't free. Append-only logs grow without bound, so you need compaction or retention strategies. Rebuilding derived views from scratch can be slow for large datasets (Discord's Rust-based migrator moved trillions of messages at up to 3.2 million per second, and the migration still took days). And the mental model shift is real: developers used to thinking in terms of "update this row" need to learn to think in terms of "append this event."
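One common compaction strategy is key-based: keep only the most recent record for each key. The sketch below assumes a log of `(key, value)` pairs in which a value of `None` acts as a delete marker; the key names and both function names are hypothetical.

```python
def materialize(log):
    """Replay the log into current state; a None value deletes the key."""
    state = {}
    for key, value in log:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


def compact(log):
    """Keep only the latest record per key, dropping keys whose latest is a delete."""
    last_index = {key: i for i, (key, _value) in enumerate(log)}
    return [
        (key, value)
        for i, (key, value) in enumerate(log)
        if last_index[key] == i and value is not None
    ]


log = [
    ("cart:123", 1),
    ("cart:123", 3),
    ("cart:456", 2),
    ("cart:123", None),  # customer 123 emptied their cart
    ("cart:456", 5),
]

compacted = compact(log)

# Compaction shrinks the log but yields the same current state.
assert materialize(compacted) == materialize(log)
```

The cost is history: after compaction the log can still rebuild the current state, but it no longer records that cart 123 ever held anything. Whether that is acceptable depends on which derived views need the full sequence of facts.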

There's also a tension between log-structured thinking and Substructural Type Systems like Rust's ownership model. In a world of immutable facts, garbage collection is natural: you discard facts older than your retention window. In Rust's world of owned values and borrowing, immutability is a compile-time property of references to in-memory values, not a property of the stored data. The two worldviews complement each other but don't always compose easily.

Still, the core insight is powerful: if your source of truth is an ordered sequence of immutable facts, and everything else is derived from that sequence by deterministic transformations, then you've turned a distributed systems problem into a data pipeline problem. And data pipelines, while not simple, are much easier to reason about than distributed mutable state.

Footnotes

  1. Turning the database inside-out with Apache Samza by Martin Kleppmann

  2. How Discord Stores Trillions of Messages by Discord Engineering

  3. Monotonicity Types: Towards A Type System for Eventual Consistency by Kevin Clancy
