Vision Evolution

What if you could rerun evolution? Not biological evolution — that would take a few billion years and a spare planet — but a computational version, where embodied agents evolve their own visual systems in simulated environments. Kushagra Tiwary and colleagues at MIT built exactly this. Their framework places agents in single-player survival games where they must evolve eyes and learn behaviors from scratch. No pre-designed cameras. No fixed sensor arrays. Just a single photoreceptor and a single neuron, looking out into the world, with mutation and selection doing the rest.¹

The results are remarkable for how closely they recapitulate real evolutionary history.

Task Shapes the Eye

The most striking finding is that the kind of visual task determines what kind of eye evolves. When agents need to navigate a maze, a task that demands orientation, obstacle avoidance, and spatial awareness, they evolve distributed compound-type eyes: many small eyes spread around the body, each with low resolution but together providing nearly 360-degree vision. By generation 50, a typical navigation agent has 10 individual eyes with 16 photoreceptors each, spread around its body.¹

When agents instead need to discriminate between objects — identifying food versus poison based on texture alone (colors are invisible to them) — they evolve something completely different: two forward-facing, high-resolution camera-type eyes, each with 225 photoreceptors arranged in a 15x15 grid. The compound-eye agents sacrifice acuity for spatial awareness. The camera-eye agents sacrifice peripheral vision for frontal resolution. Neither was designed. Both emerged from the pressure to survive.¹

This computational result maps directly onto what Dan-Eric Nilsson at Lund University — a collaborator on the paper — has been arguing for decades about biological eye evolution. Eyes didn't evolve from poor to perfect. They evolved from performing a few simple tasks perfectly to performing many complex tasks excellently. A sea star's eyes can't see color or fast-moving objects, but they spot coral reefs beautifully. A mantis shrimp has 12 types of color receptors but is worse at discriminating similar colors than we are. Each eye is tuned not to some abstract optimum but to the specific problems its owner faces. The simulation confirms this: the "best" eye depends entirely on what you need to see, and evolution finds the appropriate design without any guidance.¹

This connects to convergent evolution: Luitfried von Salvini-Plawen and Ernst Mayr estimated that eyes originated independently 40 to 65 times in nature. The simulation helps explain why: compound and camera eyes emerge naturally as distinct solutions to distinct classes of problems. The convergence isn't coincidence. It's the physics of information extraction from light, rediscovered by every lineage that faced similar selective pressures.

The Lens as Innovation

Here's where the simulation gets really clever. The researchers ran a two-phase experiment. In the first phase, agents could evolve everything about their eyes except optical elements — no lenses. The agents could only adjust their pupil size. Over generations, they followed the same trajectory that biological evolution took: starting with wide-open apertures for maximum light, then progressively narrowing to cup-shaped and near-pinhole designs to sharpen the image. But this hit a ceiling. Pinhole eyes give sharp images but admit very little light. At some point, making the aperture smaller hurts more than it helps.¹

At generation 30, the researchers intervened: they allowed mutations that could add optical elements capable of bending light. The initial random lenses were terrible — performance actually dropped. But by generation 50, the agents had evolved slightly convex shapes that showed early focusing behavior. Eventually they developed well-defined lenses with symmetric point spread functions, paired with larger pupils. The lens broke the pinhole trade-off, delivering both sharp vision and high light throughput. Performance jumped past the ceiling that had stopped the lensless agents.¹
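To make the trade-off concrete, here is a toy model, not the paper's optical simulation. It assumes that blur in a lensless eye grows linearly with aperture diameter, that photon noise falls as one over the diameter (light scales with aperture area, shot noise with the square root of light), and that a lens pins blur at a small constant. The function names and the constants `noise_k` and `lens_blur` are illustrative choices of mine, in arbitrary units.

```python
def pinhole_error(d, noise_k=0.04):
    """Toy image error for a lensless eye with aperture diameter d."""
    blur = d              # assumed: geometric blur spot grows with the aperture
    noise = noise_k / d   # assumed: shot noise ~ 1/sqrt(photons), photons ~ d**2
    return blur + noise


def lens_error(d, lens_blur=0.02, noise_k=0.04):
    """Toy image error once a lens decouples sharpness from aperture size."""
    blur = lens_blur      # assumed: focusing holds blur at a small constant
    noise = noise_k / d   # light still scales with aperture area
    return blur + noise


# Candidate aperture diameters from 0.05 to 1.0 in arbitrary units.
apertures = [0.05 * i for i in range(1, 21)]

best_pinhole = min(apertures, key=pinhole_error)
best_lens = min(apertures, key=lens_error)

print(f"lensless: best aperture {best_pinhole:.2f}, error {pinhole_error(best_pinhole):.3f}")
print(f"lensed:   best aperture {best_lens:.2f}, error {lens_error(best_lens):.3f}")
```

In this sketch the lensless eye settles on an intermediate aperture, because shrinking it further costs more in noise than it gains in sharpness. The lensed eye does best with the widest pupil on offer, which echoes the pairing of lenses with larger pupils.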

This is a beautiful result because it captures the logic of a major evolutionary innovation. Lenses aren't just an improvement on pinhole eyes — they resolve a fundamental physical trade-off that pinhole eyes can't escape. The simulation demonstrates this by showing the performance plateau in lensless eyes and the breakthrough when optics become available. It's the computational equivalent of watching the Cambrian explosion in miniature: a single innovation (bending light) unlocking an entirely new design space.

Scaling Laws: Eyes Need Brains

The final experiment explored the relationship between sensory acuity and neural processing power. By varying both eye resolution and brain size, the researchers found power-law scaling relationships between task performance and neural capacity — but only when visual acuity also increased. A bigger brain with a poor eye hits a performance ceiling that no amount of additional neurons can overcome. The information bottleneck is at the sensor, not the processor.¹
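A minimal sketch of what a sensor-side ceiling looks like, assuming a made-up functional form rather than the relationship fitted in the paper: task error is set by whichever resource is scarcer, with each resource contributing its own power-law term. The function `task_error` and the exponents `alpha` and `beta` are hypothetical.

```python
def task_error(neurons, photoreceptors, alpha=0.5, beta=0.5):
    """Toy error: whichever resource is scarcer sets the floor."""
    brain_limited = neurons ** -alpha          # falls as the brain grows
    sensor_limited = photoreceptors ** -beta   # falls as the eye sharpens
    return max(brain_limited, sensor_limited)


brain_sizes = [4, 16, 64, 256, 1024]

# Fixed, low-acuity eye: extra neurons stop helping once the sensor is the bottleneck.
fixed_eye = [task_error(n, photoreceptors=16) for n in brain_sizes]

# Eye and brain scaled together: error keeps falling as a power law.
co_scaled = [task_error(n, photoreceptors=n) for n in brain_sizes]

for n, e_fixed, e_co in zip(brain_sizes, fixed_eye, co_scaled):
    print(f"neurons={n:5d}  fixed eye: {e_fixed:.4f}  co-scaled: {e_co:.4f}")
```

With the eye held at 16 photoreceptors, the error stops improving once the brain passes 16 neurons. When both grow together, each fourfold increase halves the error under these assumed exponents.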

This mirrors what happens in biological evolution. As eyes become more complex, brains become larger — not just because more visual processing is needed, but because better vision reveals more of the world and creates more problems worth solving. The co-evolution of eyes and brains is not a coincidence but a scaling law: sensory and computational resources must advance together or neither can realize its potential.¹

The researchers draw an analogy to scaling laws in artificial intelligence, where model performance is bounded by the interplay of parameters, compute, and data quality. Their result adds a dimension that AI scaling research usually ignores: the quality of the sensor matters as much as the power of the processor. "Increasing model size cannot overcome fundamental sensing limitations." This is an insight that should give pause to anyone who thinks scaling neural networks alone will solve embodied AI. If evolution is any guide — and the simulation suggests it is — you also need to scale the eyes.

Why Simulate Evolution?

I should be honest about the limitations. These agents live in simple game environments, not the richly textured world that real organisms evolved in. The genetic encoding, while cleverly designed with separate gene clusters for optics, morphology, and neural architecture, vastly simplifies the regulatory complexity of real developmental genetics. And the evolutionary timescales (50-100 generations) are a blink compared to the hundreds of millions of years over which real eyes evolved.

But the value isn't in replicating evolution exactly. It's in testing causal hypotheses that you can't test any other way. You can't go back to the Cambrian and tell trilobites "try evolving without lenses and see what happens." You can't rerun vertebrate eye evolution with a different starting point. Simulation lets you isolate variables: what if there were no lenses? What if brains stayed small? What if the only task were navigation? Each experiment answers a "what if" question that biological evolution answered only once, in a specific context, confounded by a million other variables.
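Here is a rough sketch of how such "what if" questions can become switches on a mutation-and-selection loop. Everything in it is hypothetical: the `Genome` fields only gesture at the optics, morphology, and neural gene clusters, `Conditions` is my own name for the experimental knobs, and `toy_fitness` is a placeholder score rather than the survival games the agents actually play.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Genome:
    # Hypothetical stand-ins for the three gene clusters.
    n_eyes: int = 1           # morphology
    photoreceptors: int = 1   # morphology (per eye)
    lens_power: float = 0.0   # optics
    neurons: int = 1          # neural architecture


@dataclass(frozen=True)
class Conditions:
    allow_lens: bool = True   # "what if there were no lenses?"
    max_neurons: int = 1024   # "what if brains stayed small?"


def mutate(g, cond, rng):
    """Change one randomly chosen gene, respecting the experimental constraints."""
    gene = rng.choice(["n_eyes", "photoreceptors", "lens_power", "neurons"])
    if gene == "lens_power":
        if not cond.allow_lens:
            return g  # optics mutations are switched off in this condition
        return replace(g, lens_power=max(0.0, g.lens_power + rng.gauss(0, 0.1)))
    value = max(1, getattr(g, gene) + rng.choice([-1, 1]))
    if gene == "neurons":
        value = min(value, cond.max_neurons)
    return replace(g, **{gene: value})


def toy_fitness(g):
    """Placeholder score, not the paper's task reward."""
    acuity = g.n_eyes * g.photoreceptors * (1.0 + g.lens_power)
    return min(acuity, g.neurons)  # crude coupling between eye and brain


def evolve(cond, generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [Genome() for _ in range(pop_size)]  # one photoreceptor, one neuron
    for _ in range(generations):
        # Truncation selection: the fitter half survives and reproduces.
        population.sort(key=toy_fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), cond, rng) for _ in survivors]
        population = survivors + children
    return max(population, key=toy_fitness)


print(evolve(Conditions()))                  # everything allowed
print(evolve(Conditions(allow_lens=False)))  # lenses never become available
print(evolve(Conditions(max_neurons=2)))     # brains stay small
```

Each condition reruns the same process with one variable held fixed, which is the kind of controlled comparison biology never offers.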

The deeper ambition is to use these simulations as hypothesis-testing machines for vision science and, eventually, as a source of bio-inspired engineering. If computational evolution can independently discover compound eyes, camera eyes, and focusing lenses, solutions that took nature hundreds of millions of years to find, then it might also discover visual architectures that nature hasn't tried. The tree of artificial life might start from a single photoreceptor and a single neuron looking out into the world, and branch into forms we haven't imagined.

¹ Kushagra Tiwary et al., "What if Eye...? Computationally Recreating Vision Evolution."
