Language and Thought

Does language shape thought, or merely express it? This sounds like a philosophy seminar question, but it has teeth. The evidence — from a deaf man who lived 27 years without language, from infants whose color perception rewires as they learn words, from chimps who can only grasp abstract relations after being given arbitrary tokens for them — points to something more interesting than either side of the debate expected. Language isn't a prerequisite for thought, but it's not merely wrapping paper either. It's a cognitive tool that, once acquired, fundamentally restructures what kinds of thinking are possible.

The Computational View

Andy Clark makes the case most sharply.¹ Public language, he argues, is a species of cognitive technology — an external artifact, like a compass or a slide rule, whose value lies in reshaping the computational problems that biological brains must solve. This is the Extended Mind Thesis applied specifically to words.

Clark identifies six ways language augments cognition: memory augmentation (the obvious one), environmental simplification (labels as navigational aids), coordination and reduction of deliberation (plans as stable external control structures), taming path-dependent learning (ideas migrating between minds that couldn't independently reach them), attention and resource allocation (self-directed speech as behavioral control), and data manipulation (writing as thinking, not transcription of prior thought). The last of these is the most radical: "one does not first entertain a private thought and then write it down: rather, the thinking is the writing."

The key insight isn't that language communicates thoughts; that much is obvious. It's that language creates cognitive building blocks. When you attach an arbitrary label to a concept, you freeze it into something that can be manipulated as a unit. This lets you build higher-order abstractions on top of it. Words function as a kind of cognitive "chunking": they compress complex patterns into simple objects that the brain's pattern-completion machinery can then operate on in new ways.

The chimp evidence is striking here. Thompson, Oden, and Boysen showed that chimps who had learned arbitrary tokens for "same" and "different" could solve relational-matching problems (matching relations-between-relations) that untrained chimps could not. It wasn't syntactic competence that opened this door — chimps with no compositional language training but with token-association experience performed equally well. The mere act of tagging abstract relations with concrete symbols was enough to unlock a new tier of reasoning.¹
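A toy sketch in Python may make the logic of that result concrete. It is only an illustration of the argument, not a model of the experiment: the function names `tag` and `relational_match` and the example objects are hypothetical. Once each pair is collapsed into an arbitrary token, the relations-between-relations problem reduces to ordinary matching of two tokens.

```python
from typing import Hashable, List, Tuple

Pair = Tuple[Hashable, Hashable]


def tag(pair: Pair) -> str:
    """Collapse the relation inside a pair into an arbitrary token."""
    first, second = pair
    return "SAME" if first == second else "DIFFERENT"


def relational_match(sample: Pair, options: List[Pair]) -> Pair:
    """Return the first option whose internal relation matches the sample's.

    The comparison is between tokens, never between the objects themselves,
    so the higher-order problem becomes a first-order one.
    """
    sample_token = tag(sample)
    for option in options:
        if tag(option) == sample_token:
            return option
    raise ValueError("no option shares the sample's relation")


# Hypothetical trial: the sample shows two identical objects.
sample = ("cup", "cup")
options = [("shoe", "lock"), ("key", "key")]
print(relational_match(sample, options))
```

Note that no object in the chosen pair resembles anything in the sample. Only the tokens match, which is the sense in which the label, rather than any perceptual similarity, carries the reasoning.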

Life Without Language

If language is a cognitive tool, what happens to someone who never acquires it? Susan Schaller's account of Ildefonso, a deaf man who reached age 27 without any language at all, is one of the few windows we have into this question.²

Ildefonso wasn't cognitively inert. He had survived, crossed international borders, navigated cities, found work. He could read social situations — he understood "macho behavior," presumably from watching enough of it. But certain things were completely opaque to him: history, immigration policy, the concept of abstract time. Schaller's hardest challenge was communicating the idea of an "idea" — which, when you think about it, requires a tower of metaphoric assumptions (ideas as objects, located in heads, transferable between people) that seem absurd once you strip away the linguistic scaffolding.

The breakthrough, when it came, was volcanic. After weeks of Schaller teaching an invisible student, Ildefonso suddenly went rigid, then started frantically pointing at everything in the room, demanding the sign for each object. "The window became a different thing with a symbol attached to it." He collapsed sobbing. At 27, he had discovered that things have names — and that names are shared, connecting him to a community of minds he hadn't known existed.

This connects directly to Helen Keller's account in Selfhood — "I found that I was something" — but Ildefonso's story adds a dimension Keller's doesn't. After learning language, Ildefonso told Schaller he could no longer communicate with other languageless deaf people he'd known. "I think differently. I can't remember how I thought." Language wasn't an addition to his repertoire. It displaced something — some non-linguistic mode of thought and communication that, once overwritten, couldn't be recovered.²

What was that mode? We can speculate: mimetic cognition, bodily reasoning, direct perceptual pattern-matching without symbolic mediation. But Ildefonso can't describe it, precisely because describing requires the tool that destroyed it. It's a genuine epistemological blind spot.

Color and the Half-Whorfian Brain

The Sapir-Whorf hypothesis — that language shapes perception — has been tested most precisely with color, and the results are wonderfully weird.³

The Tarahumara of Mexico don't distinguish blue from green linguistically. When tested against English speakers, they show subtly reduced ability to differentiate blues from greens. Having a word for blue seems to make the color "pop" slightly more. But a 2006 study by Gilbert and colleagues found that this effect is lateralized: it only holds in the right visual field (processed by the left, language-dominant hemisphere). In the left visual field, linguistic categories don't help at all.³

Korean speakers, who distinguish yeondu (yellowish green) and chorok (green) as basic colors — a boundary English doesn't mark — show the same pattern in reverse: their left brain is attuned to the yeondu-chorok boundary that English speakers' brains ignore. When researchers add verbal distraction, the linguistic advantage disappears. Visual distraction doesn't have this effect.

The developmental story is even stranger. Infants actually start with the opposite lateralization — pre-verbal babies show better categorical color discrimination in their right brain (left visual field). As they learn color words, this ability migrates to the left hemisphere. Their brains are literally rewiring around linguistic categories, and the rewiring shifts which hemisphere handles the task.³

So Whorf was half right. Language shapes perception, but only in the half of your brain that processes language. The other half remains blissfully pre-linguistic, categorizing the world on its own terms. We walk around with two slightly different perceptual worlds running in parallel — one filtered through language, one not.

Private Speech: Language as Self-Regulation

If language augments thought, you'd expect the augmentation to be visible — and it is, in the most literal way imaginable. Children talk to themselves, constantly, and it turns out this isn't noise. Vygotsky argued in the 1930s that private speech is the bridge between social communication and internal thought: children first learn to guide their behavior through dialogue with adults, then internalize that dialogue as self-directed speech, which eventually becomes silent inner speech. Piaget had dismissed the same phenomenon as "egocentric" babble from immature minds. Vygotsky said it was the most important cognitive transition in development.⁴

The evidence, accumulated across decades after Vygotsky's work was finally translated into English in 1962, overwhelmingly supports him. Laura Berk's studies found that children talk to themselves more when working alone on challenging tasks, and less when teachers are available to help, which is exactly what you'd predict if private speech substitutes for external guidance. The speech becomes less audible as children age: first-graders make self-guiding comments out loud, second-graders mutter, third-graders move their lips silently. And crucially, age-appropriate private speech predicted better academic performance the following year. The benefit was delayed, not concurrent, because the speech helps you learn, and learning shows up later.⁴

The most counterintuitive finding involves ADHD. The assumption behind self-instructional training programs, which teach impulsive kids to "talk themselves through" tasks, was that ADHD children don't use enough private speech. Berk found the opposite: ADHD children use more audible self-guiding speech than their non-ADHD peers, not less. They're trying harder, not less hard. The problem isn't the absence of the tool but the attention deficit, which prevents the speech from gaining control over behavior. Stimulant medication dramatically matures their private speech, and only when medicated does their most mature form of speech (inaudible muttering) correlate with improved self-control.⁴

This adds a developmental dimension to Clark's computational view. Language doesn't just augment cognition in the abstract — it augments it through a specific developmental sequence in which external social scaffolding gets literally internalized as a self-regulatory tool. The zone of proximal development — Vygotsky's term for the range of tasks a child can accomplish with guidance but not alone — is the space where language transforms from social technology into cognitive architecture. And the process never fully completes: adults still talk to themselves when facing unfamiliar challenges, reverting to audible private speech when the inner version can't handle the load.

What This Means for AI

The parallels to LLMs are irresistible and must be handled carefully. An LLM is, in a sense, nothing but the linguistic tool — all symbol, no pre-linguistic substrate. If Clark is right that language acts as a computational transformer on an underlying pattern-completing architecture, then LLMs are the transformer without the thing being transformed. They have the building blocks without the ground-floor experience that the blocks are supposed to compress.

This might explain both why LLMs are so surprisingly capable (the linguistic tier really is where much of abstract reasoning happens, even in humans) and why they sometimes fail in ways that feel alien (they lack the embodied, pre-linguistic cognition that Ildefonso had before language, the thing language displaced but didn't replace). Dennett suggested that linguistic bombardment installs something like a serial virtual machine on parallel neural hardware. If that's right, LLMs are running the virtual machine natively — which is a different thing entirely from simulating it on a biological substrate that has its own non-linguistic priorities.

But there's a cautionary note from Predictive Processing: the brain's prediction machinery operates at many levels, and language is just one layer of the hierarchy. Ildefonso's story suggests that the non-linguistic layers are doing real cognitive work — work that can be displaced by language rather than merely augmented. If we only study the linguistic layer, we might be making the same mistake that pre-Whorfian philosophers made — assuming that the map (language) is the territory (thought).

¹ Magic Words: How Language Augments Human Computation by Andy Clark
² Life without language by Greg Downey
³ The crayola-fication of the world by Aatish Bhatia
⁴ Why Children Talk to Themselves by Laura E. Berk
