Illustration by Thomas Porostocky
Language is an innate faculty, rather than a learned behavior. This idea was the primary insight of the Chomskyan revolution that helped found the field of modern linguistics in the late 1950s, and its implications are both simple and profound. If innate, language must be genetic. It is hardwired within us from conception and evolved from structures and genes with analogues existing throughout the animal kingdom. In a sense, language is universal. Yet we humans are the only species with the ability for what may rightly be called language and, moreover, we have specific linguistic behaviors that seem to have appeared only within the past 200,000 years—an eye-blink of evolution.
Why are humans the only species to have suddenly hit upon the remarkable possibilities of language? If speech is a product of our DNA, then surely other species also have some of the same genes required for language because of our basic, shared biochemistry. One of our closest relatives should have developed something that is akin to language, or another species should have happened upon its attendant advantages through parallel evolution.
A quasi-paradox has persisted within the field of linguistics, because the sudden emergence of such a complex, limitless system in a single species is hard to rationalize in terms of standard evolution. Its rapid spread makes language seem more like a viral epidemic that swept through the human population than a trait inherited through the typical dynamics of evolution.
Luckily, two recent advances have made it possible to rigorously address the problem of language’s evolution for the first time. Molecular biology (including the publication of the human genome) and the so-called evo-devo paradigm now permit us to establish new and often quite unexpected connections among very different species. In addition, linguists’ understanding of syntax—how words are strung together into grammatical sentences—has developed to the point where language can be broken down into its basic procedural components. These components can now be seen to resemble traits observed in other species—with functions that appear to be completely unrelated to familiar thought processes. Language may indeed be unique to humans, but the processes that underlie it are not.
What we are beginning to see is that a set of disparate cognitive traits lends credence to the idea that language is genetic and arose suddenly. Knot-tying, dancing, and typing, for instance, are all part of the unique equation that gave rise to language. But the genetic underpinnings of speech, and the machinations of its evolution, are best found in its analogues in the nonspeaking animal kingdom. Our closest relative, the chimp, would be the most natural species to look to first, but it can teach us only part of what we need to know. There seems to be a better set of species that can tell us the complete story of human language’s evolution: songbirds.
To appreciate why songbirds such as finches, which diverged from mammals over 300 million years ago, will help to elucidate the nature of language, we must first look at the story that brought genetics and language together. In the 1990s, a family was discovered in London with a certain impairment that appeared to be specific to language. Afflicted members had no discernible decrease in intelligence, but there was definitely something askew with their ability to speak. Genetic studies on the family led to the discovery of a single affected gene responsible for the impairment, referred to as FOXP2, which was soon popularly baptized “the language gene.”
We now know, however, that FOXP2 is just one of several regulating factors—genes that turn other genes on and off—and that it is involved in many processes that have nothing obvious to do with language, such as the formation of the gut. Even as it relates to the brain, FOXP2 appears to help regulate many activities that may or may not be strictly linguistic, such as the ordered muscle commands you need to be able to tap your fingers. Most damning for its role as the language gene, variants of FOXP2 have been discovered in just about every other organism all the way down to yeast. In fact, the protein it regulates is remarkably unchanged when compared across all species, and yet no other animal has co-evolved language.
To some researchers this demonstrates that language is not really innate at all. The fact that genes we share with yeast are directly implicated in language shows that we are using very general cognitive mechanisms to learn language, not genes in any way specific to it. FOXP2 could merely be regulating sequential movements. A mutation in the gene might create a minor deficiency in the control of facial muscles, which in turn leads to, basically, slurred speech. Yet these same facts can be used to show FOXP2’s primary role in human language.
FOXP2 evolved to its present form roughly 200,000 years ago, and perhaps as recently as 120,000 years ago. The era in which this change transpired was a very interesting one: It is when the explosion of modern human behaviors appears in the fossil record, and it is the likely time of the last migration out of Africa.
Is it possible that this mutation simply produced better-sounding, less-slurred speech? It seems unlikely. What show up in the archeological record around this time are drastically different human behaviors. For instance, Marta Camps and I have shown that there is no evidence for the ability to tie and untie knots prior to the FOXP2 mutation. No other animals besides humans can tie or untie knots, and based on its computational complexity, it seems likely that this ability is a parasite, piggybacking on the mechanics of language. It was knots which then gave us the means to produce footwear, arrows, jewelry, and other similar objects not observed prior to this time. The mutation in FOXP2, therefore, came in concert with some deep advances in cognitive abilities that are unique to humans.
Another reason for believing in the linguistic import of the FOXP2 gene is that its observable phenotype has been discovered to go beyond just clarity in speech. For example, affected members of the London family have difficulties processing embedded sentences, such as relative clauses, and cannot communicate number expression or tense in verbs. These are not just slips of the tongue. Modern syntax holds that these processes are at the very core of the language faculty.
But for a gene underlying such uniquely human behavior, FOXP2 also has some rather pedestrian tasks. It regulates aspects of embryogenesis related to the heart, lung, and brain in scores of species. Yet none of these roles appear to affect humans’ ability to talk, nor does our ability to speak seem correlated with our hearts’ development. We need to understand why two perfect copies of the gene do whatever work they do for language, and whether that role can somehow emerge from the interaction with more basic structures. This is key to what language is. Luckily, we can look to other species to find it.
Finches have long been the subjects of behavior studies because they exhibit a form of acquisition related to vocal learning. Young male finches acquire the songs they will use later to woo females by imitating other males during a critical period of a few weeks after hatching. Indeed, brain studies have shown the finch’s brain has two important circuits for singing, one for acquisition and another for performance. Interestingly, FoxP2 (lowercase for other animals’ variants) is expressed for both circuits in a brain region called Area X while the finches are first acquiring their songs and also when they later begin to sing.
Finches do not just imitate one another; they are creative, and compose parts for their own songs. Again, this is not speech in any sense of the term, and to the best of our knowledge these songs have no patterns of meaning. This behavior does however share some abstract properties with language. Their songs reflect a type of grammar with so-called trills and flourishes. Could the fact that FoxP2 regulates it be merely accidental?
A very recent study suggests that it is not. The mRNA of the finch’s FoxP2 down-regulates in Area X as males sing to themselves; that is, it turns down as the males are practicing variants of their song. But when the males begin singing to females the FoxP2 slightly up-regulates. This is an extraordinary result, because the same kind of motor control is obviously at work in both cases, yet FoxP2 is regulated differently. FoxP2 therefore cannot just mean motor control. If, as I expect, non-singing females similarly have FoxP2 regulation in Area X as they listen to the male’s song, then a decisive blow would be given to the theory that the gene is only involved in motor control. This would mean that the females are essentially processing the outputs of the males’ grammar. It would also be the indirect indication we need that FOXP2 is central to language.
The connection between humans and songbirds goes even deeper than all this. The finch’s FoxP2 differs from the human’s in only eight out of 200,000 positions, and the brain circuit that operates during birdsong is functionally equivalent to one of the subcortical brain circuits involved in human language. The reason the birds do not exhibit language, then, is probably that their brains just lack much of the outer cortex that we have.
Human language is known to reside within Broca’s and Wernicke’s areas of the left hemisphere in the outer cortex, but parts of the inner brain are also involved in language performance and acquisition. In the human brain, the region that is functionally equivalent to the bird’s Area X is the caudate nucleus, found deep within the basal ganglia. This domain appears to be critically involved in what may be thought of as operational memory, of the sort you need to rapidly remember sequences such as when your fingers touch a keyboard to play a song or type out a sentence. It also regulates how you tie your shoelaces and when you dance or clap in rhythm.
I have put forward a theory, with Massimo Piattelli-Palmarini, that the caudate nucleus is key to understanding syntax in the brain. This area should be involved in what linguists call parsing: the integrated processes that allow you to reconstruct complex sentences as you hear or see them, to produce them, or to acquire the fundamental parameters of your language as you first experience language as a baby. Because of the similarities in brain structure and in the syntax of their song, finches must also have this parser. They may not be using it to process complex thoughts because they lack the cortex to generate them, but their songs are certainly intricate enough, and reconstructing their subtle structure from a linear sequence of notes is a remarkable computational task. If songbirds share with us the parser necessary for language, their brains hold an experimental key to understanding how we speak.
Language is fundamentally genetic and its essential structures can be found in other species, like songbirds. What then of our nearest relatives with relatively large cortices, such as chimps? Why can’t they also help us to understand language this way? Chimps, and our other close relatives the apes, certainly have the hardware for some basic forms of meaning, but all indications are that Neanderthals also had meaningful thoughts, enough to bury their dead or control fire, without much of a language. What they don’t have is a way to externalize their thoughts. I’d wager that chimps just lack the parser that FoxP2 regulates. Somehow humans, by contrast, were able to recruit an ancient gene with a relatively ancient function to help us squeeze our thoughts out into the airwaves, much as a finch does with his. We are just thinking apes, with a finch’s ability to sing.
This idea should be relatively easy for us to test in a number of possible experiments. The publication of the Neanderthal genome should tell us just how different their FoxP2 gene really is from our own. As interest in songbirds develops, genetic knockout technology will also allow us to know exactly what happens to a finch’s brain when FoxP2 is affected. We will learn whether or not it does regulate the parser, and even more concretely, how it affects operational memory. Using in-vitro experiments with FOXP2, we could further determine what happens to a chimp’s cells when this bit of human DNA is inserted, thereby effectively modeling—at the cellular level—what must have been the evolutionary path of the FOXP2 protein in our forebears. Within a decade or two, these tasks could finally explain how we speak, at the genetic level.
Eventually non-invasive technologies will emerge that should allow us to see the exact role of genes in language performance in humans themselves. There might be a few rather humbling surprises awaiting us then. There may not be anything special about our seemingly unique access to language, and once a species has it, all of the resulting cognitive and cultural benefits might ultimately accompany it. In the meantime, we can be confident of the fundamental role of FOXP2 in human language. Nothing in the picture I have laid out leads us to doubt that somewhere down the line we will figure out the full details of language in our genes, as Chomsky once predicted. What he couldn’t have predicted is that the correct answer would be an equation as surprising as Chimp + Finch = Human.
Originally published September 25, 2007