
Biolinguistics
Linguistics as a branch of biology and mathematics with some surprising treasures
Speech and language are the most complex phenomena in biology. Yes, biology. That is why many of those working in this area refer to it as biolinguistics. If human language were entirely the product of human culture, as argued by Mike Tomasello (2003, 2010), or if it were just a product of the need or desire to communicate, it would be expected to vary in all respects other than those which serve some given purpose. But it doesn’t.
All languages allow sentences to get longer and longer, with no point at which this becomes impossible. So we can say, “She’s lying.” Or “You know she’s lying.” Or “I think you know she’s lying.” And so on. The child may hear only one step, as in “You know she’s lying.” But without needing to be told, learners somehow know that any number of steps is allowed; otherwise they would not understand any number of steps greater than what they happen to have heard. This is known as ‘discrete infinity’: discrete because the structure is built from a small, finite number of elements, the sounds of the language; infinity because the number of possible sentences is infinite.
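The point can be illustrated with a few lines of code. This is a minimal sketch, not anything from the linguistic literature: a single recursive rule, applied to a finite stock of clause frames (the frames and the function name `embed` are my own illustrative choices), licenses unboundedly many sentences, however few examples a learner has actually heard.

```python
def embed(depth: int) -> str:
    """Wrap the core clause in `depth` layers of embedding."""
    sentence = "she's lying"
    frames = ["you know that", "I think that"]  # finite stock of frames
    for i in range(depth):
        # one rule, reapplied: prefix another embedding frame
        sentence = f"{frames[i % len(frames)]} {sentence}"
    return sentence

print(embed(0))  # she's lying
print(embed(1))  # you know that she's lying
print(embed(2))  # I think that you know that she's lying
```

Nothing in the rule itself imposes a ceiling on `depth`; the boundedness of real speech comes from memory and time, not from the grammar.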
In 1965 Noam Chomsky proposed that the simplest explanation of discrete infinity is that there are underlying principles for language that we are born with, principles that are, in a species-specific way, encoded in the human genome. These principles are necessarily simple and abstract. The richness and complexity of human language arise from the way these principles interact. By this reasoning, linguistic structures are ‘generated’, or built, rather than strung together as sequences of words.
Because all languages have this property of discrete infinity, the highly questionable and, I believe, quite mistaken claims of Daniel Everett (2009, 2018, 2023) notwithstanding, the simplest explanation is that the commonality is a property of the human species.
To a large degree, the development from the Transformational Generative Grammar, or TGG, of 1965 to the biolinguistics of today is due to Chomsky. The label TGG is no longer appropriate: the role of transformations has now diminished or been eliminated entirely. But WHY does language work the way it does? As in 1965, the simplest explanation is that, at least to some degree, the Faculty of Language is specified by the genome, as an aspect of biology rather than psychology. This view, surprising to some, is hardly surprising to those familiar with the clear evidence of genetic and hereditary factors in speech and language disorders. But how can this be if meanings are products of the mind? How do biology and mathematics come into it? By the proposal here, a series of steps was involved, going back to a very early point in human history: the point at which the first words entered what would become the distinctively human lexicon, approximately a thousand times more sensitive, and more capable of expansion throughout life, than any non-human equivalent. This became part of the normal human inheritance, necessarily expressed biologically. But in order for the linguistics to be expressible biologically, there has to be a mathematical foundation, as argued in the most recent work by Matilde Marcolli, Noam Chomsky and Robert Berwick (2023).
By the proposal here, there was an event in human evolution by which a particular algebra became accessible to cognition. This algebra was, and is, what is now known in computer science, information theory, critical path analysis, and the framework here as the algebra of rooted, planar, binary-branching trees. The simplest possible statement of this algebra must have been primordial: this must be how some human ancestor or ancestors started to differentiate themselves from others sharing a common ancestry with modern chimpanzees. And it must have been statable in a form which could be read by the biology, so that it could become part of the human genome. By the proposal here, this algebra developed in seven steps, each building on all the previous steps, each useful for thought and communication, but only transmissible by virtue of the mathematical basis. Crucially, the algebra is recursive, allowing the grammar to generate the infinite set of structures we know as language.
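The algebra of rooted, planar, binary-branching trees can be sketched in a few lines. This is my own minimal illustration, not the formalism of Marcolli, Chomsky and Berwick: a single binary operation, here called `merge` (the name echoes the Minimalist operation and is my assumption), pairs any two objects, words or previously built trees, into a new tree. Because the output can feed back in as input, the operation is recursive, and a finite lexicon yields an infinite set of structures.

```python
from dataclasses import dataclass
from typing import List, Union

# A tree is either a word (a leaf) or a node with two ordered daughters.
Tree = Union[str, "Node"]

@dataclass(frozen=True)
class Node:
    left: Tree
    right: Tree

def merge(a: Tree, b: Tree) -> Node:
    """Combine two trees (or words) into a new binary-branching tree."""
    return Node(a, b)

def leaves(t: Tree) -> List[str]:
    """Read the terminal string back off the tree, left to right."""
    if isinstance(t, str):
        return [t]
    return leaves(t.left) + leaves(t.right)

# [a [clever person]]: one application of merge nested inside another.
phrase = merge("a", merge("clever", "person"))
print(leaves(phrase))  # ['a', 'clever', 'person']
```

Note that the hierarchical structure is fixed while the left-to-right reading is a separate, superficial step, which is one way of picturing how French or Welsh could linearize the same underlying structure differently from English.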
Intuitively, when we look for a word or want to say something, the structures come to mind more or less instantaneously and reliably, often with subtle shades of meaning. Or so it seems. But ChatGPT notwithstanding, there is a substantial task here of finding appropriate words and parts of words and assembling them in a particular order at the tempo of everyday speech, language and conscious thought. There is a large task of searching the lexicon. This drives the evolution of the framework here. This process of retrieval and assembly is universal across human languages, despite the obvious differences between them. Thus in English we say ‘a clever person’. And in German and Dutch, the equivalent words are sequenced the same way. But in French, Italian, and Welsh, part of the ordering is reversed. That seems like a significant difference between two types of language. But by the framework here, these differences, and others far more extensive and much more difficult to teach to adult second language learners, are in fact quite superficial. Underlyingly there is a universal notion corresponding to what is said in English as ‘a clever person’. It just surfaces differently according to comparatively minor language particulars.
Sometimes this process of retrieval and assembly breaks down, and we can’t think of a particular word or name. It is on the ‘tip of the tongue’. Or we can’t find the right way of saying what we want to say. Or the small child wants to make some deep observation, seemingly beyond his or her years, and it comes out with one or more grammatical defects. Or the older child with a developmental language disorder struggles to articulate some insight in a way which can be clearly and reliably understood. Or the normally competent speaker confuses two or more elements of a chosen word.
But by and large the human faculty of language is remarkable for its everyday robustness and reliability, rather than for its failures, or for the period of around ten years during which speech and language are normally being acquired. This robustness and reliability depend on the simplicity of the underlying code, in the sense of machine code in software. At the moment our best way of stating this is by an algebra. It represents a deep and powerful general resource for human cognition.
As well as allowing the development of human language, the recursive property conferred by this algebra can be, and is, usefully applied to all branches of mathematics, including arithmetic and the way we count.