If there was, as I contend here, a sequence of evolutionary events, this forces the conclusion that there must have been corresponding stages of ‘proto-language’. But there is nothing like a proto-language spoken by any modern population. All fully developed adult languages use the resources provided by all the speciations postulated here, though in different ways. But while the entire linguistic apparatus provided by these speciations has diffused across the whole of the modern human population, it is still developmentally vulnerable.
Tools from a toolbox
Evolution is not exactly recapitulated in the life of the organism. The ear does not start out as a set of gills and grow into the elegant engineering of the cochlea, incus, malleus and stapes. The evolution of speech and language has left a toolbox. Some of the tools can only be taken out in a particular order. Move cannot be accessed before Merge and Label.
The typical ordering of language acquisition tells its own story.
First there is a long period during which the child says only single words or what Martin Braine called ‘holophrases’ – expressions which sound like they might contain more than one word – but not occurring on their own – like Ozah, as an expression of apparent curiosity, possibly modelled on “What’s that?”. Then words start to be put together. The child says something like “duck bath” with two elements relating to two significant entities in the child’s universe, such as, in this case, duck, and the duck’s place, in this case bath, Merged together in that order, in a primitive prototype of a phrase or sentence.
Then, as I discovered when I was doing the research for my MA in 1976, between a week and two months after saying something like “Duck bath”, most naturally interpreted as a simple ‘declarative’, as such structures are known, presumably meaning something like “The duck is in the bath”, the child either asks a question like “Where duck?” or answers a fully formed corresponding question by an adult like “Where’s your duck?” by an appropriate and plausible reference to place, possibly by a single word. But never in the opposite order. In other words, two word declaratives by Merge always precede one word answers to questions by Move, even though the one word answer might seem simpler. In a period of up to two months, sometimes only a week, the child starts to correctly analyse Move, with words like where correctly linking to an abstract element on the right edge from where they have been copied. This happens to the ‘WH’ forms, who, where, and so on, one by one.
The learner learning
The proposal here upholds the notion of a single derivation, but by a device with several terms each of which can have several values, with no value represented more than once. The structure comes by the genome. What the learner has to learn is what can be merged and when, which labels are allowed, how the elements are fitted together or more precisely refitted, what can be moved where, and how the derivation is divided into phases.
Most of the apparatus for syntax and phonology is known by most children by the age of five. But as Carol Chomsky showed in 1969, there are subtle aspects of the grammar which are not mastered until nine or so. In 2002, I showed that the final stages of phonological acquisition are still normally in process at eight or so.
Functionalities can be achieved in different ways. In some languages, such as French and Russian, respect is marked by the use of the plural, in Spanish by a separate term, originally from Arabic. The English auxiliary system has been changing since the time of Chaucer. Might was once the past tense of may. But it now encodes possibility. For Charles Dickens “We are going to dine well” would have suggested that the diner expected to go from one house or room to another. Now it suggests a future by some sort of human agency, as in “Perhaps we are going to die” by a diner frightened of being poisoned, as opposed to “We will die” as an expression of a general certainty. The change to the auxiliary system is ongoing. Many English speakers under the age of 40 say things like “That really suits you, innit?” which for older speakers are almost uninterpretable.
Languages vary in which functionalities they express and which parts of the available apparatus they use. To express continuity, English glues –ING onto the rightmost edge of a verbal element. But to express evidentiality it uses parataxis, by words like maybe and allegedly, with less precision than by dedicated elements within the grammar. French does the opposite, using parataxis for continuity, and the subjunctive for evidentiality. Such variations are accidental.
Making sense of ordered disorder – treatment
Key to the understanding of asymmetric disorder is the idea of sounds as spoken in speech being ‘underspecified’, with their representation in the brain by the smallest possible set of features and speech built as much as possible ‘online’. The original insight here is from the 1984 work of Diana Archangeli. Following her logic, in what are known as ‘derived’ forms in the second syllable of Southern British English messes, washes, catches, and edges, the I vowel as spoken is built automatically. Archangeli has since rejected the idea. But it can be rescued by combining it with the 1995 insights of Carole Paradis and Jean-François Prunet, who proposed that the tongue tip articulator itself acts as a default, at least in languages other than those with uncommonly small phonemic inventories which seem to select the back of the tongue articulator. For obvious logical reasons, the default setting is made last. Everyday speech is thus by the equivalent of a decompression algorithm. The effect is that when speech is entirely unconscious as in dreams, there is no online, and the process of production is many times faster. But the decompression algorithm has to be learnt. In speech that is less than fully competent, steps can be delayed or brought forward with respect to the articulator. Contrasts are vulnerable in different ways in different words at different stages in the process of acquisition.
Characteristic incompetences, commonly described as the ‘processes’ of child speech, are mostly with respect to Phase, too early in fronting and stopping, and too late in calculator as KALTALATOR. In cardigan as KARDINTON, there is indeed a sequence of steps, changing the G to a D, losing the voicing of the stop, and copying the nasality of the final consonant, one syllable to the left. But these steps are all very late, at the end of the derivation, inappropriate additions to it, all involving tongue tip articulations, after the stress has been assigned.
In non-pathological speech by many normally developing children of five, six and seven, hospital is commonly mispronounced as HOSTIPU and spaghetti as BASKETI or PSKETI. In all of these cases, elements of the phonemic structure are copied incorrectly.
In hospital competently pronounced, the tongue tip T at the beginning of the final syllable contrasts with whatever is left of the L sound. This is often characterised as ‘syllabic’ because it works as a stand-alone syllable without an independent vowel. What is left consists mainly in a lip rounding gesture similar to the vowel in pull, put or book. But the native speaker knows that the origin of this is a tongue tip L, as evidenced in hospitalise with the L now at the beginning of a syllable.
In hospital as HOSTIPU, the T and P are reversed by what is known as ‘metathesis’. The tongue tip gesturing of the L is partially or completely lost in favour of a lip gesture, triggering a matching change at the beginning of the syllable. There are three steps here. First the lip-rounding of the final syllable is exaggerated. Second the lip action of the P is copied rightwards to the onset of the final syllable. Third, what is left behind at the start of the second syllable is a stop without a defined articulator. This is then said as a T.
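The three steps can be traced in a minimal sketch. This is only an illustration under stated assumptions, not a model of the child's actual system: the syllable layout, the feature names "lips" and "tip", the rough spelling symbols, and every function name here are hypothetical conveniences introduced for the example.

```python
# Illustrative sketch only: all names and the data layout are assumptions.
DEFAULT_PLACE = "tip"  # assumed default articulator: the tongue tip

# Rough spelling for stops, keyed by articulator.
SPELL = {"lips": "P", "tip": "T"}


def stop(place):
    """A stop consonant; place=None means no defined articulator."""
    return {"manner": "stop", "place": place}


def hospital():
    """Competent form as three syllables: HO, SPI, and T plus syllabic L."""
    return [
        {"onset": ["H"], "nucleus": "O"},
        {"onset": ["S", stop("lips")], "nucleus": "I"},
        {"onset": [stop("tip")], "nucleus": "L"},
    ]


def exaggerate_rounding(word):
    # Step 1: the final syllabic L is realised as a lip-rounded vowel.
    word[-1]["nucleus"] = "U"


def copy_lips_rightwards(word):
    # Step 2: the lip action of the P is copied to the final onset.
    source = word[1]["onset"][-1]
    word[-1]["onset"][0]["place"] = source["place"]
    # What is left behind is a stop without a defined articulator.
    source["place"] = None


def fill_default(word):
    # Step 3: a stop with no articulator is said with the default one.
    for syllable in word:
        for segment in syllable["onset"]:
            if isinstance(segment, dict) and segment["place"] is None:
                segment["place"] = DEFAULT_PLACE


def spell(word):
    """Flatten the syllables into a rough capital-letter spelling."""
    out = []
    for syllable in word:
        for segment in syllable["onset"]:
            if isinstance(segment, str):
                out.append(segment)
            else:
                out.append(SPELL[segment["place"]])
        out.append(syllable["nucleus"])
    return "".join(out)


def derive_hostipu():
    word = hospital()
    exaggerate_rounding(word)
    copy_lips_rightwards(word)
    fill_default(word)
    return spell(word)


print(spell(hospital()))  # HOSPITL (L standing for the syllabic L)
print(derive_hostipu())   # HOSTIPU
```

The point of the layout is that the apparent reversal of T and P falls out of one copy plus a default fill, with no separate swapping operation.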
In spaghetti, the child’s system may reject the SP cluster at the beginning of an unstressed syllable before the stressed syllable on the grounds that there is no other such word in the child’s vocabulary in contrast to the numerous cases like spy, spare, spit. So the S moves to the beginning of the stressed syllable, and the G loses what is known as its ‘voicing‘ to match that of its new neighbour, becoming a K. The structure is now more familiar except that the P has been left behind. It usually becomes voiced as a B. But in some children’s speech, as the S is moved, the initial unstressed vowel is left unrealised. And an initial cluster of PSK is formed in a pronunciation as PSKETI. Nobody would call this a natural way of making the word easy to say. But it has an easy derivation by one incorrect application of Move.
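The two outcomes can be sketched as a single misapplied Move followed by one of two repairs. This is a hedged illustration only: the syllable representation, the rough spellings, and all function names are assumptions made for the example, not part of the account itself.

```python
# Illustrative sketch only: representation and names are assumptions.
DEVOICE = {"G": "K"}  # G loses its voicing next to S
VOICE = {"P": "B"}    # the stranded P is usually voiced


def spaghetti():
    """Syllables as [onset, vowel]; the second syllable is stressed."""
    return [["SP", "A"], ["G", "E"], ["T", "I"]]


def move_s(word):
    # The one incorrect Move: S leaves the unstressed onset and
    # lands at the beginning of the stressed syllable.
    word[0][0] = word[0][0].replace("S", "", 1)
    word[1][0] = "S" + word[1][0]


def devoice_after_s(word):
    # The G matches the voicing of its new neighbour, becoming K.
    onset = word[1][0]
    word[1][0] = onset[0] + "".join(DEVOICE.get(c, c) for c in onset[1:])


def voice_stranded_p(word):
    # Usual outcome: the P left behind becomes a B.
    word[0][0] = "".join(VOICE.get(c, c) for c in word[0][0])


def drop_first_vowel(word):
    # Alternative outcome: the initial unstressed vowel is unrealised.
    word[0][1] = ""


def spell(word):
    return "".join(onset + vowel for onset, vowel in word)


def derive(variant):
    """variant is 'voiced' (BASKETI) or 'unrealised' (PSKETI)."""
    word = spaghetti()
    move_s(word)
    devoice_after_s(word)
    if variant == "voiced":
        voice_stranded_p(word)
    else:
        drop_first_vowel(word)
    return spell(word)


print(derive("voiced"))      # BASKETI
print(derive("unrealised"))  # PSKETI
```

Both forms share the same first two steps, which is why PSKETI, however awkward to say, is no harder to derive than BASKETI.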
On an alternative ‘process account’, BASKETI is commonly described in terms of ‘migration’. But this assumes a ‘process’ with only one common exemplar. It is more parsimonious to postulate a general Move functionality which is justified independently in competent speech and language. On such reasoning, a process account is rejected here.
Many normally developing five year olds mispronounce magnet as MAGNIK. Here the back of the tongue articulation of the G in what is known as the ‘coda’ of the stressed syllable is copied into the tongue tip T coda of the final unstressed syllable, without being lost at the point of origin. The two codas contrast in their ‘voicing’, or the time relation between the release of the closure by the tongue and the start of vocal fold vibration. The effect is one of harmony or assimilation, as though the G / T contrast was too great for the child’s system to handle. This is similar to the two year old saying doggy as GOGI, except that the context is much less narrowly defined.
In the speech of normally developing children, in little and middle as LIKU and MIGU, even though there may be no overt tongue tip gesture, there is nothing in the child’s experience of English to suggest that there could be a word ending with the vowel in full or pull. The presence of the L is signalled in forms like fully and pulling in which it is at the beginning of a second syllable. And the child’s system retraces the history of the final U sound back to its origin as L, and increases the contrast by moving the tongue articulation back to K or G.
Something similar happens in the speech of children of seven or eight, who mispronounce monopoly as MONOKOLI. Here the environment is very narrowly defined with a lip action M before a tongue tip N, a lip action P after a stressed vowel with lip-rounding, and an L in the final syllable, capable of becoming a rounded vowel in other circumstances. Here the replacement of P by K has the same effect as in small children’s LIKU of increasing the contrast.
While the functionalities of Mimic, Merge, Label, Fit, Move and Phase evolved one by one, they are partially fused together by modern competence as tools in a toolbox from which languages effectively select. While a language may not express one or more parts of the total apparatus, all are available, with most languages having ways of expressing most of it, with no language lacking any term completely.
These selections are not random. Without the structure by Labelling and without the pragmatics of the pronominal system by Fit, very little will make sense to the learner. Possible selections are on points of detail.
Because speech and language sit on a highly structured genomic component, they are vulnerable in corresponding ways, with the greatest vulnerabilities with respect to the most recent evolution, less stable across the population than those components by earlier evolutionary steps. Most common speech disorders are with respect to Phase and the misuse of Move by the step before that. Stammering appears to involve the neurological correspondence of Phase, the buffer. Autism appears to compromise the pragmatic apparatus and thus the pronominal system by Label, more phylogenetically primitive and correspondingly harder to treat.
The evidence here
The evidence here of incompetence in speech is from children learning the local dialects in South West London. Some of the data about children’s asking and understanding questions is from two of my own children and some from four children in Edinburgh. Any or all of the asymmetries noted here could be down to some local factor. So there is an obvious need for evidence from the speech of children learning other varieties of English and other languages. While there is already a vast amount of data on children’s speech and language, there is little data in sufficient detail to reveal any systematic asymmetries. The data is, to say the least, unevenly spread.
Making things easy?
Not really, but the complexity is just the way things are. Speech and language don’t fossilise. Recordings go back no further than the late 19th century. Reconstructing how English sounded at the time of Dickens, Shakespeare, Chaucer or King Alfred, 150, 400, 600 or 1,100 years ago is entirely from the written word, errors, local variations, and the analysis by contemporaries or near contemporaries. Reconstructing the development of proto-language is plainly a difficult task. The proposals here are just the first stage of a corresponding research program. My own personal motivation is to help children who have difficulties with speech and language, just as I once did. Others have incomparably greater difficulties.