In competent and incompetent speech
From the look of print, it might seem that speech is about getting a series of speech sounds in the right order, not confusing them for one another, and making sure that the set of sounds is complete (however we do that). In children’s speech there are often omissions and replacements of one sound by another. It is often said, almost as an axiom of clinical linguistics, that children’s speech should be defined in terms of ‘processes’.
But characterising children’s speech in this way leads to a confusion. There are also numerous processes which characterise normally competent adult speech. Normal development would seem to consist in discarding one set of processes and learning another set. This raises the obvious question: How does the child learner know which is which? And the more serious a child’s issue, the more processes there seem to be, and the more complex they become. The speech of children with the greatest problems commonly falls outside any recognisable schema of processes. So no listing of developmental processes in child speech will ever be complete or explanatory. Worse, the processes sometimes stack up on top of one another, as in the case of cardigan as KARDINTON. As shown in Nunes (2002), first the back of the tongue G assimilates to the tongue tip D in the previous syllable, then the voicing is lost by a default process favouring voicelessness, and then the N is copied from the ‘coda’ or end of one syllable to the coda of the preceding syllable.
It is obviously important to distinguish clearly between expressions of competence and incompetence. I propose that we need to distinguish between the well-defined processes of normally competent adult speech with one sound pronounced in various ways now known as ‘allophones’, seemingly first detected in early mediaeval Iceland, and the characteristic incompetences of children. The processes of competent and incompetent speech are different things. If we don’t make this distinction, our account of both normal and abnormal development breaks down.
Processes in competent speech – all recoverable
In a way that varies between dialects, there are different allophones of T, in toe, litter, little, stay, and hot, all by different processes.
Some dialectal variations are sharply stigmatised, like making a brief, momentary closure of the vocal folds or vocal cords in the glottis for the T in little, digital, and the first T in Potterton. This is particularly characteristic of the variety known as Cockney, the speech of working class Londoners, now disappearing, famously but inaccurately described by Alfred Gimson (1962). He fails to notice that there were, and marginally still are, two varieties of Cockney, one traditional, the other heavily influenced by the Yiddish of recent Jewish immigration. The Yiddish variety was deliberately pastiched by Warren Mitchell in his performance as Alf Garnett in Johnny Speight’s 60s and 70s TV comedy series, Till Death Us Do Part, written as a protest against Anti-Semitism. Gimson muddles the two varieties, both widely spoken at the time he was writing his book in the area just to the East of the City of London, a short bus ride from his Bloomsbury office.
In the variety which Daniel Jones called ‘Received Pronunciation’, now generally known as RP, the same process occurs in huntsman, ointment and appointment between the N and the M. This glottal articulation of T, a ‘glottal stop’, is an example of ‘lenition’, a double-edged process which both sharpens a contrast, in this case with P and K, between subtle, supple, and suckle, and at the same time takes a step towards the loss of the phoneme over one or more generations of speakers.
In the word prince, there is often a closure of the larynx between the N and the S, making the word fractionally distinct from prints, in which the T is commonly pronounced. It is not clear in which this is happening most.
In RP, little is standardly pronounced with no release of the tongue-tip closure for the T before the gesture changes to one involving just the centre line of the tongue for the L.
One absolutely regular process with not a single exception or dialectal variation is the written S, spoken as S in pats, Z in pads, and IZ or EZ in patches (with variation by dialect). The formalisation of this has been the topic of much argument.
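The regularity of the written S can be illustrated with a minimal sketch. It is only one informal way of stating the pattern, not a settled formalisation. The sound classes below are assumptions written in this essay's capital-letter notation, and the function name `plural_suffix` is hypothetical.

```python
# Assumed, incomplete sound classes in the essay's capital-letter notation.
# "TH" here stands for the voiceless sound at the end of moth (an assumed label).
SIBILANTS = {"S", "Z", "SH", "ZH", "CH", "J"}
VOICELESS = {"P", "T", "K", "F", "TH"}


def plural_suffix(final_sound: str) -> str:
    """Return the spoken form of written S after a stem ending in final_sound."""
    if final_sound in SIBILANTS:
        return "IZ"  # as in patches (EZ in some dialects)
    if final_sound in VOICELESS:
        return "S"   # as in pats
    return "Z"       # as in pads, and after vowels and other voiced sounds


for word, final in [("pat", "T"), ("pad", "D"), ("patch", "CH")]:
    print(word, "+", plural_suffix(final))
```

The order of the checks matters in this sketch: the sibilant case is tested first because S and SH are themselves voiceless and would otherwise fall into the second case.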
By another process there are the contractions, marked in English by an apostrophe, as in I’ll, we’ll, I’d, we’d, where only the final L or D of the underlying will or would is pronounced, and in don’t and can’t, where the vowel in not goes unpronounced and the vowel of the host form is changed. I, we, will, would, can, do, not, are all cases of what are known as ‘functors’, words which get their meaning only from the way they interact with each other and the rest of the sentence. But these contractions only happen once on one edge of the phrase. So “I would not do it” only contracts to “I wouldn’t do it”, rather than “I’dn’t do it.”
By another process, one sound assimilates to another, in competent English when they are immediately next to each other in a particular sort of phrase. This reduces the contrast between two elements, as in “Good morning” in normally competent adult speech as GOOB MORNING, with the lip action of the M in morning spreading to the preceding sound, the D of good. Here just one feature is involved, the feature which defines the point in the vocal tract where a closure occurs.
By another process in the English of London and the Southeast of England and RP, an R is inserted between the adjacent vowels in India and Pakistan as “India R and Pakistan” and “Hosannah R in excelsis Deo” and even between sentences in “I went skiing in Canada R, and I broke my leg”, but not where there is no meaningful connection as in “I went skiing in Canada, and you still owe me that money.”
In most varieties of English, there is what is known as ‘mutual assimilation’ or ‘coalescence’ between the T and the R at the beginning of a stressed syllable, as in try and triangle; the T is articulated further back in the mouth as for R and with a gradual release, and the R loses the property known as ‘voicing’, with the vocal cords allowed to vibrate against one another while the rest of the vocal tract is shaped for the sound. This does not happen in pastry and triangulate.
All these processes and many others are characteristic of competent adult speech, but in ways that mostly go unnoticed unless they are stigmatised or pastiched for comic effect. The processes that characterise a particular variety, associated with an area or diaspora or social class, seem to be unrelated to one another. They are ‘recoverable’ in the sense that the competent speaker can detect their effects. But this does not necessarily hold for small children.
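The spreading of a single place feature in “Good morning” can be sketched as a small lookup. This is an illustration under simplifying assumptions: words are spelled in capitals with one letter per sound at the word edges, the tables `PLACE` and `SHIFT` and the function `assimilate` are hypothetical names, and the cases other than D before M go beyond the example given here.

```python
# Assumed, simplified table: where in the vocal tract the closure is made.
PLACE = {"P": "lips", "B": "lips", "M": "lips",
         "T": "tip", "D": "tip", "N": "tip",
         "K": "back", "G": "back"}

# Tongue-tip sounds and their counterparts at another place of closure.
# Only the place feature changes; voicing and nasality are kept.
SHIFT = {("T", "lips"): "P", ("D", "lips"): "B", ("N", "lips"): "M",
         ("T", "back"): "K", ("D", "back"): "G"}


def assimilate(first_word: str, second_word: str) -> str:
    """Spread the place of the second word's first sound onto the first word's last sound."""
    last, nxt = first_word[-1], second_word[0]
    # If the next sound has no listed place, or no shift applies, keep the sound.
    new_last = SHIFT.get((last, PLACE.get(nxt)), last)
    return first_word[:-1] + new_last + " " + second_word


print(assimilate("GOOD", "MORNING"))  # GOOB MORNING
print(assimilate("GOOD", "EVENING"))  # GOOD EVENING
```

The sketch ignores the condition that the two words must stand in a particular sort of phrase; it only shows that one feature, and nothing else, is carried over.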
Non-recoverable ‘processes’ in incompetent speech
The not fully competent speech of small children, either delayed or disordered, is often characterised in terms of ‘processes’ quite different to those of dialects and other varieties.
- ‘Stopping’: The acoustic friction of an almost complete closure of the mouth by the tongue tip in S is commonly replaced by a complete closure by T in sea as TEA, ace as ATE, and so on. What should be near complete closure of the mouth becomes a complete closure instead. And the F in foot and off by the lower lip and the upper front teeth is replaced by a complete closure of the lips in pronunciation as PUT and OP. In all cases the gesture is momentary. Sounds or phonemes with an almost complete closure are known as ‘fricatives’ or ‘continuants’, and those with a complete closure are known as ‘stops’. Hence the name of the process.
- ‘Fronting’: Early in speech development, the tip of the tongue stops, T and D, replace the back of the tongue stops, K and G, in all positions in all words, as in key as TEA, like as LITE, Guy as DIE, and Mog as MOD. By the analysis in Nunes (2002) this is by the failure of one step in the ‘building’ of the back of the tongue sound, the step which defines the particular articulator. As the child’s competence develops, this happens in an increasingly narrowly defined set of positions in particular words, including calculator and archaeopteryx. The tongue tip articulation tends to replace other articulations. And where there is just one consonant in the word, back-of-the-tongue articulations are lost in favour of tongue-tip articulations. So this is often known as ‘fronting’. The opposite happens in some children, and is usually known as ‘backing’, but this is quite rare.
- ‘Final consonant deletion’: The D and T in bed and bat and the G and K in bag and back, and other similar cases, are not pronounced. This violates a basic phonotactic principle in English which disallows lexical monosyllables ending with a short vowel. But this conforms to a cross-linguistically very widespread notion of a syllable by which it consists of just a consonant and a vowel, CV, disregarding the issue of vowel length. And this should perhaps be called ‘coda deletion’ because the notion of a ‘final consonant’ begs the question: the final consonant of what? The word? Or the syllable? The latter is more plausible.
- ‘Initial consonant deletion’: Much less common than final consonant deletion, the B and P in boy and pea are not pronounced, with the words pronounced as OY and EA. Here the syllable is treated as though it consisted of only the rime, the onset going unpronounced.
- ‘Final devoicing’: The G in bag and leg is pronounced as K, and the D in bed and red as T, giving BACK, LECK, BET, and RET. This reflects a very common process across the world’s languages, including Russian and German, suppressing the distinction by voicing at least finally in monosyllables and often in the middle of words. By ‘initial voicing’, the P in pea, the T in tea, and the K in key are pronounced as B, D, and G. It has been proposed that this is an essentially acoustic process, outside the learnability space, and thus most unlikely to be reversed.
- ‘Assimilation’: Between two consonants which are either adjacent or in the same positions in adjacent syllables, features and sometimes the whole segment assimilate from the one to the other, in anticipation in doggy as GOGI, with the tongue tip D lost in favour of the back of the tongue G, and in perseveration in magnet as MAGNIK.
- ‘Dissimilation’: Often for as long as two years many normally-developing children say little as LIKU and middle as MIGU, with the back of the tongue K and G replacing the tongue tip T and D next to tongue tip L in the special role it has here as the rime of an unstressed syllable, known as ‘syllabic L’. Here the tongue-tip T and D go to K and G, seemingly dissimilating from the tongue tip articulation of the neighbouring L. Even in adult speech there is not much of a tongue tip gesture in the actual pronunciation here. The L is mainly signalled by a colouring similar to the lip-rounded vowel in hook. And in children’s speech the colouring is all that’s left in the spoken form of the L. But hidden though its tongue-tip credentials are, the L in little and middle is seemingly detected by children so as to effect a seeming dissimilation. Later in speech development, the lip-action P dissimilates to back-of-the-tongue K next to stressed lip-rounded O in monopoly as MONOKOLI.
- Many children say soldier so that it sounds like shoulder. S and SH sounds are what are known as ‘fricatives’, and D is a stop. Correctly, in most current varieties of English, the second syllable should begin with the same sound as jaw or jeep, with a stop becoming a fricative, what is known as an ‘affricate’. In soldier as shoulder, the property defining the fricative edge of the affricate moves left, leaving the stop. Instead of S we get SH. The air-stream is still squeezed through a narrow gap in the roof of the mouth, but in the middle of the tongue rather than at the tip. Interestingly, children don’t say soldier as JOLDER with the J sound moving as a whole, but only with one of its properties moving. Nor do they say the word as SOLDER, just losing the fricative edge. When such speech is investigated it often turns out that the child is well aware of the difference between soldier and shoulder, but unable to say soldier in the intended way. Soldier seems to be the only word in English with these consonants in these positions and an L in between. Sausages is often problematic in a similar way, sometimes said as SHOSIDIZ and sometimes as SHOSIJIZ, with the articulator from the right edge of the affricate in the final syllable getting moved to the beginning of the first syllable, but again only just this one feature of the sound. The L in soldier somehow increases the vulnerability of the fricative edge of the affricate in its original position.
- Migration: Many children say spaghetti as BASKETTI, with the S ‘migrating’ from the beginning of an unstressed syllable to the beginning of the stressed syllable, and the stranded lip-action sound becoming a B. But if this is a ‘process’ it is the only common exemplar. A few children go a step further, leaving out the first vowel, saying the word as PSKETTI. In this case it is as though there were two steps, the first moving the S one syllable to the right and the second losing the first vowel. There is no way a syllable beginning with PSK could be considered as anything other than highly deviant. It certainly doesn’t make the word easy to say in any simple or obvious sense.
- Lateralisation: Many children say yellow as LELO. A few make the same substitution in all cases, saying you as LOO, yes as LESS, and so on.
- ‘Metathesis’: The T and the P in hospital switch around, giving a pronunciation as HOSTIPAL.
- ‘Coalescence’: The TR in try is treated as a single SH sound, taking features from both the T and the R. And the S and P in spoon sometimes combine the incomplete closure by S with the lip action by P in a pronunciation as FOON.
- ‘Reduplication’ or ‘doubling’: Stressed syllables are often doubled or ‘reduplicated’ as in water as WAWA. In speech which is significantly disordered, this doubling is sometimes both prolonged and carried out in more complex ways, as in donkey as DODONG, plastic as PLAPLAK, Indian as HEEJING.
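Four of the simpler processes in the list above can be written out as substitution rules, purely as a descriptive sketch. It assumes words are given as lists of sound symbols in this essay's capital-letter notation; the tables, the vowel set, and all function names are hypothetical. As the essay argues, a listing of this kind describes the output without explaining it.

```python
# Assumed substitution tables, taken from the examples in the list above.
FRONTING = {"K": "T", "G": "D"}             # key as TEA, Mog as MOD
STOPPING = {"S": "T", "F": "P"}             # sea as TEA, off as OP
DEVOICING = {"B": "P", "D": "T", "G": "K"}  # bag as BACK, bed as BET
# Assumed, incomplete set of vowel symbols for this sketch.
VOWELS = {"A", "E", "I", "O", "U", "EE", "IE", "OO"}


def substitute(sounds, table):
    """Replace every sound listed in the table, leaving the rest unchanged."""
    return [table.get(s, s) for s in sounds]


def fronting(sounds):
    return substitute(sounds, FRONTING)


def stopping(sounds):
    return substitute(sounds, STOPPING)


def final_devoicing(sounds):
    """Devoice only the last sound of the word."""
    if not sounds:
        return []
    return sounds[:-1] + [DEVOICING.get(sounds[-1], sounds[-1])]


def final_consonant_deletion(sounds):
    """Drop the last sound if it is not a vowel."""
    if sounds and sounds[-1] not in VOWELS:
        return sounds[:-1]
    return list(sounds)


print(fronting(["K", "EE"]))                       # key
print(stopping(["S", "EE"]))                       # sea
print(final_devoicing(["B", "A", "G"]))            # bag
print(final_consonant_deletion(["B", "E", "D"]))   # bed
```

Because each rule is a function from one list of sounds to another, rules can be chained to imitate processes stacking on top of one another, though the sketch says nothing about why one order rather than another would occur.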
Assimilation and dissimilation seem like contradictory processes. They clearly happen in quite different environments. But there is a commonality between them. In both the degree of contrast is being manipulated, reduced in the case of assimilation, and increased in the case of dissimilation.
Sometimes incompetent processes combine as in finger as TINNA with only the tongue tip action of the N (lost next to back of the tongue G) realised throughout the word with stopping in the onset and the whole segment copied into the next door position of the G. Such speech is mostly incomprehensible.
Now it might be said that these mispronunciations are ‘making the words easier to say’. But this is circular. And spaghetti as PSKETTI is clearly not easier than the lexical word in any natural sense of ease.
If many children have the same problem and solve it the same way, as by ‘fronting’, replacing K by T, as in key as TEA, it is hard to see how this can be by one and the same malformation.
Distinguishing two sorts of processes from one another – or not
While stopping is almost unattested in competently spoken language, the opposite process, known as ‘spirantisation’, is common. (Consonants with a near, but incomplete closure of the mouth like S and SH were once called ‘spirants’.) In the history of English the K in electric spirantised to an S sound when the root was turned into a noun in electricity.
On the approach here, the immature realisations of G, P and K in cardigan, hippopotamus, calculator and archaeopteryx as KARDIDAN, HITOPOTAMUS, KALTALATOR, and ARTIOPTERIX are by failures in building these phonemes in these structures. The building of the lip articulation of the first P in hippopotamus, and the back of the tongue articulation in the second K in calculator, and the final G in cardigan, should all precede the building of the correct articulations. But the tongue tip articulation is incorrectly ‘brought forward’. This is by virtue of the very particular distribution of contrasts in these words.
The replacement of the tongue tip articulation in magnet as MAGNIK is triggered by the close contrast between the lip and tongue tip articulations of the M and the N and the weaker contrast between the G and the T on the right edges of the two syllables. In such cases the back of the tongue articulation is brought forward by an erroneous copying.
By the proposal here, the characteristic processes of child speech are not by malformations, but relate to the original very slow process by which human speech evolved, while the processes which characterise different varieties are by the random accidents which happen in communities as they separate by diaspora or subjugation or diverging identity.