Phonotactics

Combining speech sounds in clusters, long vowels, diphthongs, syllables

Phonotactics (from Ancient Greek) deals with what a language allows in the structuring of consonants, vowels, and syllables. This is highly language-specific and thus necessarily learnt.

For any given language, its phonotactics defines which sounds can appear in which positions and combinations, in which syllables, in which sorts of words, and in which sorts of use. It defines which consonants can form what are known as ‘clusters’ before and after vowels, as in cry, elm and crisp, and whether these are sequences of sounds or phonemes, as they are in English, or complex segments; whether vowels are allowed to combine with one another in what are known as diphthongs, as in oh and why; and whether consonants or vowels or both occur only as singleton elements or alternate with long or doubled variants. Phonotactics is thus a branch of phonology.

There is a spectrum here in what a language permits. English, which has been called a ‘vacuum cleaner of a language’, freely importing words from other languages and then usually adapting them to English phonotactics, is close to the permissive end. What English does with its quite complex phonotactics, stringing sounds together, some West African languages do inside the sounds, allowing the same or similar actions in different parts of the vocal tract at once. An example is the name of the West African language Igbo, where the GB has simultaneous closures by the lips and the back of the tongue. In English, by contrast, the G and B are sequenced in egg box and dog basket, and between syllables in Digby and Rugby, but crucially not pronounced simultaneously as in Igbo. English has many such sequences which could easily be misheard as single sounds. So in children’s occasional pronunciations of spaghetti as PSKETI, the sequence PSK may be misconstrued as a complex sound.

Languages vary widely in all of these respects – in:

Whether all syllables begin with a consonant, as in pie, lie, high, cow, or whether they allow syllables to begin with a vowel, as in eye and owe;
Whether no syllables have a consonant after the vowel, as in he, she and they, or whether this is allowed, as in him, them and us;
Whether consonants cluster, before the vowel as in play, clay, pray, crew, spray, straw, screw, or after the vowel as in toast, whisk, help, melt, tent, next, glimpse;
Whether vowels diphthongalise, as in my, may, mow, bow, boy, with the tongue rising in the mouth as the articulation proceeds;
Whether and when consonants double or ‘geminate’, as between the adjacent N sounds in non-native and unknowing, where native and knowing are words on their own and are turned into their opposites by non and un, neither of which counts as a word on its own;
Whether vowels lengthen contrastively, or whether, as in English knee, true, yaw, this lengthening is also marked by an increased ‘tension’ in the tongue, or whether these sounds would be better considered as ‘double vowels’ or ‘geminates’;
Whether consonants and parts of consonants combine as in the first and last sounds of church and judge, beginning with a complete closure and ending in an only partial closure, in what are known as ‘affricates’;
Whether and how far sounds like the first and last sounds of church and judge are allowed to combine with other sounds, as in squelch and hinge, with nothing like this before the vowel;
Whether the R in the spelling of her, fir, fur, should be considered as part of the vowel in varieties like Home Counties English where it goes unpronounced;
Whether there are syllables like the L in little and middle or the TION in station which are always stressed in the same way – unstressed in these cases – and pronounced with the same speech sounds or ‘phonemes’;
Which sounds are allowed to form the nuclei of syllables: in English all vowels and N, L, M, and in some varieties of English R, as in button, little, prism and butter.
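The first three points in this list can be pictured as a handful of settings that a learner has to fix. The following is only a minimal sketch under stated assumptions: the names `SyllableSettings` and `shape_allowed` are invented here for illustration, syllables are reduced to bare shapes of C (consonant) and V (vowel), and the limits given for the English-like setting are taken from the examples in this article rather than from a full analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyllableSettings:
    """Hypothetical bundle of language-specific choices (illustration only)."""
    onset_required: bool  # must every syllable begin with a consonant?
    max_onset: int        # longest consonant run allowed before the vowel
    max_coda: int         # longest consonant run allowed after the vowel (0 = none)


def shape_allowed(shape: str, settings: SyllableSettings) -> bool:
    """Check a syllable shape such as 'CCVC' against the settings.

    The shape must contain exactly one V, standing for the nucleus.
    """
    if shape.count("V") != 1 or set(shape) - {"C", "V"}:
        raise ValueError("shape must use only C and V, with exactly one V")
    onset, coda = shape.split("V")
    if settings.onset_required and not onset:
        return False
    return len(onset) <= settings.max_onset and len(coda) <= settings.max_coda


# Assumed values: three consonants before the vowel (spray) and three after (glimpse).
ENGLISH_LIKE = SyllableSettings(onset_required=False, max_onset=3, max_coda=3)
# An assumed strict language allowing only consonant + vowel syllables.
STRICT_CV = SyllableSettings(onset_required=True, max_onset=1, max_coda=0)
```

Under these assumptions, a shape like CCCV (straw) or VC (us) passes the English-like setting, while the strict setting accepts only CV shapes such as he and she. The sketch deliberately ignores which particular consonants may combine, which is where most of the real complexity lies.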

English is uncommonly permissive in most, but not all, of these respects. It allows consonants to stack up before or after the vowel or ‘nucleus’, with consonants like L and R closer to the nucleus than consonants like T, P, and K, but with only one consonant, S, allowed before two other consonants, TR, PR, PL, and KR, in stray, spray, splay, and screw, and with up to three elements after the vowel, as in glimpse, next and length. So shlep, shtum, and shpeel from Yiddish are freely imported and pronounced with the non-English cluster at the beginning. And German nazi and Italian pizza are mostly pronounced with the similarly non-English TS at the beginning of the final syllable.
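The restriction on three-consonant beginnings can be stated as a small check. This is a sketch only, using the capital-letter notation of this article rather than phonetic symbols. The function name `three_consonant_onset_ok` is hypothetical, and the set of two-consonant pairs contains just the four pairs named above, so it makes no claim to cover every cluster of English.

```python
# Pairs named in the text as able to follow S: stray, spray, splay, screw.
PAIRS_AFTER_S = {"TR", "PR", "PL", "KR"}


def three_consonant_onset_ok(onset: str) -> bool:
    """Return True if a three-consonant onset is S plus one of the listed pairs."""
    onset = onset.upper()
    if len(onset) != 3:
        raise ValueError("expected exactly three consonant letters")
    return onset[0] == "S" and onset[1:] in PAIRS_AFTER_S


for candidate in ("STR", "SPL", "STL", "FTR"):
    print(candidate, three_consonant_onset_ok(candidate))
```

On this toy rule STR and SPL pass, while STL fails because TL is not among the listed pairs and FTR fails because only S may come first.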

But on many of these points, the evidence of the spoken language may not be clear or obvious to the first language learner. English contrasts tip, ship and chip. So chip could have an initial T SH cluster. If that were the correct analysis, TSIP, beginning like tsunami pronounced with an initial TS, would also be predicted as a possible word, complicating the phonotactics.

English is much less permissive with respect to the internal rhythm of words or their pattern of stress. So in Russian names like Khrushchoff, Gorbachoff, Vladivostok, Borodino (the place of the decisive 1812 battle against the forces of Napoleon), and in Spanish Trafalgar, all with stress on the last syllable in Russian or Spanish, the stress is shifted one or more syllables to the left in English.

What the learnability space has to allow

The words strength, strange, scrounge and change would not count as possible words in many languages. Such languages would disallow the STR, SCR and NG combinations and the final TH in strength; the way the GE is preceded by an N, and the fact that the vowel begins and ends with the tongue in different positions in the mouth, in strange and scrounge; and the CH and GE in change, which use different airstreams at the beginning and the end of the sound.

This complexity is entirely limited to the ‘content’, or ‘encyclopedic’, or ‘lexical’ words, the nouns, verbs, adjectives, prepositions, and adverbs, like death, die, dead, in and sadly. ‘Functors’, like the, a, the S in hits and the T in slept, are all built more simply, as these examples show. One variation, between the pronunciations of TH in functional this and the, and in lexical think and thought, is entirely defined on the contrasting categorisations, or places within the ‘spine’, by the proposal here.

The importance of definitions

The terms and the ordering of the definitions are not obvious, and this is a significant issue for learners. All languages display a pulsing in speech. But languages vary in what the pulses consist of and in whether the pulsing is just in speech or in both speech and the way entries are stored in the lexicon. Because of this wide variation, one widely accepted model takes the core of the pulsing to be what are known as the ‘nuclei’ of syllables. In most languages, the nucleus of a syllable is a vowel. But in languages like English, the nucleus can also be what is known as a ‘sonorant’, a consonant with a high level of resonance: L, R, N or M, as in little, letter, button or bottom. The ancient languages of North Africa are even more permissive, allowing any phoneme to be represented in the lexicon and spoken as a syllabic nucleus. The Semitic languages, including Arabic and Hebrew, spoken in the area around the Eastern Mediterranean, and Amharic, spoken in Ethiopia, allow entries to be stored in the lexicon by consonants alone, with the vowels inserted in the course of speaking. The languages of the Caucasus such as Georgian allow very long strings of consonants at the beginnings of syllables, like MTSVANE in the name of a well-known wine. It may be that the M and the TS are better considered as syllabic nuclei without standalone vowels. Russian, in long-term contact with the languages of the Caucasus, allows some of these things in particular, isolated words. It may also be the case that some apparent complexities can be reduced by limiting the building of words to quite narrowly defined derivational steps. It is an open question whether the system here is defined on timing or on the phonemic content. The human learnability system has to be such that all of these systems are learnable as part of the core of any given language.

The child learning English has to resolve the issues here on the basis of evidence which is not uniformly clear.

Increasing or decreasing the complexity

On some accounts, in cases such as twelfths, there are plainly four consonants after the vowel. But the final S is a plural and the TH is arguably derived from the root form twelve.

The complexity is reduced if syllables are built, or derived, in stages, with the S in string and the T in next both built after the rest of the structure, the two significantly differing with respect to what the Sound Pattern of English characterises as ‘Continuance’.
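The effect of building in stages can be shown with the twelfths example. What follows is a rough sketch, not a worked-out analysis: the class name `StagedWord` and its fields are invented for illustration, consonants are written in the capital-letter notation used here, and the change from the V of twelve to the F heard in twelfths is simply ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StagedWord:
    """Hypothetical record of a word built from a root plus later additions."""
    root: str
    root_final_consonants: Tuple[str, ...]  # consonants after the vowel in the root
    later_additions: Tuple[str, ...] = ()   # consonants added at later stages

    def surface_count(self) -> int:
        """Consonants after the vowel if everything is counted at once."""
        return len(self.root_final_consonants) + len(self.later_additions)

    def root_count(self) -> int:
        """Consonants after the vowel if only the root stage is counted."""
        return len(self.root_final_consonants)


# twelve ends in L V; TH and then plural S are added at later stages.
twelfths = StagedWord("twelve", ("L", "V"), ("TH", "S"))
print(twelfths.surface_count(), twelfths.root_count())
```

Counted all at once, twelfths has four consonants after the vowel. Counted at the root stage, it has only two, which is the sense in which a staged derivation reduces the apparent complexity.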

See also Phonetics, Phonology, and Syllables.
