
Phonetics
What makes a speech sound or 'phoneme'
Phonology studies speech systems as systems – both within a given language and across languages – and is thus a broad-brush discipline, largely committed to binary distinctions because this is what neurophysiology forces. Phonetics, in contrast, studies the fine details of the speech sounds or ‘phonemes’ of an accent or dialect or variety or individual phonetician. Phonetics tries to be as exact as possible. It is thus highly circumspect about any commitment to binarity, no matter how well motivated this may be in theory. Binarity just seems to get in the way of ultimate precision, or to contradict it absolutely.
With this orientation, many phoneticians doubt whether an overall system of sound systems is even possible in principle. Doubts often centre on whether there is any such thing as a language-universal system of features, or factors, differentiating speech sounds.
The real issue here is that phonetics is essentially a taxonomic discipline, measuring points on scales of relative positioning, opening and protrusion, whereas phonology is essentially systematic, with things either happening or not happening. Many phonetic distinctions are points on a scale, and thus necessarily multi-valued, unlike the categorial distinctions of phonology.
With this tension between the two disciplines, it is hardly surprising that there are different descriptive traditions. The long, tense vowels of English are commonly characterised in at least three ways:
- With separate symbols for the on-glide and the off-glide;
- With just one symbol followed by a colon to denote the length;
- With just one symbol followed by a lower-case H, treating that as a moment of inactivity with the tongue in the position for the nearest vowel.
The initial sounds in shoe and chew are correspondingly written in different ways according to how they are classified in theory. Some of these differences between traditions seem to be at least partly geographic, the notion of an International Phonetic Alphabet notwithstanding.
In 1988 the phonetician Peter Ladefoged was persuaded to collaborate with the phonologist Morris Halle on a study which proposed that the critical factor distinguishing the various constrictions of the vocal tract was not where they occurred but which articulator was used to implement them. They characterised all sounds made with the lips as ‘Labial’, with the front of the tongue as ‘Coronal’, and with the back of the tongue as ‘Dorsal’. But Ladefoged then seems to have changed his mind about the overall wisdom of this project, not referring to it again in bibliographies, reiterating the older notion of phonetic ‘Place’, mentioning the notion of active articulators only as a descriptive device, and explicitly denying any possibility of binarity and thus any plausible notion of a universal schema. But in his later work with Ian Maddieson (1996) he does list six ‘active articulators’ in his 1988 terminology, although there is some uncertainty about whether the sixth should be considered an articulator at all. From front to back and downwards, they list:
- The lips;
- The ‘tongue tip’;
- What they call the ‘blade’, around eight millimetres further back;
- The ‘body’, which articulates against the soft palate and against the uvula, the protrusion about the size of a small grape (hence the Latin name) at the back end of the soft palate;
- The root, not systematically used in English, but used extensively in consonants in many varieties of Arabic and in the vowel systems of many sub-Saharan African languages;
- The glottis or vocal cords, used in all vowels in all languages, and in the glottal stop, widely used as an alternative to T in many non-prestigious varieties of English, but occurring in huntsman and other similar words even in ‘standard’ and prestigious varieties.
This small wobble in the mind of one of the world’s most accomplished phoneticians is characteristic of the considerable and irreducible tension between the two disciplines.
Speech and language therapists and speech pathologists, mostly trained by phoneticians rather than phonologists, tend to classify speech sounds by where they are implemented in the vocal tract rather than in any other way. So the clinical literature contrasts labio-dental and interdental fricatives, F and V as against the two pronunciations of TH in thin and then, and velar as opposed to alveolar stops in key and tea respectively. Going into more detail, the phonetic literature refers to a large number of different systems, all involving more than five or six relevant places or positions of articulation, especially with respect to fricatives and affricates.
The relevant data are often hard to get at. The mobile structures are all soft and squidgy, and behind the lips mostly invisible. But the musculatures can be manipulated with great speed and precision. The body of the tongue can be squeezed into a pit or a hump on demand. And the lips can be separately opened and protruded. A millimetre here or there and a few milliseconds sooner or later are all perceptible in their effects on any given phoneme.
From an acoustic perspective, what matters most are the various area functions. Human mouths vary quite widely in their three-dimensional shaping and in the spacing of the teeth – if all the teeth are present. So ‘normal’ articulations vary quite widely too.
Largely for ethical reasons, there are limits on who can or should take part in experimentation. X-ray exposure is dangerous. So researchers may choose to experiment on themselves and take the risks rather than not experiment. At least two world-class scientists, Marie Curie and Rosalind Franklin, neither of them phoneticians, almost certainly died as a result of over-exposure to X-rays.
For speech and language pathology, the greatest point at issue is with respect to consonants, particularly the dorsal or velar stops in K and G, the fricatives in S and Z, the affricates in chew and Jew, the contrast between the glide in you and the liquid in Looe, and the other liquid, R, which, it is sometimes suggested, may be in the process of becoming a glide in English, and is thus likely to trigger a variety of other changes, especially to L.
To a lesser degree, some children have issues with the vowels. Very rarely, this is the main issue.
What makes some sounds easy to say, and others hard? One quite subtle diagnostic is how well second-language learners cope with sounds in loan words and foreign names which do not occur in their native language. English speakers seem never to have had any difficulty with the TS in tsunami, pizza or Fritz, all seemingly irrelevant until the First World War. But many English speakers have great difficulty with the first vowel in muesli (as the word is commonly spelt on cereal packets) in the original German, Swiss or Austrian pronunciation, with the tongue in the position for EE and the lips in the position for OO. Similarly, many second-language speakers of English have great difficulty with the cross-linguistically uncommon vowels in rum and ram, both pronounced with the tongue relatively low in the mouth, and with keeping apart the vowels in rim and ream, both pronounced with the tongue high and at the front of the mouth and differing only in their length or in the extremity of this position. So native speakers often have difficulty telling whether such second-language speakers are saying ninety or nineteen.
A phonological reconciliation
Tellingly, against the idea that there is no universality here, the writers of phrase books believe that, whatever language they are describing, its sounds can be categorised as some variation of the sounds of the language whose speakers the phrase book is designed for. If there were no universality, this would be absurd. Native speakers would have no idea what second-language speakers were trying to say, even if they sometimes have difficulty. And capital cities and tourist hot spots would be littered with thrown-away phrase books.
But by the proposal here, and by the approach to S and Z issues discussed under that heading, there is a possible reconciliation between the orientation towards the finest possible detail in phonetics and the search for universality in phonology. If underspecification is not just ‘radical’, as suggested by Diana Archangeli in 1984, but maximal, as suggested by the proposal here, so that the underlying representation of speech sounds in the lexicon uses the smallest possible set of features needed to keep words apart, then the rules necessary to generate the forms of pronunciation have, of necessity, to be ordered in language-specific ways. For the sake of maximality, these rules may sometimes interact with syllable structure, as in the case of English S, which is the only possible segment when two consonants follow before the vowel. Of necessity again, this ordering has to be learnt in full and from scratch, and the ordering is critically significant. The earlier any given implementation is ordered, the less likely it is to be involved in variation. This is detectable, and may be critical in the natural process of learning. Whatever the number of steps, there are as many orderings as the factorial of that number. If there are, as suggested under the heading of S and Z, eight critical features here, only one of them underlying, there are seven factorial, or 5,040, possible orderings of the implementation rules, with a corresponding variety of phonetic effect.
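As a purely illustrative sketch, and not part of the proposal itself, the short Python fragment below makes the arithmetic and the ordering point concrete: seven implementation rules admit seven factorial, or 5,040, orderings, and even two toy rules – hypothetical stand-ins, not the features discussed under S and Z – produce different surface forms depending on which applies first.

```python
# Illustrative sketch only: the "rules" below are hypothetical stand-ins,
# not the actual implementation rules proposed here.
from itertools import permutations
from math import factorial

# With eight critical features, one of them underlying, seven implementation
# rules remain to be ordered; the number of possible orderings is 7!.
print(factorial(7))  # 5040

# A toy demonstration that ordering matters: the same underlying form
# surfaces differently depending on which rule applies first.
def devoice_final(segments):
    # e.g. a final Z realised as S
    return segments[:-1] + ['s'] if segments and segments[-1] == 'z' else segments

def delete_final_vowel(segments):
    return segments[:-1] if segments and segments[-1] in 'aeiou' else segments

underlying = list('doza')  # hypothetical underlying form
for order in permutations([devoice_final, delete_final_vowel]):
    form = underlying
    for rule in order:
        form = rule(form)
    print([rule.__name__ for rule in order], ''.join(form))
# devoice_final then delete_final_vowel -> 'doz'
# delete_final_vowel then devoice_final -> 'dos'
```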
Putting things very briefly, by the approach here, in a way which has been said many times before, phonetics is late phonology.
See also Features, Phonology, Phonotactics and Syllables