
An inventory of the sounds of English

On most counts, there are 44 phonemes in English (with slight variations between dialects, including different versions of ‘Standard English’).



Stops have the airstream completely blocked by the lips, by the back of the tongue, or by the tongue tip: first the voiced stops, with the vocal cords allowed to vibrate as soon as the closure is released, and then the voiceless stops, with a noticeable delay between the release of the closure and the onset of vocal cord vibration.



Fricatives have the airstream partially blocked by the upper teeth and lower lip, the middle of the tongue, the tongue tip, and the tongue tip and the teeth, first with the vocal cords allowed to vibrate and then not. The consonant in azure never occurs at the beginning of a syllable.



Affricates have a complete closure by the tongue tip, released after a brief moment to allow a partially blocked airstream slightly further back along the tongue, with voicing and without.



Always next to a vowel, and with the characteristic resonance of vowels, but mainly functioning as consonants in relation to the structure of the syllable. R may be in the process of transitioning to a semi-vowel in what are sometimes known as ‘non-rhotic’ varieties of English in which R is never pronounced after the vowel in words in their citation form.



With the airstream blocked at the lips or the tongue tip, but allowed to go through the nose by opening a ring of muscle at the back of the mouth. By virtue of the closure in the mouth they count as stops. (N clustered with G at the end of the syllable, as in the spelling, is often counted as a single phoneme by native speakers of English varieties in which the G is not heard as a stop, though it is heard as one in many northern British English varieties. But the G is fully pronounced as a stop in all varieties where it occurs as the first sound in an unstressed syllable at the end of root forms of words, as in finger, single, mingle, and so on, unlike singer, ringer, and winger, where the root forms are sing, ring, and wing, respectively.)
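The finger versus singer contrast can be sketched as a small rule, purely for illustration. The function name g_is_stop is hypothetical, the caller has to supply the root form, and ordinary spelling stands in for sounds. It covers only the rule as stated here, for varieties without a stop in sing. Words such as longer, or danger with a different sound for G, fall outside it.

```python
def g_is_stop(root: str) -> bool:
    """Predict whether the G after N is pronounced as a full stop.

    `root` is the root form in ordinary spelling, chosen by the caller
    (for example "sing" for singer, "finger" for finger). This is an
    illustrative assumption, not a full phonological analysis.
    """
    position = root.find("ng")
    if position == -1:
        raise ValueError(f"no NG cluster in {root!r}")
    # NG at the very end of the root: no stop is predicted.
    # NG followed by more of the root: the G is predicted to be a stop.
    return position + 2 < len(root)


for word, root in [("finger", "finger"), ("single", "single"),
                   ("singer", "sing"), ("winger", "wing")]:
    print(word, "->", "stop" if g_is_stop(root) else "no stop")
```

The root form is passed in rather than computed, because deciding where a root ends is itself a matter of analysis.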


Both L and N on their own after a stop after a stressed vowel can function as the nucleus of a syllable in at least most varieties of English in words like little, middle, and funnel. This is quite unusual across languages, and thus very much something which English learners have to learn. This is evidently quite hard for many normally developing children, who shift the articulation back along the tongue in words like these, to something like K, G, and NG as a single phoneme.

Semi-vowels or glides

By a gesture with the lips or tongue, colouring the vowel, always immediately before it in at least most modern varieties of English.


If R is becoming another semi-vowel and a third member of this set in at least some varieties of English, this may represent a significant problem for learners of those varieties.


By a momentary opening of the vocal cords or glottis, with the sound of friction from the resulting airstream, and often classed as a fricative on this account, but different from the fricatives proper in that it never clusters with another consonant, but always occurs before the vowel and never after it.


On most analyses, the letter X represents two phonemes, K and S. But on the basis of words like next and text, an argument could be made for counting it as a complex phoneme, as represented by the single letter.


Minimally there might seem to be 18 vowels, but on some counts more

The short vowels, going round the vowel space, starting with the tongue high at the front of the mouth, down to the bottom of the mouth and up to the back of the mouth, rounding the lips progressively in the last two cases.


The long vowels, following the same sequence, with more tension in the tongue and with the tongue closer to the edge of the vowel space at each position.


The diphthongs with the tongue rising in the course of the articulation.


‘Schwa’ with the tongue in the middle of the mouth, never with any stress


The same tongue position as schwa, but with stress, always written with an R, which is pronounced in many varieties of English.


Not all speakers make all of these distinctions. Some speakers make more.

For some speakers (some British, some American) tune and moon do not rhyme. For these speakers, if tune was represented phonetically there would be an initial Y as part of the vowel, with the same vowel as in moon, which has no Y. On this analysis, speakers without a rhyme here have an extra vowel – like what Russians consider to be the single initial vowel in the name Yury.

Schwa is often defined on the basis of stresslessness. In words like data, agenda, media, criteria, pronunciation mostly follows the spelling, with no R. In words like mother, father, brother, and sister, the R is pronounced only in ‘rhotic’ varieties of the language.

In the two commonest words in the language, a and the, there is a change, adding an N to a or changing the vowel of the, if the word is followed by a vowel. These are the only words in the language which do this.
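As a minimal sketch of this alternation, the two functions below choose the form of each word from one piece of information: whether the next word begins with a vowel sound. The function names and the bracketed labels are assumptions made for illustration. The caller has to judge the sound, since spelling is an unreliable guide (hour begins with a vowel sound, union does not).

```python
def indefinite_article(next_starts_with_vowel_sound: bool) -> str:
    """Return "an" before a vowel sound, otherwise "a"."""
    return "an" if next_starts_with_vowel_sound else "a"


def definite_article(next_starts_with_vowel_sound: bool) -> str:
    """Return a label for how "the" is said in each context.

    The bracketed labels are informal descriptions, not phonetic symbols.
    """
    if next_starts_with_vowel_sound:
        return "the [changed vowel]"
    return "the [schwa]"


# "evening" begins with a vowel sound, "morning" does not.
print(indefinite_article(True), "evening")
print(indefinite_article(False), "morning")
print(definite_article(True), "evening")
print(definite_article(False), "morning")
```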

The vowel in hurt is articulated with the tongue in the same position as in unstressed schwa, but always written with an R, as in fir, fur, her, earn. Phonetically this is like a long or stressed schwa. But on what seems to me a better analysis, the length is from an R which just goes unpronounced in ‘non-rhotic’ varieties of British English, which are sometimes considered ‘standard’.

In a different way there might be said to be another vowel in the word ‘the’ as it is said in ‘the evening’. It is not the same as either of the vowels in he and hip, but somewhere in between, technically short and tense. This is sometimes known as ‘schwi’. But there are various possible analyses.

On some counts, a vowel can have three elements – sometimes known as ‘triphthongs’ – with the tongue doing a double movement in the mouth. By many views (including mine), this is impossible, on the basis that human speech and language never divide anything into more than two parts. For speech in which the R is not pronounced, as in all varieties of London English, the R has to be somehow represented in the speaker’s mind, even though it is pronounced only as a schwa. But for some theorists, there is no such thing as anything which is just represented in the mind, and not pronounced. The theoretical issue about triphthongs and what is sometimes known as an ‘underlying’ R concerns the vowels in these words in most British varieties of English.

With the off-glide by an articulation with the tongue tip


With the off-glide by a back of the tongue articulation


Without an off-glide, but still a schwa gesture


For those who believe in triphthongs there are thus six or seven additional vowels in English.

Vowel systems as complex as that of English are unusual. Greek and Italian are more typical, with five and (on most counts) seven vowels respectively. Complex vowel systems are often unstable, with vowels gaining or losing a feature within a human lifetime. In British English, this is particularly the case with the vowels in hock and horse, wobbling in both directions. In the First World War, most members of the officer class pronounced horse as though it was HOSS, much to the silent merriment of those they commanded. Some of their children pronounced cross as though it was CRORSE – the opposite way round.