
S and Z problems

Sometimes known as ‘lisps’

The speech sounds or ‘phonemes’ which we recognise as S and Z occur in most languages, and yet they are the most problematic, not just for children but for many adults. How is it that most languages set such a high bar for something so nearly universal?

In all frameworks, phonemes can be broken down into components; in the framework here, into what are known as ‘distinctive features’. One distinctive feature of S and Z is that they are what are known as ‘continuants’ or ‘fricatives’, meaning that the airstream is forced through a narrow gap, creating what is known as ‘aperiodic’ acoustic noise. ‘Periodic’ sounds have a well-defined lowermost frequency. Aperiodic sounds don’t. The whistle of a kettle is (more or less) periodic. The hiss of steam is aperiodic, what sound engineers call ‘white noise’. Most fricatives are ‘strident’, with the gap tightly narrowed. Two English fricatives, the TH sounds in think and this, are not strident. All the other fricatives, those in fin, sin, and shin, veal, zeal, and these, are strident. Stridency involves concentrating the aperiodicity in a higher register. The feature of stridency was introduced by Noam Chomsky and Morris Halle in their 1968 magnum opus, The Sound Pattern of English (although, oddly, they don’t define stridency in terms of acoustic frequency).
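The difference between periodic and aperiodic sound can be made concrete with a small sketch. It assumes an idealised ‘whistle’ (a pure sine wave) and an idealised ‘hiss’ (random samples); the names `autocorrelation`, `whistle` and `hiss` are invented for this illustration. A periodic signal closely matches a copy of itself shifted by one period; an aperiodic one does not.

```python
import math
import random


def autocorrelation(signal, lag):
    """Normalised correlation between a signal and a copy of itself shifted by `lag` samples."""
    n = len(signal) - lag
    mean = sum(signal) / len(signal)
    num = sum((signal[i] - mean) * (signal[i + lag] - mean) for i in range(n))
    den = sum((x - mean) ** 2 for x in signal)
    return num / den


PERIOD = 100   # samples per cycle of the idealised whistle (an arbitrary choice)
N = 2000       # total number of samples

# Idealised kettle whistle: a pure sine wave, which repeats every PERIOD samples.
whistle = [math.sin(2 * math.pi * i / PERIOD) for i in range(N)]

# Idealised hiss of steam: random samples with no repeating pattern.
rng = random.Random(0)  # fixed seed so that the sketch is repeatable
hiss = [rng.uniform(-1.0, 1.0) for _ in range(N)]

whistle_score = autocorrelation(whistle, PERIOD)  # close to 1
hiss_score = autocorrelation(hiss, PERIOD)        # close to 0

print(f"whistle: {whistle_score:.2f}, hiss: {hiss_score:.2f}")
```

Real speech sounds are far less tidy than either idealisation, so this is only a picture of the contrast, not a measurement method.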

The main distinctive features of English S are as follows:

  • The articulation is (for most speakers) by the tongue tip against the ‘alveolar ridge’, the fleshy area just behind the upper front teeth;
  • There is no actual closure; a small space is left between the tongue and the roof of the mouth;
  • The sound is strident;
  • The opening to the nose is closed off;
  • The sound is a consonant, at, or close to, the outermost edge of its syllable;
  • In the case of S, unlike Z, the vocal cords are relaxed and separate, so they are not vibrating against one another.
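As a rough illustration, the feature list above can be written as a small ‘feature bundle’. The feature names and values below are labels chosen for this sketch, not a standard notation, and the TH entry is an assumption built from the description of TH as a non-strident fricative made at the teeth.

```python
# Illustrative feature bundles. The keys are labels invented for this sketch.
S = {
    "place": "alveolar",   # tongue tip against the alveolar ridge
    "continuant": True,    # no actual closure
    "strident": True,
    "nasal": False,        # opening to the nose closed off
    "consonantal": True,
    "voiced": False,       # vocal cords not vibrating
}

# Z is S with the vocal cords vibrating.
Z = {**S, "voiced": True}

# TH as in think (assumed here): made at the teeth, and not strident.
TH = {**S, "place": "dental", "strident": False}


def differing_features(a, b):
    """Return the names of the features on which two bundles disagree."""
    return sorted(name for name in a if a[name] != b[name])


print(differing_features(S, Z))   # ['voiced']
print(differing_features(S, TH))  # ['place', 'strident']
```

Written this way, S and Z differ in a single feature, and a fronted S that comes out as TH has lost stridency along with its place of articulation.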

For most people the easiest way to achieve stridency is with the highly mobile and densely innervated tongue tip. Some people, of whom I am one, say S and Z by making the channel for the air stream, not with the tongue tip, but with the blade of the tongue, positioning the tongue tip behind the lower front teeth. I can with great effort and concentration produce something like a standard S or Z with my tongue tip just behind the upper front teeth. But the result is not good.

Although features are not, in the framework here (and in most competing theories), matters of degree, S and Z are in fact the most strident of fricatives.

Almost all issues with S and Z, commonly known as ‘lisps’, involve the loss of stridency. At least among children learning English in the South East of Britain, there are two common ways of misconfiguring the tongue position. The commoner of the two is by positioning the tongue too far forward, against the upper front teeth, or between the teeth. With an incomplete closure, the result is a sound similar or identical to the sound in TH. By a less common misconfiguration, the tongue is brought into contact with the roof of the mouth in the midline and the air stream is allowed to pass on both sides. The effect is very similar to the sound of what is written as LL in the Welsh place name Llangollen. This sound is often described as a ‘lateral’. This sort of lisp is much more difficult to treat than one with the tongue tip too far forward. (But it should be noted that allowing the air stream to go round the two sides of the tongue is quite different from shifting the tongue to the side of the mouth, something that is very rare across the world’s languages. It may be by a misunderstanding of the point here that proponents of non-speech exercises encourage tongue-waggling exercises.) Significantly, I believe, both of these two common misconfigurations for S and Z have the same effect, of reducing stridency. So by my proposal here, most, indeed almost all, errors with S and Z are by ‘not hitting the stridency button’. And the task of getting a normal S or Z is best addressed by looking at the target phonemes in this kind of ‘featural’ way.

By an articulatory or misconfiguration account, the problem here is just one of the configuration of the tongue, positioning it either too far to the front of the mouth or (less often) too far back, resulting in a sound like Welsh LL. I don’t believe this account, for five reasons, the first four being the most obvious:

  1. On the simplest account, most misconfigurations of the tongue lead to the loss of stridency.
  2. There are many ways in which children might misconfigure S and Z to avoid the supposedly difficult tongue tip configuration, and still get a variation of the acoustic noise spectrum sufficient to contrast with F, TH and SH. But they don’t tend to do any of these things.
  3. Critically, according to information given to me in 1994 by the Welsh speech and language therapist Olwyn Rhys, children learning Welsh and English sometimes have difficulty with S and Z. But, hearing the lateral sound as they do in Welsh, they do not use this particular misconfiguration for S and Z.
  4. As ventriloquists know, a phonetically normal S can be achieved in various ways, other than using the tongue tip to form a narrow gap for the airstream to be forced through.
  5. Consider the obvious oddness in the fact that sounds which occur in most of the world’s languages are systematically mispronounced in similar ways, so much so that the mispronunciations have a name. By the 1984 theory of Diana Archangeli, phonemes are BUILT one feature at a time. She calls this ‘Radical Underspecification Theory’ (RUT). It made sense of some very complex and puzzling data in a group of Native American languages, and of some generalisations about a variety of languages, including two varieties of British English: the different ways that the second vowel in bridges is pronounced in London and the North West of England. Archangeli has now abandoned RUT, perhaps in the face of criticisms that it failed to account for some word formations in Basque. But Nunes (2002) shows a way of rescuing RUT. And by the proposal here, for as many features as there are in the formation of S, there may be the factorial of that number of logically possible orderings in the building of S, each leading to a fractionally different acoustic result. As a result the stridency may be defined INCOMPLETELY, either too late or not at all. Or the positioning gesture may be mistimed, with the effect of turning the sound into a Welsh-style lateral. This explains, by a single model, the variety of apparent realisations and the non-occurrence of lateral lisps in Welsh-speaking children, if Welsh LL is already ‘booked’.
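The arithmetic behind the ‘factorial’ point in the fifth reason can be checked with a short sketch. It assumes, purely for illustration, that S is built from the six features listed earlier; the feature labels are invented here and are not Archangeli’s notation.

```python
from itertools import permutations
from math import factorial

# Six illustrative feature labels for S (an assumption for this sketch).
features = [
    "alveolar place",
    "no closure",
    "strident",
    "oral",
    "consonantal",
    "voiceless",
]

# Number of logically possible orders in which six features could be specified.
n_orderings = factorial(len(features))
print(n_orderings)  # 720

# Of those, the orderings in which stridency is the last feature to be specified.
late_stridency = sum(
    1 for order in permutations(features) if order[-1] == "strident"
)
print(late_stridency)  # 120
```

The count says only how many orderings are logically possible. It says nothing about which of them children actually use, or how each one would sound.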


In treating this sort of issue, as the child reaches the point when he or she can mostly say the target sound correctly, I often find it useful to show the child a 10 x 10 grid and tick the boxes as the target sound is said correctly. The child can see his or her achievement. There is an articulatory and habitual aspect to lisping. Derivation is not the be-all and end-all.
