Why is this sound, which occurs in most languages, so systematically difficult, and mispronounced in such a regular way, in English and other languages?
What is sometimes known as a ‘lisp’
Of course, it’s not just children who have problems saying S and Z. Some adults do too. And for some of them it is a worry.
On one view (though possibly a mistaken one), the configuration of the tongue for S and Z is very difficult. For a normal S or Z, a narrow channel has to be created between the tongue and the roof of the mouth, and the air stream directed through it. On this view, some children modify that configuration, mispositioning the tongue in such a way as to make the sound easier to say. I don’t accept that answer, for four reasons.
First, there are two main ways in which the tongue position is misconfigured, at least by children now learning English in the South East of Britain. The commonest misconfiguration is to position the tongue slightly too far forward in the mouth, close to the position at which a complete closure is made for a Russian T or D. With an incomplete closure, the result is a sound similar or identical to the sound in TH, sometimes with a Russian lilt, for those who speak Russian. In a less common misconfiguration, the tongue is brought into contact with the roof of the mouth in the midline and the air stream is allowed to pass on both sides. The effect is very similar to the sound of the pairs of L letters in the Welsh place name Llangollen. The sound is often described as a ‘lateral’. But it should be noted that this is quite different from forming a speech sound with the tongue at the side of the mouth, something that is very rare across the world’s languages.
Now these two common misconfigurations share one property: the acoustic frequency of the S or Z is significantly reduced. In their famous 1968 book, The Sound Pattern of English, Noam Chomsky and Morris Halle described the relevant property, which distinguishes the TH in thin from the first sounds in sin, shin and fin, as ‘stridency’ (although, oddly, they don’t define stridency in terms of acoustic frequency). On the simplest account, the misconfiguration of the tongue is an effect of the loss of stridency, rather than the other way round.
Second, there is a very large number of ways in which children might misconfigure S and Z to avoid the supposedly difficult tongue tip configuration, and still get a variation of the acoustic noise spectrum sufficient to contrast with F, TH and SH. For instance, they might try moving the tongue tip further back, or raising the back of the tongue, as for the final sound in Scottish loch. But they don’t tend to do any of these things, raising the obvious question: Why not?
Third, children learning both Welsh and English sometimes have difficulty with S and Z. But because they hear the lateral sound as a sound in its own right in Welsh, they do not use this particular misconfiguration for S and Z.
Fourth, some people, of whom I am one, say S and Z by making the channel for the air stream, not with the tongue tip, but with the blade of the tongue, positioning the tongue tip behind the lower front teeth. I can with great effort and concentration produce something like a standard S or Z with my tongue tip just behind the upper front teeth. I just happened as a small infant to find a non-standard way of achieving the standard acoustic effect, possibly because my teeth are uncommonly tightly packed.
An alternative account
On my alternative account, what is commonly going on when S and Z are mispronounced, with the tongue tip either too far forward or up against the roof of the mouth with the air stream going round the two sides, is an incomplete formation of the target sound. The DERIVATION of S or Z is not completed. One vital step is either carried out incorrectly or missed. This is the step which makes the target production strident. The effect is that the acoustic noise, described as ‘aperiodic’ because there is no single well-defined frequency, is not focused into an upper register.
This derivational account solves all of the problems of the mispositioning account. For most people the easiest way to achieve stridency is with the highly mobile and densely innervated tongue tip. But for a minority, more or less the same effect can be obtained with the middle of the tongue blade. The commonest, simplest, and most easily corrected error is not hitting the stridency button, that is, missing this step in the derivation. No matter where the tongue is mispositioned, there is only one basic error.
A less common error is to replace this step with another way of losing stridency, turning the sound into a Welsh-style lateral. These children often have enormous difficulty modifying their sound in any way.
This is not an option if this sort of sound is already ‘booked’, as it is for speakers of Welsh.
If, as I contend, the failure is derivational rather than articulatory, this limits the phonetic possibilities, explaining why the errors are so narrowly distributed and why pronunciations like the CH in Scottish loch are seemingly unattested.
The clinical advantage of a derivational approach is that the very first step towards getting a normal S or Z can be broken down into smaller, more easily achievable steps.