
Discrete Infinity
Building an Infinite Set of Structures from a Finite Resource
At any given moment, the number of phonemes (sounds) in a language and the number of entries in anyone’s lexicon are finite. But the number of ways in which they can be combined is infinite. This is known as ‘discrete infinity’: discrete because structure is built from a small, finite number of distinct elements, infinite because there is no limit to the number of possible sentences. The idea of discrete infinity was originally developed by Wilhelm von Humboldt (1836), in a work published a year after his death.
Consider these:
- You lie.
- You know her.
- Who do you know?
- You seem to know her.
- Who asked who you seem to know?
- We know who asked who you seem to know.
- You think (that) we know who asked who you seem to know.
- I suspect (that) you think (that) we know who asked who you seem to know.
- I’m sorry to say (that) I suspect (that) you think (that) we know who asked who you seem to know.
And so on. A simple sentence can be extended indefinitely, and the variations can be multiplied any number of times. This is known as ‘embedding’ or ‘recursion’ (or, more traditionally, as ‘subordination’). Structure is ‘derived’ from the most deeply embedded part of the structure outwards, reversing the idea implied by the traditional term ‘subordination’. As the derivation proceeds, the sentence gets harder to understand, but there is no point at which it ceases to be English. And this holds not just of English, but of any one of the other seven thousand or so languages spoken around the world. Significantly, it was Humboldt who launched the idea of expanding the set of languages studied beyond Latin, Greek, and Hebrew, only the last of which is unrelated to the languages of Europe. Humboldt started with Basque and Javanese.
Obviously, as embedding proceeds, the resulting structure becomes increasingly hard to process. But there is no point at which it ceases to be some given instantiation of human language. Hence discrete infinity. The repetition of the same sort of structure is known as ‘recursion’.
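The way a finite stock of elements yields an unbounded set of sentences can be sketched in a few lines of Python. This is only a toy illustration, not a model of English grammar: the names FRAMES, CORE and embed are invented for the sketch, and the four frames are simply lifted from the examples above.

```python
# A toy illustration of discrete infinity, not a grammar of English.
# FRAMES, CORE and embed are hypothetical names invented for this sketch.

# A finite stock of embedding frames, taken from the example sentences.
FRAMES = [
    "we know {}",
    "you think that {}",
    "I suspect that {}",
    "I'm sorry to say that {}",
]

# The most deeply embedded clause, from which the derivation starts.
CORE = "who asked who you seem to know"


def embed(depth: int) -> str:
    """Return the core clause wrapped in `depth` layers of embedding.

    The function calls itself, so each layer contains a complete,
    smaller structure of the same kind: this is the recursive step.
    Frames are reused cyclically once the finite list is exhausted.
    """
    if depth == 0:
        return CORE
    frame = FRAMES[(depth - 1) % len(FRAMES)]
    return frame.format(embed(depth - 1))


if __name__ == "__main__":
    for depth in range(5):
        print(depth, embed(depth))
```

Every value of depth gives a different, longer string, although the inventory of frames never grows. Nothing in the procedure sets an upper bound on depth; the only limits are practical ones, which parallels the point that processing difficulty, not grammar, is what stops speakers.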
A counter-claim
One linguist, Daniel Everett, claims that there is a language, Pirahã, which disallows even the first step of embedding. He claims that Pirahã speakers never refer to anything other than what is currently in the immediate frame of reference. Pirahã is spoken by a Brazilian tribe of fewer than 500 people who had studiously avoided contact with the outside world until Everett persuaded them to allow him and his family to join them. On Everett’s account, Pirahã speakers can hardly discuss or question reports of untruth with any nuance because their language makes this impossible. Really? If so, why have the Pirahã so carefully avoided contact with the developed world? Everett’s account of Pirahã is cited approvingly by Tom Wolfe (2018). But Everett’s account is demolished on both theoretical and empirical grounds by Andrew Nevins, David Pesetsky & Cilene Rodrigues (2007). Most linguists accept that demolition. Anything else would be weird, given that Everett is querying a generalisation which appears to hold of every other language in the world, and that his data concern a language which is clearly very threatened. As a language approaches the point that Pirahã has now sadly reached, it often loses key aspects of its grammar. Such loss has been noted even in languages much less threatened than Pirahã. Everett’s claim is thus highly suspect.