A Note on Exceptional Syllable Structure in Esperanto

Marc van Oostendorp

A squib on the formal structure of loanword phonology, 1998.

0. Introduction

Phonotactic generalisations often have exceptions, usually loanwords. It is also well known that the lexicon of many natural languages can be divided into so-called strata with different morphological and phonological behaviour. Usually, these strata are a synchronic reflex of a diachronic influence of another language. The fact that Turkish has a Persian and an Arabic stratum, and English a Romance stratum, and not vice versa, can only be understood in the light of the history of the areas where these languages are spoken.

It is sometimes hard to draw the line between `exceptional loanwords' on the one hand, and lexical strata on the other. We speak about `loanwords' if the number of exceptions is small; we speak about strata if the number of exceptions is larger. Furthermore, in the ideal case more than one process can be assigned to a specific stratum. The border between these two categories thus is hard to draw almost by definition.

The fact that these lexical exceptions and strata exist, and that rules can refer to them, shows that an adequate theory of the synchronic grammar and the synchronic lexicon needs to have some way to accommodate them. It is usually assumed that whatever properties of a word are available to a speaker, its history is not among them. This squib intends to confirm that assumption. I discuss a few aspects of the phonological distinction between native words and loanwords in the lexicon of Esperanto. In a sense, practically all words in Esperanto are of course `loanwords', because the lexicon of the language has been derived from the morpheme inventory of various Indo-European languages. I will argue, however, that there is some evidence that a difference between native words and loanwords is emerging even in this language.

1. Native words and loanwords

The most popular mechanism for strata in traditional generative phonology is a diacritic morphological feature, e.g. [+/-native]. Phonological and morphological rules can refer to the values of this feature. By definition, this approach can only work for strata. It is hard to see why speakers would add a morphological feature value [-native] to all new words they encounter, rather than changing the phonology.

Another problem is that certain regularities cannot be captured by this mechanism. For instance, we often find more complex structures -- more complex syllables, more complex segments (`loanphonemes'), more complex stress patterns (`lexical stress') -- in loanwords than in native morphemes. In other words, the latter class is usually more restricted phonotactically than the former. If this is a valid generalisation, it should be captured by the theory. In a theory based on the feature [+/- native] it is possible that the native stratum differs from the non-native stratum in language A because the native words have property P that is lacking in the non-native words, whereas in language B P is absent in all native words and present in all non-native forms.

Optimality Theory provides us with the means for at least a first approximation to a better understanding of the exceptions that loanwords pose to phonotactic generalisations. In this theory, there is a set Con of universal constraints; the differences between grammars can be described as a difference of constraint ranking only. Itô and Mester (1995) have suggested that we can divide (phonological) Con into two subsets. The first subset contains the so-called faithfulness constraints, which require that the output of the derivation not be too different from the input; the second subset consists of the pure well-formedness constraints, which are defined on the output only. We need constraints from the second set in order to explain why forms sometimes change; we need constraints from the first set in order to explain why not all forms always change to some optimal form such as [tata] or [bi].

The difference between strata is ultimately a difference between grammars. According to Itô and Mester, the grammars of native and non-native strata can differ in only one possible way: well-formedness constraints are ranked higher in native strata than in non-native strata vis-à-vis the faithfulness constraints. In this way we can get a layered structure of strata:

(1) [Figure not reproduced: nested strata, with A as the innermost circle and B and C as successive outer circles]


The well-formedness constraints are strictest in `native' stratum A; therefore, only certain simple segments, syllables and stress feet can surface. In strata B and C, more complex structures are allowed: in these strata it becomes more important to stay faithful to the input than to achieve a well-formed phonological structure. One of the interesting aspects of Itô and Mester's proposal is that it allows us, given two lexical strata in a language, to predict which of the two is native and which of the two should count as non-native. Itô and Mester's theory is set up for strata; these authors do not discuss loanwords. Probably, a word usually enters the lexicon in the outer circle, and if it stays in the language long enough, it may eventually get into its inner circles. It is not clear to me how precisely this would work, however.

There is an obvious alternative that is subtly different from Itô and Mester's proposal. We may say that the grammar is the same for all words in the lexicon, regardless of whether they are native words, words from some stratum or loanwords: there is no reranking between different strata. The only thing that differs is the amount of structure that is specified: a lot of underlying structure is specified in loanwords, but native words have `eroded' over time. 

We should understand, then, how this lexical erosion works. If a word has a complex structure, that structure may surface due to faithfulness. But the form will still violate a lot of constraints on well-formedness. In due course, it may tend to get rid of some of the offending material, so that it can surface in such a way that it will satisfy both well-formedness and faithfulness constraints. There is thus a process of lexicon optimisation going on, and this process in this case can be interpreted as shifting in the direction of simplification. This position on the behaviour of loanwords has the advantage that it can account for smaller sets of exceptions, such as individual loanwords. It seems undesirable, for instance, to have to set up a separate grammar for two exceptions in a whole language.

Many questions remain to be answered, but both Itô and Mester's (1995) analysis and the one just sketched can be seen as a first step towards a comprehensive theory of the notions `native' and `non-native' stratum that is completely free of references to the history of the language. It may therefore be applied even to cases where the diachrony of the language in question is unclear or, in the extreme case, even absent. It is the purpose of this squib to show that this is useful in the analysis of a language such as Esperanto. In this language it makes no sense to distinguish diachronically between `native' words and `loanwords'. Approximately 99% of the lexicon has been derived in one way or another from the morpheme lexicon of other languages, and in this sense all words are loanwords.

Two remarks have to be made in this connection. In the first place, a very small number of (function) words seem to have been made up arbitrarily without a clear parallel in any of the source languages (e.g. tia `in that fashion', kial `why'); these we might call the real `native' words of Esperanto. However, I see no phonological reason to set them apart. They conform to all the well-formedness constraints set up below, but their number is too small to see whether the constraints on them are in any way stricter. Another issue that needs to be mentioned in this connection is that Esperanto lexicographers routinely distinguish between `fundamental words' and `neologisms'. Roughly speaking, morphemes of the first type occur in official lexicons under the authority of L. Zamenhof, the creator of Esperanto, or of the so-called Akademio de Esperanto, a group of authors and other distinguished language users. Morphemes of the other type were invented by other people and never received official recognition; these `neologisms' may be quite old, compared to the overall history of the language, and in principle they can also have fairly widespread use, compared to the total number of Esperanto speakers. Again, I see no phonological basis for the distinction between fundamental words and neologisms, and I will ignore it here.

2. Loanword phonology in Esperanto

The syllable structure of Esperanto can be almost as complex as that of English, as may become evident from words such as the following:

(2) a. trin-ki `to drink', on-klo `uncle', bran-cho `branch'
    b. a-ta-ki `attack', o-be-i `obey', ka-o-so `chaos'
    c. skri-bi `to write', shtrum-po `sock'

Most of the graphemes given here correspond to the IPA symbols; the digraph `ch' represents a voiceless palatal affricate, the digraph `sh' a palatal fricative. The first syllable in each of the three words in (2a) is closed, and the second is open. Apparently, both options are allowed. The words in (2b) furthermore show that onsetless syllables are also allowed, both in word-initial and in word-medial position. Finally, the words in (2c) show that (word-initially) even more complex consonant clusters are allowed if the first segment is a dental fricative. All in all, we thus have the following template for the syllable in Esperanto, where optional elements occur between brackets:

(3) (s/sh) (Ci)(Cj) V (Ck)

The nucleus of the syllable is obligatorily a vowel; syllabic consonants are absent from the language. Ci and Cj have to display an increasing sonority slope, and Ci can only be a non-coronal obstruent if it is followed by another obstruent in the same word. (Words cannot end in an obstruent, but they can end in a nasal, a liquid or a voiceless coronal fricative.)
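The slot structure of the template above can be checked mechanically. The following is a minimal sketch, assuming orthographic input in which hyphens mark syllable boundaries as in (2), a five-vowel inventory, and the digraphs `ch' and `sh' counted as single consonants; the name fits_template is illustrative. It tests only the number and order of the slots, not the sonority and place restrictions on Ci, Cj and Ck.

```python
import re

VOWEL = "[aeiou]"
# One consonant slot: a digraph, or a single consonant letter (assumed inventory).
CONS = "(?:ch|sh|[bcdfghjklmnprstvz])"

# (s/sh) (Ci) (Cj) V (Ck): every slot except the vowel is optional.
SYLLABLE = re.compile(rf"(?:sh|s)?{CONS}?{CONS}?{VOWEL}{CONS}?")


def fits_template(word):
    """True if every hyphen-separated syllable fits the slot template."""
    return all(SYLLABLE.fullmatch(syl) is not None for syl in word.split("-"))


for w in ["trin-ki", "on-klo", "ka-o-so", "skri-bi", "shtrum-po", "vorld"]:
    print(w, fits_template(w))
```

All the forms in (2) pass, while a syllable with two coda consonants such as vorld does not.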

So far, there is little reason for surprise. Given the fact that all words in Esperanto are borrowed, we might even expect the syllable structure requirements to be rather lax. Furthermore, the source for most morphemes is Indo-European, and the languages from this family have a structure that conforms more or less to the template in (3). It should be noted, however, that even a superficial study of some other planned languages shows that these two factors in themselves are not sufficient. The language Volapük (created by J.M. Schleyer in 1879), for instance, had a far more restricted syllable structure: its words could only be of shape (CV)*C; all syllables had a simple onset and were open, except for the final syllable, which had a simple onset and was closed. In this way, the English compound `world speak' could turn into `volapük'. But in Volapük, too, all morphemes were originally borrowed from Indo-European languages (although of course most of them were hard to recognize after they had been adapted to the system).
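The Volapük word shape can be stated as a one-line pattern. The sketch below assumes lower-case orthographic input and the vowel letters a, e, i, o, u, ä, ö, ü, and it reads the star in (CV)*C as `one or more', so that a bare consonant does not count as a word. Both are assumptions of the sketch, not claims about Schleyer's system, and the name has_volapuk_shape is illustrative.

```python
import re

VOWELS = "aeiouäöü"  # assumed vowel letters

# One or more open CV syllables, closed off by a single final consonant.
SHAPE = re.compile(rf"(?:[^{VOWELS}][{VOWELS}])+[^{VOWELS}]")


def has_volapuk_shape(word):
    """True if the word is a string of CV syllables plus one final consonant."""
    return SHAPE.fullmatch(word) is not None


print(has_volapuk_shape("volapük"))    # True
print(has_volapuk_shape("vorldspük"))  # False: consonant clusters
print(has_volapuk_shape("volapü"))     # False: no final consonant
```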

Interestingly, the Volapük system serves very well to illustrate the way standard well-formedness constraints on syllable structure work in Optimality Theory. We have constraints stating that syllables should have onsets, and that all syllables have a nucleus; there is also a constraint which states that onsets may not be complex. Furthermore, there is a constraint prohibiting syllable codas, and a constraint stating that every word has to end in a consonant. The latter two constraints are in conflict; the constraint in favour of word-final consonants wins. All of these constraints are already present in the work of Prince and Smolensky (1993); the only thing that is rather special is that in this language, these constraints dominate all the faithfulness constraints:
(4) Onset: All syllables should have an onset.
  Nucleus: All syllables should have a nucleus.
  *Complex: CC clusters are not allowed.
  NoCoda: Consonants should not occur in a coda position.
  FinalC: Every word has to end in a consonant.
(5) /world spik/
  a. vorld-spük
  b. vor-l-spük
  c. vo-la-spük
  d. + vo-la-pük
  e. vo-la-pü
  f. o-la-pük
  g. volük

The forms (5a-c) show that deletion and vowel epenthesis are necessary in order to derive a well-formed output. (5f) shows that the application of deletion and epenthesis rules is restricted by the same constraints. (5g) demonstrates that `unnecessary' deletions are also blocked. By comparing (5d) to (5e), finally, we see how two well-formedness constraints interact.
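The evaluation of these candidates can be sketched as a comparison of violation profiles under strict ranking. The code below is a minimal sketch, not the tableau of the original. It rests on several assumptions of its own:

- hyphens mark syllable boundaries, and (5g) is syllabified as vo-lük;
- each constraint in (4) assigns one mark per offending syllable (one per word for FinalC);
- the fully faithful form (5a) stands in for the input, setting aside the segmental adaptation of w and i;
- faithfulness is split into a deletion constraint and an insertion constraint, here called max_io and dep_io;
- the three undominated constraints are listed in an arbitrary order, since the text fixes only FinalC >> NoCoda.

```python
VOWELS = set("aeiouäöü")


def is_c(ch):
    return ch not in VOWELS


def lcs(a, b):
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


# Well-formedness constraints from (4); each returns a count of marks.
def onset(inp, cand):       # syllables that begin with a vowel
    return sum(1 for s in cand.split("-") if not is_c(s[0]))

def nucleus(inp, cand):     # syllables without any vowel
    return sum(1 for s in cand.split("-") if all(is_c(ch) for ch in s))

def no_complex(inp, cand):  # syllables containing a CC sequence
    return sum(1 for s in cand.split("-")
               if any(is_c(a) and is_c(b) for a, b in zip(s, s[1:])))

def no_coda(inp, cand):     # syllables that end in a consonant
    return sum(1 for s in cand.split("-") if is_c(s[-1]))

def final_c(inp, cand):     # word does not end in a consonant
    return 0 if is_c(cand[-1]) else 1


# Faithfulness (assumed split): deleted and inserted segments.
def max_io(inp, cand):
    return len(inp) - lcs(inp, cand.replace("-", ""))

def dep_io(inp, cand):
    out = cand.replace("-", "")
    return len(out) - lcs(inp, out)


def optimal(inp, candidates, ranking):
    """Candidate whose violation profile is best under strict ranking."""
    return min(candidates, key=lambda c: tuple(con(inp, c) for con in ranking))


INPUT = "vorldspük"  # assumption: (5a) without its syllable boundary
CANDIDATES = ["vorld-spük", "vor-l-spük", "vo-la-spük", "vo-la-pük",
              "vo-la-pü", "o-la-pük", "vo-lük"]

VOLAPUK = [onset, nucleus, no_complex, final_c, no_coda, max_io, dep_io]
CODA_FIRST = [onset, nucleus, no_complex, no_coda, final_c, max_io, dep_io]
FAITH_FIRST = [max_io, dep_io, onset, nucleus, no_complex, final_c, no_coda]

print(optimal(INPUT, CANDIDATES, VOLAPUK))      # vo-la-pük
print(optimal(INPUT, CANDIDATES, CODA_FIRST))   # vo-la-pü
print(optimal(INPUT, CANDIDATES, FAITH_FIRST))  # vorld-spük
```

Swapping FinalC and NoCoda makes (5e) win instead of (5d), which is the interaction the comparison of those two forms is meant to show. Under these counting assumptions, (5d) beats (5g) only because deletion is ranked above insertion.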

In Esperanto, faithfulness constraints obviously play a more important role than they do in Volapük. The only constraint in the list in (4) that seems inviolable is Nucleus: all syllables in the language contain at least one vowel. Still, there are certain phonological restrictions on the form of the words that cannot be directly related to the restrictions that the source languages impose. Most prominent among these is the fact that the language does not allow long vowels or long consonants. If words with long consonants are borrowed, the offending segments get shortened, as can be seen in forms such as betulo `birch' (< Lat. betulla, It. betulla), peki `commit a sin' (< Lat. peccare, It. peccare).

For vowels it is somewhat harder to prove this point, because it is not very easy to find words in which all possible source languages have an undeniably long vowel. One instance is the word for `boat'. The morpheme for this concept is derived from the Germanic languages, in which it has a long vowel (Eng. boat, Germ. Boot, Du. boot). The Esperanto word is boato /bo-a-to/: the originally long vowel in this case has spread out over two syllables (where the pronunciation of the second syllable seems inspired by the orthography of the English word). And even if examples like these were not sufficiently convincing, we can still observe that there simply are no words which have a long vowel in the Esperanto lexicon.

All in all, it seems that we can establish the following ranking of constraints: NoLongSegments >> Faith. So at least here, we seem to have a case where the core of the Esperanto lexicon obeys constraints that are slightly stricter than those of the source language.

Another such point may be the admission of superheavy syllables. In Latin and the Germanic languages, words can contain a syllable with two consonants in the rhyme. Such syllables typically occur in the final position of the word. Because all Esperanto roots are followed by a vowel denoting the category membership, it should come as no surprise that most of these superheavy syllables disappear in the borrowed forms, e.g. German Bank (English bank, French banque) turned into ban-ko. We cannot draw any conclusions from this, because the -o is inserted for morphological, not for phonological reasons. Yet in certain words the final syllables are even longer than superheavy. In those cases we see the Esperanto forms showing a preference for another option.

An example of this is the Esperanto word for corps (German Korps, French corps, Russian korpus): this is not *korp-so but korpuso. The reason for the insertion of an /u/ rather than some other vowel is probably that the Latin word from which all the words in the source languages are derived is corpus. Yet this /u/ is not found in any of the other source languages of Esperanto, and furthermore most languages have another word derived from the same Latin source: corpse (French: corps, German: Körper, Russian: korpus). In order to avoid homonyms, another form is chosen, but again this is not *korpso; this time korpo is chosen. In both cases there is thus some phonotactic reason behind the choice from the source languages. The point however is that in cases like this a morpheme like *korpso is usually avoided: also in these cases, faithfulness seems outranked by well-formedness.

Importantly, exceptions can be found to both generalisations made in this section. In the first place, there is a small number of words that exceptionally contain a long consonant, such as getto `ghetto' and Finno `Finn'; the latter word contrasts with fino `end'. The number of these exceptions is rather small; the examples just mentioned seem to be the only frequent ones, and of these getto is also often rendered as geto. Under Itô and Mester's proposal, we could say that in these cases, faithfulness to the input is more important than the restriction against long consonants. Alternatively, we could suppose that these are the only words that have a long consonant underlyingly, and faithfulness is important throughout.
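The two analyses of these exceptions can be set side by side in a small sketch. It assumes a candidate set of just two forms (the underlying form and its degeminated counterpart), a faithfulness constraint that counts deleted segments only, and a hypothetical underlying form /betullo/ (the Latin stem with the Esperanto ending) for the reranking analysis. None of these choices comes from the sources cited; they only make the comparison concrete.

```python
import re


def degeminate(form):
    """Shorten every run of identical adjacent segments to one segment."""
    return re.sub(r"(.)\1+", r"\1", form)


def no_long(underlying, output):
    """NoLongSegments: one mark per pair of identical adjacent segments."""
    return sum(1 for a, b in zip(output, output[1:]) if a == b)


def faith(underlying, output):
    """Faith (simplified): one mark per underlying segment that is deleted."""
    return len(underlying) - len(output)


def surface(underlying, ranking):
    """Best of the two candidates under the given strict ranking."""
    candidates = [underlying, degeminate(underlying)]
    return min(candidates,
               key=lambda out: tuple(con(underlying, out) for con in ranking))


NATIVE = [no_long, faith]    # NoLongSegments >> Faith
FAITHFUL = [faith, no_long]  # Faith >> NoLongSegments

# Analysis 1 (reranking): the strata differ in their grammar.
print(surface("betullo", NATIVE))   # betulo
print(surface("getto", FAITHFUL))   # getto

# Analysis 2 (one grammar): only the lexical entries differ.
print(surface("betulo", FAITHFUL))  # betulo
print(surface("getto", FAITHFUL))   # getto
print(surface("geto", FAITHFUL))    # geto
```

In the second analysis the variant geto simply reflects an entry from which the long consonant has already been removed, so no second ranking is needed.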

The constraint against superheavy syllables has a few exceptions: interpunkcio `interpunction', arkta `arctic', planktono `plankton', and of course the same solution is available here. It should be noted that in this case most of the words are of a rather specialized and `learned' character, just as we would expect in the case of strata.

Similarly, in Van Oostendorp (1998) it has been observed that certain words tend to get shortened for reasons of representational economy. For instance, the word vibracio can shorten to vibro. The sequence -aci- can get analysed as a superfluous nominalisation marker, because the final -o already denotes the nominal category. Yet this type of shortening does not occur in all cases; the informants for Van Oostendorp (1998) reported for many of the pairs of longer or shorter forms that they surmised that the longer forms might have some specialized `learned' meaning. `Difficult' words thus get a `difficult' phonology. It is possible that here we see the signs of an emerging stratum, rather than a handful of isolated loanwords. Still, it is also possible to give the shortened form a `nativized' entry in the lexicon, so that deletion is no longer necessary.

A final case of exceptional syllable structure concerns the onset. There are certain restrictions on the sequence CiCj in (3); for instance, there should be a sufficiently steep sonority slope across the two consonants. The preferred complex onset seems to be one in which Ci is a stop and Cj a liquid. On closer inspection, however, we find a gap within this system of `most preferred complex onsets'. Although clusters like [pr] (problemo `problem'), [gr] (granda `big'), [kl] (klara `clear'), [bl] (blua `blue'), [tr] (trajno `train') and [dr] (droni `to drown') are quite common, the clusters *[dl] and *[tl] are absent from the language. This of course seems an inherited property of the source languages, and in these it may be due to the OCP: two coronal consonants cannot occur next to one another. In Waringhien et al. (1987) we find one counterexample to this: tlaspo, which means `thlaspi' and undeniably is a `learned' word.

3. Conclusion


In this squib I have examined evidence that it is useful to distinguish between `regular' and `irregular' shapes of words in Esperanto. Even in a language like this -- in which almost all words have an origin in other languages, and by definition no word has a history of more than one century -- it turns out to be possible to distinguish between `native words' and loanwords. There is not a lot of evidence yet that the loanwords are organised in separate strata. This does not mean that it is impossible that those strata will emerge: especially if the requirements on the native lexicon become stronger, certain phonological specialities may start clustering around larger groups of `learned' words. The Esperanto case is interesting because it shows that it is not so much the origin of the words that supports their phonologically deviant behaviour, but rather their specialised character. There seems to be more to the special status of words in the lexicon of a language than a history of language contact.

References


Bastien, Louis (1950) Naŭlingva etimologia leksikono [Etymological Lexicon in Nine Languages]. Leicester, The Esperanto Publishing Company Ltd.
Itô, J. & A. Mester (1995) `Japanese Phonology.' In: J. Goldsmith, ed., Handbook of Phonological Theory. Basil Blackwell, Oxford. pp. 817-838.
Oostendorp, M. van (1998) `Economy of Representation in the Esperanto Word.' Manuscript, University of Amsterdam.
Prince, A. & P. Smolensky (1993) `Optimality Theory: Constraint Interaction in Generative Grammar.' Manuscript, Rutgers University & University of Colorado, Boulder.
Waringhien, G. et al. (1987) Plena Ilustrita Vortaro de Esperanto. Paris, Sennacieca Asocio Tutmonda.