Puneet Varma (Editor)

Persian phonology


The Persian language has six vowel phonemes and twenty-three consonant phonemes. It features contrastive stress and syllable-final consonant clusters.



/e/ is pronounced between the vowel of bate (for most English dialects) and the vowel of bet; /o/ is pronounced between the vowel of boat (for most English dialects) and the vowel of raw.

Word-final /o/ is rare except for تو ту /to/ ('you' [singular]), loanwords (mostly of Arabic origin), and proper and common nouns of foreign origin, and word-final /æ/ is very rare in Iranian Persian, an exception being نه на /næ/ ('no'). The word-final /æ/ in Early New Persian mostly shifted to /e/ in contemporary Iranian Persian (often romanized as ⟨eh⟩, meaning [e] is also an allophone of /æ/ in word-final position in contemporary Iranian Persian), but is preserved in the Eastern dialects.

The chart to the right reflects the vowels of many educated Persian speakers from Tehran.

The three vowels /æ/, /e/ and /o/ are traditionally referred to as 'short' vowels and the other three (/ɒː/, /iː/ and /uː/) as 'long' vowels. In fact the three 'short' vowels are short only when in an open syllable (i.e. a syllable ending in a vowel) that is non-final (but can be unstressed or stressed), e.g. صدا садо [seˈdɒː] 'sound', خدا Худо [xoˈdɒː] 'God'. In a closed syllable (i.e. a syllable ending in a consonant) that is unstressed, they are around sixty percent as long as a long vowel; this is true for the 'long' vowel /iː/ as well. Otherwise the 'short' and 'long' vowels are all pronounced long. Example: سفتر сафтар [seˑfˈtʰæːɾ] 'firmer'.
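The length rule described above can be written out as a small decision function. This is only a sketch: the function name `vowel_length`, its boolean parameters, and the labels "short", "half-long" and "long" are assumptions introduced for illustration (the text gives the middle value as roughly sixty percent of a long vowel).

```python
SHORT_VOWELS = {"æ", "e", "o"}      # traditionally called 'short'
LONG_VOWELS = {"ɒː", "iː", "uː"}    # traditionally called 'long'


def vowel_length(vowel, *, open_syllable, final, stressed):
    """Return an illustrative length label for a vowel in its syllable.

    open_syllable: the syllable ends in a vowel
    final:         the syllable is the last one in the word
    stressed:      the syllable carries the word accent
    """
    if vowel not in SHORT_VOWELS | LONG_VOWELS:
        raise ValueError(f"not one of the six vowel phonemes: {vowel!r}")
    # 'Short' vowels are short only in open, non-final syllables,
    # whether stressed or not.
    if vowel in SHORT_VOWELS and open_syllable and not final:
        return "short"
    # In closed unstressed syllables the 'short' vowels, and also /iː/,
    # are about sixty percent as long as a long vowel.
    if not open_syllable and not stressed and (vowel in SHORT_VOWELS or vowel == "iː"):
        return "half-long"
    # Everything else is pronounced long.
    return "long"
```

Applied to [seˑfˈtʰæːɾ], the first vowel (closed, unstressed syllable) comes out as "half-long" and the second (closed, stressed) as "long", matching the transcription.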

When the short vowels are in open syllables, they are also unstable and tend in informal styles to assimilate in quality to the following long vowel. Thus دویست дувест [deˈviːst] 'two hundred' becomes [diˈviːst], شلوغ шулуғ [ʃoˈluːɢ] 'crowded' becomes [ʃuˈluːɢ], رسیدن расидан [ræsiːˈdæːn] 'to arrive' becomes [resiːˈdæːn] and so on.


The status of diphthongs in Persian is disputed. Some authors list ei̯, ou̯, āi̯, oi̯ and ui̯; others list only two, ei̯ and ou̯; and some do not recognize diphthongs in Persian at all. A major complicating factor is the change of two classical and pre-classical Persian diphthongs: ai̯ > ei̯, au̯ > ou̯. This shift occurred in Iran but not in some modern varieties (particularly those of Afghanistan). Morphological analysis also supports the view that the alleged Persian diphthongs are combinations of the vowels with /j/ and /w/.

The Persian orthography does not distinguish between the diphthongs and the consonants /j/ and /w/; both are written with ی and و respectively.

/ow/ becomes [oː] in colloquial Tehrani dialect but is preserved in other Western dialects and standard Iranian Persian.

Spelling and example words

For Western Persian:

The variety of Afghanistan has also preserved these two Classical Persian vowels:

In the modern Persian alphabet, the short vowels /e/, /o/, /æ/ are usually not written, as is also normal in the Arabic alphabet. (See Arabic phonology § Vowels.)

Historical shifts

Early New Persian inherited eight vowels from Middle Persian: three short i, a, u and five long ī, ē, ā, ō, ū (in IPA: /i a u/ and /iː eː aː oː uː/). It is likely that in the common Persian era this system passed from a purely quantitative one into one where the short vowels also differed from their long counterparts in quality: i > /ɪ/; u > /ʊ/; ā > /ɑː/. These quality contrasts have in modern Persian varieties become the main distinction between the two sets of vowels.

The inherited eight-vowel inventory is retained without major upheaval in Dari, the only systematic innovation being the lowering of the lax close vowels i and u to the mid vowels /e/ and /o/.

In Western Persian, two of the vowel contrasts have been lost: those between the tense mid and close vowels. Thus ē, ī have merged as /iː/, while ō, ū have merged as /uː/. In addition, similarly to Dari, the lax close vowels have become mid: i > /e/, u > /o/. The lax open vowel has become fronted: a > /æ/, and in word-final position further raised to /e/.

In both varieties ā is more or less labialized.

Tajiki has also lost two of the vowel contrasts, but differently from Western Persian: here the tense/lax contrast among the close vowels has been eliminated. That is, i, ī have merged as /i/, and u, ū have merged as /u/. The other tense back vowels have shifted as well. Mid ō has become more front: /ɵ/ or /ʉ/, a vowel usually romanized as ů. Open ā has become a mid, labial vowel /o/.

Loanwords from Arabic generally undergo these shifts as well.

The following chart summarizes the later shifts into modern Tajik, Dari, and Western Persian.
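The shifts described in this section can also be tabulated in code, which makes it easy to check the claim that Western Persian and Tajik each lost two contrasts while Dari kept all eight. This is a hedged sketch: the names `SHIFTS`, `VARIETIES` and `inventory` are made up for illustration, the Tajik reflexes of ē and a are not stated in the text and are filled in as assumptions, the Dari value of ā is simply taken as /ɑː/, and the word-final raising of a to /e/ in Western Persian is not modeled.

```python
VARIETIES = ("Western Persian", "Dari", "Tajik")

# Early New Persian vowel -> reflex in each variety, in VARIETIES order.
SHIFTS = {
    "i": ("e",  "e",  "i"),
    "ī": ("iː", "iː", "i"),
    "ē": ("iː", "eː", "e"),   # Tajik value is an assumption
    "a": ("æ",  "a",  "a"),   # Tajik value is an assumption
    "ā": ("ɒː", "ɑː", "o"),
    "u": ("o",  "o",  "u"),
    "ū": ("uː", "uː", "u"),
    "ō": ("uː", "oː", "ɵ"),
}


def inventory(variety):
    """Return the set of distinct modern vowels for one variety."""
    column = VARIETIES.index(variety)
    return {reflexes[column] for reflexes in SHIFTS.values()}


if __name__ == "__main__":
    for name in VARIETIES:
        vowels = inventory(name)
        print(f"{name}: {len(vowels)} vowels: {' '.join(sorted(vowels))}")
```

Counting the distinct values per column gives six vowels for Western Persian, eight for Dari and six for Tajik, in line with the description above.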

Allophonic variation

Alveolar stops /t/ and /d/ are either apical alveolar or laminal denti-alveolar. The voiceless obstruents /p, t, tʃ, k/ are aspirated much like their English counterparts: they become aspirated when they begin a syllable, though aspiration is not contrastive. The Persian language does not have syllable-initial consonant clusters (see below), so unlike in English, /p, t, k/ are aspirated even following /s/, as in هستم ҳастам /hæstæm/ ('I exist'). They are also aspirated at the end of syllables, although not as strongly.

The velar stops /k, ɡ/ are palatalized before front vowels or at the end of a syllable.

In Classical Persian, غ ғ and ق қ denoted the original Arabic phonemes, the voiced uvular fricative [ʁ] and the voiceless uvular stop [q], respectively. In modern Tehrani Persian (which is used in the Iranian mass media, both colloquial and standard), there is no difference in the pronunciation of غ and ق, and they are both normally pronounced as a voiced uvular stop [ɢ]. The classic pronunciations of غ ғ and ق қ are preserved in the eastern varieties, Dari and Tajiki, as well as in the southern varieties (e.g. Zoroastrian Dari language and other Central / Central Plateau or Kermanic languages).

The alveolar flap /ɾ/ has a trilled allophonic variant [r] at the beginning of a word, as in Spanish, Catalan, and other Romance languages in Spain (it can be a free variation between a trill [r] and a flap [ɾ]); the trill [r] as a separate phoneme occurs word-medially especially in loanwords of Arabic origin as a result of gemination of [ɾ]. An alveolar approximant [ɹ] also occurs as an allophone of /ɾ/ before /t, d, s, z, ʃ, l/, and /ʒ/; [ɹ] is sometimes in free variation with [ɾ] in these and other positions, such that فارسی Форсӣ ('Persian') is pronounced [fɒːɹˈsiː] or [fɒːɾˈsiː] and سقرلات сақирлот ('scarlet') becomes [sæɣeˑɹˈlɒːt] or [sæɣeˑɾˈlɒːt]. /r/ is sometimes realized as a long approximant [ɹː].

The velar nasal [ŋ] is an allophone of /n/ before /k, ɡ/.

/f, k, s, ʃ, x/ may be voiced to, respectively, [v, ɡ, z, ʒ, ɣ] before voiced consonants; /n/ may be bilabial [m] before bilabial consonants. Also /b/ may in some cases change into [β], or even [v]; for example باز боз ('open') may be pronounced [bɒːz] as well as [vɒːz] or [vɒː], colloquially.
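The two assimilations of /n/ mentioned above can be sketched as a rewrite over a list of phoneme symbols. The function name `realize_nasals`, the `bilabial_assimilation` switch, and the choice of /p, b, m/ as the bilabial set are assumptions made for this illustration; the bilabial change is optional in the text ("may be"), so it can be turned off.

```python
VELARS = {"k", "g", "ɡ"}        # accept both ASCII g and IPA ɡ
BILABIALS = {"p", "b", "m"}     # assumed set of bilabial consonants


def realize_nasals(phonemes, *, bilabial_assimilation=True):
    """Return a surface form in which /n/ takes the place of a following stop."""
    surface = list(phonemes)
    for i in range(len(phonemes) - 1):
        if phonemes[i] != "n":
            continue
        following = phonemes[i + 1]
        if following in VELARS:
            surface[i] = "ŋ"    # [ŋ] before /k, ɡ/
        elif bilabial_assimilation and following in BILABIALS:
            surface[i] = "m"    # optional [m] before bilabials
    return surface
```

For example, the illustrative input `["ɾ", "æ", "n", "ɡ"]` comes back with `"ŋ"` in third position, while `["ʃ", "æ", "n", "b", "e"]` comes back with `"m"` unless `bilabial_assimilation=False` is passed.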

Dialectal variation

The pronunciation of و в [w] in Classical Persian shifted to [v] in Iranian Persian, but is retained in Dari (Afghan Persian). In modern Persian [w] is lost if it is preceded by a consonant and followed by a vowel within the same syllable, e.g. خواب хоб /x(w)ɒb/ 'sleep', as Persian has no syllable-initial consonant clusters (see below).

Spelling and example words

Consonants can be geminated, often in words from Arabic. This is represented in the IPA either by doubling the consonant, سیّد саййид [sejjed], or with the length marker ⟨ː⟩, [sejːed].

Syllable structure

Syllables may be structured as (C)(S)V(S)(C(C)).

Persian syllable structure consists of an optional syllable onset, consisting of one consonant; an obligatory syllable nucleus, consisting of a vowel optionally preceded by and/or followed by a semivowel; and an optional syllable coda, consisting of one or two consonants. The following restrictions apply:

  • Onset
    • Consonant (C): Can be any consonant. (The onset consists of only one consonant; consonant clusters are found only in loanwords, and sometimes an epenthetic /æ/ is inserted between the consonants.)
  • Nucleus
    • Semivowel (S)
    • Vowel (V)
    • Semivowel (S)
  • Coda
    • First consonant (C): Can be any consonant.
    • Second consonant (C): Can also be any consonant (mostly /d/, /k/, /s/, /t/, and /z/).
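The template (C)(S)V(S)(C(C)) translates directly into a regular expression over the class symbols C, S and V. The sketch below is illustrative only: the names `syllable_shape` and `is_valid_syllable` are invented here, vowels are accepted with or without the length mark, and any symbol that is not a listed vowel or semivowel is simply treated as a consonant.

```python
import re

VOWELS = {"æ", "e", "o", "ɒː", "iː", "uː", "ɒ", "i", "u"}
SEMIVOWELS = {"j", "w"}

# (C)(S)V(S)(C(C)): optional onset, nucleus with optional semivowels,
# and a coda of at most two consonants.
TEMPLATE = re.compile(r"C?S?VS?C{0,2}")


def syllable_shape(phonemes):
    """Map a list of phoneme symbols to a string of C, S and V."""
    classes = []
    for phoneme in phonemes:
        if phoneme in VOWELS:
            classes.append("V")
        elif phoneme in SEMIVOWELS:
            classes.append("S")
        else:
            classes.append("C")   # anything else counts as a consonant
    return "".join(classes)


def is_valid_syllable(phonemes):
    """Check a single syllable against the template."""
    return TEMPLATE.fullmatch(syllable_shape(phonemes)) is not None
```

A syllable such as `["d", "uː", "s", "t"]` (shape CVCC) is accepted, while `["s", "t", "o", "p"]` (shape CCVC) is rejected because the onset may hold only one consonant.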
Word accent

    The Persian word-accent has been described as a stress accent by some, and as a pitch accent by others. In fact the accented syllables in Persian are generally pronounced with a raised pitch as well as stress; but in certain contexts words may become deaccented and lose their high pitch.

    From an intonational point of view, Persian words (or accentual phrases) usually have the intonation (L +) H* (where L is low and H* is a high-toned stressed syllable), e.g. کتاب китоб /keˈtɒ́b/ 'book'; unless there is a suffix, in which case the intonation is (L +) H* + L, e.g. کتابم китобам /keˈtɒ́b-æm/ 'my book'. The last accent of a sentence is usually accompanied by a low boundary tone, which produces a falling pitch on the last accented syllable, e.g. کتاب بود китоб буд /keˈtɒ̂b buːd/ 'it was a book'.

    When two words are joined in an اضافه изофа ezafe construction, they can either be pronounced accentually as two separate words, e.g. مردم اینجا мардуми инҷо /mærˈdóm-e inˈd͡ʒɒ́/ 'the people (of) here', or else the first word loses its high tone and the two words are pronounced as a single accentual phrase: /mærˈdom-e inˈd͡ʒɒ́/. Words also become deaccented following a focused word; for example, in the sentence نامۀ مامانم بود رو میز Номии момонам буд ру миз /nɒˈme-ye mɒˈmɒn-æm bud ru miz/ 'it was my mom's letter on the table' all the syllables following the word مامان момон /mɒˈmɒn/ 'mom' are pronounced with a low pitch.

    Knowing the rules for the correct placement of the accent is essential for proper pronunciation.

    1. Accent is heard on the last stem-syllable of most words.
    2. Accent is heard on the first syllable of interjections, conjunctions and vocatives. E.g. بله бали /ˈbæle/ ('yes'), نخیر нахайр /ˈnæxeir/ ('no, indeed'), ولی валӣ /ˈvæli/ ('but'), چرا чиро /ˈtʃeɾɒ/ ('why'), اگر агар /ˈæɡæɾ/ ('if'), مرسی мерси /ˈmeɾsi/ ('thanks'), خانم хонум /ˈxɒnom/ ('Ma'am'), آقا оқо /ˈɒɢɒ/ ('Sir'); cf. 4-4 below.
    3. Never accented are:
      1. personal suffixes on verbs (/-æm/ ('I do..'), /-i/ ('you do..'), .., /-ænd/ ('they do..')), with one exception (cf. 4-1 below);
      2. a small set of very common noun enclitics: the /ezɒfe/ اضافه изофа (/-e/, /-je/) ('of'), /-ɾɒ/ (a direct object marker), /-i/ ('a'), /-o/ ('and');
      3. the possessive and pronoun-object suffixes, /-æm/, /-et/, /-eʃ/, &c.
    4. Always accented are:
      1. the personal suffixes on the positive future auxiliary verb (the single exception to 3-1 above);
      2. the negative verb prefix /næ-/, /ne-/, if present;
      3. if /næ-/, /ne-/ is not present, then the first non-negative verb prefix (e.g. /mi-/ ('-ing'), /be-/ ('do!')) or the prefix noun in compound verbs (e.g. کار кор /kɒr/ in کار می‌کردم кор мекардам /ˈkɒr mi-kærdæm/);
      4. the last syllable of all other words, including the infinitive ending /-æn/ and the participial ending /-te/, /-de/ in verbal derivatives, noun suffixes like /-i/ ('-ish') and /-eɡi/, all plural suffixes (/-hɒ/, /-ɒn/), adjective comparative suffixes (/-tæɾ/, /-tæɾin/), and ordinal-number suffixes (/-om/). Nouns not in the vocative are stressed on the final syllable: خانم хонум /xɒˈnom/ ('lady'), آقا оқо /ɒˈɢɒ/ ('gentleman'); cf. 2 above.
    5. In the informal language, the present perfect tense is pronounced like the simple past tense. Only the word-accent distinguishes between these tenses: the accented personal suffix indicates the present perfect and the unstressed one the simple past tense.
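Rules 1 to 4 above can be condensed into a rough lookup for single words. This is a simplified sketch, not a full account: the function `accent_index` and its parameters are hypothetical, the caller must already know the word's morphology, and the accented suffixes of the future auxiliary (rule 4-1) and two-word compound verbs are not handled.

```python
def accent_index(syllables, *, accent_first=False, unaccented_tail=0):
    """Return the 0-based index of the accented syllable of one word.

    accent_first:    True for interjections, conjunctions and vocatives
                     (rule 2) and for verb forms that begin with an
                     accented prefix such as /næ-/ or /mi-/ (rules 4-2, 4-3).
    unaccented_tail: number of word-final syllables formed by
                     never-accented suffixes or enclitics (rule 3).
    """
    if not syllables:
        raise ValueError("a word needs at least one syllable")
    if accent_first:
        return 0
    if not 0 <= unaccented_tail < len(syllables):
        raise ValueError("unaccented_tail must leave at least one syllable")
    # Default (rules 1 and 4-4): the last syllable that is not part of
    # an unaccented suffix or enclitic.
    return len(syllables) - 1 - unaccented_tail
```

With the examples from the text, `accent_index(["xɒ", "nom"])` gives 1 for /xɒˈnom/ 'lady', `accent_index(["xɒ", "nom"], accent_first=True)` gives 0 for the vocative /ˈxɒnom/, and `accent_index(["ke", "tɒ", "bæm"], unaccented_tail=1)` gives 1 for /keˈtɒb-æm/ 'my book'.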

    Colloquial Iranian Persian

    When spoken formally, Iranian Persian is pronounced as written, but colloquial pronunciation, as used by all classes, makes a number of very common substitutions, and Iranians can switch between colloquial and formal sociolects in conversational speech. The substitutions include:

  • In the Tehrani accent and also most of the accents in Central and Southern Iran, the sequence /ɒn/ in the colloquial language is nearly always pronounced [un]. The only common exceptions are high prestige words, such as قرآن Қуръон [ɢoɾʔɒn] ('Qur'an'), and ایران Эрон [ʔiˈɾɒn] ('Iran'), and foreign nouns (both common and proper), like the Spanish surname بلتران Beltran [belˈtɾɒn], which are pronounced as written. A few words written as /ɒm/ are pronounced [um], especially forms of the verb آمدن /ɒmædæn/ омадан ('to come').
  • In the Tehrani accent, the unstressed direct object suffix marker را ро /ɾɒ/ is pronounced /ɾo/ after a vowel, and /o/ after a consonant.
  • The stems of many verbs have a short colloquial form, especially است аст /æst/ ('he/she is'), which is colloquially shortened to /e/ after a consonant or /s/ after a vowel.
  • The 2nd and 3rd person plural verb subject suffixes, written /-id/ and /-ænd/ respectively, are pronounced [-in] and [-æn].
  • Many frequently occurring verbs are shortened, such as می‌خواهم мехоҳам /mixɒːhæm/ ('I want') → [mixɒːm], and می‌روم меравам /miɾævæm/ ('I go') → [miɾæm].
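Two of the substitutions listed above are regular enough to sketch as string rewrites. The function name `colloquial`, the `verb_form` flag and the `KEEP_AS_WRITTEN` set are assumptions for illustration; inputs are taken to be broad transcriptions without stress or length marks, and the lexically specific changes (shortened verb stems, /ɒm/ → [um], the forms of را and است) are left out.

```python
import re

# High-prestige words the text lists as keeping [ɒn]; the real list of
# exceptions is open-ended (foreign nouns are also pronounced as written).
KEEP_AS_WRITTEN = {"ɢoɾʔɒn", "ʔiɾɒn"}


def colloquial(word, *, verb_form=False):
    """Return a rough colloquial Tehrani rendering of a formal transcription."""
    if word in KEEP_AS_WRITTEN:
        return word
    # /ɒn/ is nearly always pronounced [un] in colloquial speech.
    result = word.replace("ɒn", "un")
    if verb_form:
        # Plural subject suffixes: /-id/ -> [-in], /-ænd/ -> [-æn].
        result = re.sub(r"id$", "in", result)
        result = re.sub(r"ænd$", "æn", result)
    return result
```

As a usage example with illustrative inputs, `colloquial("nɒn")` returns `"nun"`, `colloquial("ʔiɾɒn")` is returned unchanged, and `colloquial("mikonænd", verb_form=True)` returns `"mikonæn"`.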