Samiksha Jaiswal (Editor)

Cyrillic script in Unicode


As of Unicode version 9.0, the Cyrillic script is encoded across several blocks, all in the Basic Multilingual Plane (BMP):

  • Cyrillic: U+0400–U+04FF, 256 characters
  • Cyrillic Supplement: U+0500–U+052F, 48 characters
  • Cyrillic Extended-A: U+2DE0–U+2DFF, 32 characters
  • Cyrillic Extended-B: U+A640–U+A69F, 96 characters
  • Cyrillic Extended-C: U+1C80–U+1C8F, 9 characters
  • Phonetic Extensions: U+1D2B, U+1D78, 2 Cyrillic characters
  • Combining Half Marks: U+FE2E–U+FE2F, 2 Cyrillic characters
    The characters in the range U+0400–U+045F are essentially the characters of ISO 8859-5 moved upward by 864 positions. The next characters in the Cyrillic block, in the range U+0460–U+0489, are historical letters, some of which are still used for Church Slavonic. The characters in the range U+048A–U+04FF and the entire Cyrillic Supplement block (U+0500–U+052F) are additional letters for various languages written in Cyrillic script. Two characters in the Phonetic Extensions block complete the Uralic Phonetic Alphabet: U+1D2B CYRILLIC LETTER SMALL CAPITAL EL and U+1D78 MODIFIER LETTER CYRILLIC EN.
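The 864-position offset can be checked against a real ISO 8859-5 decoder. The following is a minimal sketch that relies on Python's built-in `iso8859_5` codec; the names `OFFSET` and `iso8859_5_offsets` are illustrative and not part of any standard. It also shows why the correspondence is only approximate: a few byte values in the upper half of ISO 8859-5 are not Cyrillic letters and map elsewhere.

```python
# Offset between an ISO 8859-5 byte value and its Unicode code point
# (0x0401 - 0xA1 = 0x360 = 864).
OFFSET = 864


def iso8859_5_offsets():
    """Split bytes 0xA1-0xFF into those that follow the offset and those that do not."""
    shifted, exceptions = [], []
    for byte in range(0xA1, 0x100):
        ch = bytes([byte]).decode("iso8859_5")
        if ord(ch) == byte + OFFSET:
            shifted.append(byte)
        else:
            # Non-letter characters such as the soft hyphen land here.
            exceptions.append((byte, ch))
    return shifted, exceptions


shifted, exceptions = iso8859_5_offsets()
print(len(shifted), "byte values map to byte + 864")
for byte, ch in exceptions:
    print(f"0x{byte:02X} -> U+{ord(ch):04X}")
```

The exceptions are the soft hyphen, the numero sign, and the section sign, which Unicode encodes outside the Cyrillic block.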

    Unicode includes only a few precomposed accented Cyrillic letters; other accented letters can be formed by adding U+0301 COMBINING ACUTE ACCENT after the letter to be accented (e.g., ы́ э́ ю́ я́); see below.
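As a rough illustration of the difference between precomposed and combined forms, the sketch below appends U+0301 to a letter and applies NFC normalization with Python's standard `unicodedata` module. The helper name `add_stress` is hypothetical. The consonants г and к are included only because they are among the few Cyrillic letters with a precomposed acute form.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def add_stress(letter):
    """Append the combining acute accent, then compose if a precomposed form exists."""
    return unicodedata.normalize("NFC", letter + ACUTE)


for letter in "ыэюягк":
    stressed = add_stress(letter)
    # Vowels stay as two code points; г and к collapse to a single one.
    print(letter, [f"U+{ord(c):04X}" for c in stressed])
```

Because stressed vowels have no precomposed code points, text containing them keeps the accent as a separate combining character even after normalization, which matters when counting characters or comparing strings.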

    The following two diacritical marks not specific to Cyrillic can be used with Cyrillic text:

  • U+0301 ◌́ COMBINING ACUTE ACCENT (= Cyrillic stress mark), in Combining Diacritical Marks block U+0300–U+036F
  • U+20DD ◌⃝ COMBINING ENCLOSING CIRCLE (= Cyrillic ten thousands sign), in Combining Diacritical Marks for Symbols block U+20D0–U+20FF
    In the table below, small letters are ordered according to their Unicode numbers; capital letters are placed immediately before the corresponding small letters. Standard Unicode names and canonical decompositions are included.
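The ordering described here can be approximated programmatically. The following sketch, restricted by assumption to the main Cyrillic block, walks the small letters in code point order and pairs each with its capital, its Unicode name, and its canonical decomposition (an empty string when there is none). The function name `cyrillic_rows` is illustrative, and the results depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def cyrillic_rows(start=0x0400, end=0x04FF):
    """Return (capital, small, name, decomposition) rows ordered by the small letter."""
    rows = []
    for cp in range(start, end + 1):
        small = chr(cp)
        if unicodedata.category(small) != "Ll":
            continue  # skip capitals, marks, and symbols
        rows.append((
            small.upper(),                      # capital placed before the small letter
            small,
            unicodedata.name(small),
            unicodedata.decomposition(small),   # e.g. "0438 0306" for й
        ))
    return rows


for capital, small, name, decomposition in cyrillic_rows()[:12]:
    print(capital, small, name, decomposition or "-")
```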

    Blocks

    The Cyrillic block (U+0400–U+04FF) was added to the Unicode Standard in October 1991 with the release of version 1.0.

    The Cyrillic Supplement block (U+0500–U+052F) was added to the Unicode Standard in March 2002 with the release of version 3.2.

    The Cyrillic Extended-A (U+2DE0–U+2DFF) and Cyrillic Extended-B (U+A640–U+A69F) blocks were added to the Unicode Standard in April 2008 with the release of version 5.1.

    The Cyrillic Extended-C block (U+1C80–U+1C8F) was added to the Unicode Standard in June 2016 with the release of version 9.0.
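The block ranges and version numbers above lend themselves to a simple lookup table. The sketch below is one possible way to classify a character by Cyrillic block; the names `CYRILLIC_BLOCKS` and `cyrillic_block` are hypothetical, and the table covers only the five dedicated Cyrillic blocks, not the individual Cyrillic characters in Phonetic Extensions or Combining Half Marks.

```python
# (block name, first code point, last code point, Unicode version that added it)
CYRILLIC_BLOCKS = [
    ("Cyrillic",            0x0400, 0x04FF, "1.0"),
    ("Cyrillic Supplement", 0x0500, 0x052F, "3.2"),
    ("Cyrillic Extended-C", 0x1C80, 0x1C8F, "9.0"),
    ("Cyrillic Extended-A", 0x2DE0, 0x2DFF, "5.1"),
    ("Cyrillic Extended-B", 0xA640, 0xA69F, "5.1"),
]


def cyrillic_block(ch):
    """Return (name, version) of the Cyrillic block containing ch, or None."""
    cp = ord(ch)
    for name, first, last, version in CYRILLIC_BLOCKS:
        if first <= cp <= last:
            return name, version
    return None


for ch in ["Я", "\u0500", "\u1C80", "\u2DE0", "\uA640", "A"]:
    print(f"U+{ord(ch):04X}", cyrillic_block(ch))
```

A linear scan is adequate for five ranges; a sorted list with binary search would be the natural choice if the table were extended to all Unicode blocks.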
