This is a list of Unicode characters. The latest version of Unicode contains a repertoire of more than 128,000 characters covering 135 modern and historic scripts, as well as multiple symbol sets. Because it is not practical to list all of these characters on a single Wikipedia page, this list is limited to a subset of the characters most important to English-language readers, with links to other pages that list the supplementary characters. This page includes the 1,062 characters in the Multilingual European Character Set 2 (MES-2) subset, and some additional related characters.
Contents
- Control codes
- Latin script
- Basic Latin
- Latin-1 Supplement
- Latin Extended-A
- Latin Extended-B
- Latin Extended Additional
- Additional Latin Extended
- IPA Extensions
- Spacing modifier letters
- Phonetic Extensions
- Combining Diacritical Marks
- Greek and Coptic
- Greek Extended
- Cyrillic
- Cyrillic supplements
- Armenian
- Semitic languages
- Thaana
- N'Ko
- Brahmic (Indic) scripts
- Georgian
- Ethiopic
- Native American scripts
- Mongolian
- Buginese
- General Punctuation
- Supplemental Arrows
- Optical Character Recognition
- Dingbats
- Braille Patterns
- Miscellaneous Mathematical Symbols
- Supplemental Mathematical Operators
- Miscellaneous Symbols and Arrows
- Chinese, Japanese and Korean
- Musical symbols
- Emoji
- Alchemical symbols
- Game symbols
- References
Control codes
65 characters, including DEL (U+007F) but not SP (the space character, U+0020). All belong to the common script.
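The count can be checked against the Unicode Character Database bundled with Python. This is a minimal sketch, assuming that "control codes" here means the characters whose general category is Cc; the variable name `control_codes` is illustrative only.

```python
import sys
import unicodedata

# Collect every code point whose general category is Cc (control).
control_codes = [
    cp for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Cc"
]

print(len(control_codes))     # 65
print(0x7F in control_codes)  # True: DEL is a control code
print(0x20 in control_codes)  # False: SP has category Zs, not Cc
```

The 65 code points are U+0000 to U+001F, U+007F, and U+0080 to U+009F.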
Latin script
The Unicode Standard (version 7.0) classifies 1,338 characters as belonging to the Latin script.
Basic Latin
95 characters; the 52 alphabet characters belong to the Latin script. The remaining 43 belong to the common script.
The 33 characters classified as ASCII Punctuation & Symbols are also sometimes referred to as ASCII special characters. See § Latin-1 Supplement and § Unicode symbols for additional "special characters".
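The arithmetic behind these figures can be reproduced with a short sketch. It assumes the 95 characters are the graphic range U+0020 to U+007E, and that the 33 punctuation and symbol characters are everything that is neither a letter nor a digit, space included; the variable names are illustrative.

```python
# Basic Latin graphic characters: U+0020 (SP) through U+007E (~).
basic_latin = [chr(cp) for cp in range(0x20, 0x7F)]

letters = [c for c in basic_latin if c.isalpha()]
digits = [c for c in basic_latin if c.isdigit()]
# Everything that is neither a letter nor a digit, including the space.
punctuation_and_symbols = [c for c in basic_latin if not c.isalnum()]

print(len(basic_latin), len(letters), len(digits), len(punctuation_and_symbols))
# 95 52 10 33
```

The 43 common-script characters are therefore the 10 digits plus the 33 punctuation and symbol characters.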
Latin-1 Supplement
96 characters; the 62 letters and the two ordinal indicators belong to the Latin script. The remaining 32 belong to the common script.
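Python's `unicodedata` module does not expose the Script property, so the following sketch uses character names as a stand-in: it assumes a letter is any character whose name starts with "LATIN ", which happens to hold for this block but is a heuristic and not a general script test.

```python
import unicodedata

# Latin-1 Supplement graphic characters: U+00A0 through U+00FF.
latin1_supplement = [chr(cp) for cp in range(0xA0, 0x100)]

# Heuristic: names such as "LATIN CAPITAL LETTER A WITH GRAVE".
letters = [c for c in latin1_supplement
           if unicodedata.name(c).startswith("LATIN ")]
# FEMININE ORDINAL INDICATOR and MASCULINE ORDINAL INDICATOR.
ordinals = [c for c in latin1_supplement
            if "ORDINAL INDICATOR" in unicodedata.name(c)]
others = len(latin1_supplement) - len(letters) - len(ordinals)

print(len(latin1_supplement), len(letters), len(ordinals), others)
# 96 62 2 32
```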
Latin Extended-A
128 characters; all belong to the Latin script.
Latin Extended-B
208 characters; all belong to the Latin script; 33 in the MES-2 subset.
Latin Extended Additional
256 characters; all belong to the Latin script; 23 in the MES-2 subset. For the rest, see Latin Extended Additional.
Additional Latin Extended
IPA Extensions
96 characters; all belong to the Latin script; three in the MES-2 subset. For the rest, see IPA Extensions.
Spacing modifier letters
80 characters; 15 in the MES-2 subset.
Phonetic Extensions
Combining Diacritical Marks
Greek and Coptic
144 code points; 135 assigned characters; 85 in the MES-2 subset.
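The difference between code points and assigned characters can be illustrated with a sketch that treats general category Cn as "unassigned". It assumes the block spans U+0370 to U+03FF; the assigned count it reports depends on the Unicode version bundled with the Python interpreter, so no fixed figure is shown for it.

```python
import unicodedata

# Greek and Coptic block: U+0370 through U+03FF (144 code points).
GREEK_AND_COPTIC = range(0x0370, 0x0400)

assigned = [cp for cp in GREEK_AND_COPTIC
            if unicodedata.category(chr(cp)) != "Cn"]
unassigned = [cp for cp in GREEK_AND_COPTIC
              if unicodedata.category(chr(cp)) == "Cn"]

print(len(GREEK_AND_COPTIC))               # 144
print(len(assigned))                       # depends on the Unicode version
print([f"U+{cp:04X}" for cp in unassigned])
```

The same approach applies to any other block whose code point and assigned character counts differ.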
Greek Extended
For polytonic orthography. 256 code points; 233 assigned characters, all in the MES-2 subset (#670 – 902).
Cyrillic
256 characters; 191 in the MES-2 subset.
Cyrillic supplements
Armenian
Semitic languages
Thaana
N'Ko
Brahmic (Indic) scripts
The range from U+0900 to U+0DFF includes Devanagari, Bengali script, Gurmukhi, Gujarati script, Oriya script, Tamil script, Telugu script, Kannada script, Malayalam script, and the Sinhala alphabet.
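Because the ten blocks in this range each occupy 128 code points, a character's block can be found with integer division. The sketch below is illustrative: the function `indic_block` and the list `BLOCKS` are hypothetical names, and the list uses the Unicode block names in code point order.

```python
# Ten consecutive 128-code-point blocks starting at U+0900.
BLOCKS = [
    "Devanagari", "Bengali", "Gurmukhi", "Gujarati", "Oriya",
    "Tamil", "Telugu", "Kannada", "Malayalam", "Sinhala",
]
BASE = 0x0900
LAST = 0x0DFF
BLOCK_SIZE = 0x80  # 128 code points per block


def indic_block(ch):
    """Return the block name for ch, or None if it lies outside the range."""
    cp = ord(ch)
    if not BASE <= cp <= LAST:
        return None
    return BLOCKS[(cp - BASE) // BLOCK_SIZE]


print(indic_block("\u0905"))  # Devanagari (DEVANAGARI LETTER A)
print(indic_block("\u0ba4"))  # Tamil (TAMIL LETTER TA)
print(indic_block("A"))       # None
```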
Other Brahmic and Indic scripts in Unicode include:
Georgian
Ethiopic
Native American scripts
Mongolian
Buginese
General Punctuation
112 code points; 111 assigned characters; 24 in the MES-2 subset.