Rahul Sharma (Editor)

Arabic script in Unicode


As of Unicode 9.0, the Arabic script is contained in the following blocks:

  • Arabic (0600–06FF, 255 characters)
  • Arabic Supplement (0750–077F, 48 characters)
  • Arabic Extended-A (08A0–08FF, 73 characters)
  • Arabic Presentation Forms-A (FB50–FDFF, 611 characters)
  • Arabic Presentation Forms-B (FE70–FEFF, 141 characters)
  • Rumi Numeral Symbols (10E60–10E7F, 31 characters)
  • Arabic Mathematical Alphabetic Symbols (1EE00–1EEFF, 143 characters)

    The basic Arabic range encodes the standard letters, the most common diacritics and the Arabic-Indic digits, but does not encode contextual forms; U+0621–U+0652 are based directly on ISO 8859-6. The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Extended-A range encodes additional Qur'anic annotations and letter variants used for various non-Arabic languages.

    The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics and further contextual letter forms. The presentation forms exist only for compatibility with older standards and are no longer needed for encoding text. The Arabic Mathematical Alphabetic Symbols block encodes characters used in Arabic mathematical expressions.
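
    The block ranges listed above can be turned into a simple lookup. The following is a minimal Python sketch, not an official API: the table `ARABIC_BLOCKS` and the helper `arabic_block` are hypothetical names introduced here, and the ranges are copied from the list above.

```python
# Block table copied from the list above (start, end, block name).
# ARABIC_BLOCKS and arabic_block are illustrative names, not a library API.
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0x08A0, 0x08FF, "Arabic Extended-A"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
    (0x10E60, 0x10E7F, "Rumi Numeral Symbols"),
    (0x1EE00, 0x1EEFF, "Arabic Mathematical Alphabetic Symbols"),
]


def arabic_block(ch):
    """Return the name of the Arabic-related block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


if __name__ == "__main__":
    for sample in ("\u0628", "\uFDF2", "\uFE91", "A"):
        print(f"U+{ord(sample):04X}", arabic_block(sample))
```

    A linear scan is enough for seven ranges; a sorted table with binary search would only matter for a much larger block list.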

    Contextual forms

    A demonstration for the basic alphabet used in Modern Standard Arabic:

    Punctuation and ornaments

    Only the Arabic question mark ⟨؟⟩ and the Arabic comma ⟨،⟩ are used in regular Arabic script typing, and the Arabic comma is often replaced by the Latin script comma (,).

  • U+060C ،‎ ARABIC COMMA
  • U+060D ؍‎ ARABIC DATE SEPARATOR
  • U+060E ؎‎ ARABIC POETIC VERSE SIGN
  • U+060F ؏‎ ARABIC SIGN MISRA
  • U+061F ؟‎ ARABIC QUESTION MARK
  • U+066D ٭ ARABIC FIVE POINTED STAR
  • U+06DD ۝‎ ARABIC END OF AYAH
  • U+06DE ۞‎ ARABIC START OF RUB EL HIZB
  • U+06E9 ۩ ARABIC PLACE OF SAJDAH
  • U+FD3E ﴾ ORNATE LEFT PARENTHESIS
  • U+FD3F ﴿ ORNATE RIGHT PARENTHESIS
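
    The code points in this list can be checked against the character database that ships with Python, and the comma substitution mentioned above can be reversed mechanically. The sketch below uses the standard `unicodedata` module; `describe` and `arabize_punctuation` are hypothetical helper names, and replacing every Latin comma and question mark is a simplifying assumption that would be wrong for mixed-script text.

```python
import unicodedata

# Code points taken from the punctuation and ornaments list above.
PUNCTUATION_CODE_POINTS = [
    0x060C, 0x060D, 0x060E, 0x060F, 0x061F, 0x066D,
    0x06DD, 0x06DE, 0x06E9, 0xFD3E, 0xFD3F,
]


def describe(cp):
    """Format a code point as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


# Map Latin punctuation to its Arabic counterpart (assumption: the whole
# string is Arabic-script text, so every comma and question mark is replaced).
LATIN_TO_ARABIC = str.maketrans({",": "\u060C", "?": "\u061F"})


def arabize_punctuation(text):
    """Replace the Latin comma and question mark with the Arabic ones."""
    return text.translate(LATIN_TO_ARABIC)


if __name__ == "__main__":
    for cp in PUNCTUATION_CODE_POINTS:
        print(describe(cp))
```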

    Word ligatures

    Arabic Presentation Forms-A has a few characters defined as "word ligatures" for terms frequently used in formulaic expressions in Arabic. They are rarely used outside professional liturgical typesetting; likewise, the rial is normally written out in full rather than with the ligature.

  • U+FDF0 ﷰ‎ ARABIC LIGATURE SALLA USED AS KORANIC STOP SIGN ISOLATED FORM (صلے)
  • U+FDF1 ﷱ‎ ARABIC LIGATURE QALA USED AS KORANIC STOP SIGN ISOLATED FORM (قلے)
  • U+FDF2 ﷲ‎ ARABIC LIGATURE ALLAH ISOLATED FORM (الله)
  • U+FDF3 ﷳ‎ ARABIC LIGATURE AKBAR ISOLATED FORM (اكبر), as in the phrase الله أكبر Allāhu akbar
  • U+FDF4 ﷴ‎ ARABIC LIGATURE MOHAMMAD ISOLATED FORM (محمد)
  • U+FDF5 ﷵ‎ ARABIC LIGATURE SALAM ISOLATED FORM (صلعم, the abbreviation for صلى الله عليه وسلم "peace be upon him")
  • U+FDF6 ﷶ‎ ARABIC LIGATURE RASOUL ISOLATED FORM (رسول)
  • U+FDF7 ﷷ‎ ARABIC LIGATURE ALAYHE ISOLATED FORM (عليه)
  • U+FDF8 ﷸ‎ ARABIC LIGATURE WASALLAM ISOLATED FORM (وسلم)
  • U+FDF9 ﷹ‎ ARABIC LIGATURE SALLA ISOLATED FORM (صلى)
  • U+FDFA ﷺ‎ ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM (صلى الله عليه وسلم "peace be upon him")
  • U+FDFB ﷻ‎ ARABIC LIGATURE JALLAJALALOUHOU (جل جلاله)
  • U+FDFC ﷼‎ RIAL SIGN (ريال)
  • U+FDFD ﷽‎ ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM (بسم الله الرحمن الرحيم bism-i llāh-i r-raḥmān-i r-raḥīm)
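
    Because these word ligatures are compatibility characters, most of them can be expanded back into ordinary letters with compatibility normalization. The sketch below uses `unicodedata.normalize` with the NFKC form from the Python standard library; `expand_ligature` is a hypothetical helper name. A character without a compatibility mapping is returned unchanged, so the result should not be assumed to differ from the input for every entry in the list.

```python
import unicodedata


def expand_ligature(text):
    """Expand compatibility ligatures into their underlying letters (NFKC)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for cp in (0xFDF2, 0xFDF4):
        expanded = expand_ligature(chr(cp))
        # Show the expansion as a sequence of code points.
        print(f"U+{cp:04X} ->", " ".join(f"U+{ord(c):04X}" for c in expanded))
```

    Normalizing in this way is useful before searching or comparing text, since a query typed with ordinary letters would otherwise fail to match the single-code-point ligature.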

    Arabic Presentation Forms A

    These are mostly ligatures that can be composed from characters in the preceding charts, with the exception of the bracket-like graphemes ﴾ ﴿. Some of them are ligatures of common liturgical phrases.

    Arabic Presentation Forms B

    These can all be composed from the characters of the basic Arabic chart.
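
    The relationship between a presentation form and its base letter is recorded in the Unicode character database as a tagged compatibility decomposition. The following sketch reads it with `unicodedata.decomposition` from the Python standard library; `contextual_form_info` is a hypothetical helper name, and it deliberately handles only single-letter forms, returning `None` for multi-letter ligatures and for characters with no such mapping.

```python
import unicodedata

# Tags used in the Unicode database for contextual (positional) forms.
FORM_TAGS = {"<isolated>", "<final>", "<initial>", "<medial>"}


def contextual_form_info(ch):
    """Return (form, base_letter) for a single-letter presentation form.

    Returns None when ch has no tagged single-letter decomposition.
    """
    parts = unicodedata.decomposition(ch).split()
    if len(parts) == 2 and parts[0] in FORM_TAGS:
        return parts[0].strip("<>"), chr(int(parts[1], 16))
    return None


if __name__ == "__main__":
    # The four presentation forms of the letter beh (U+0628).
    for cp in range(0xFE8F, 0xFE93):
        print(f"U+{cp:04X}", contextual_form_info(chr(cp)))
```

    In ordinary text only the base letter is stored, and the rendering engine selects the isolated, initial, medial or final shape from context.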

    References

    Arabic script in Unicode Wikipedia