 

Arabic alphabet


 

The Arabic alphabet is the script used for writing the Arabic language. Because the Qur'an, the holy book of Islam, is written with this alphabet, its influence spread with that of Islam and it has been, and still is, used to write many other languages from families unrelated to the Semitic languages, such as Persian and Urdu. (See fuller list below.)

Presentation of the alphabet

The following table lists the Unicode characters used to write Arabic itself, omitting the supplementary letters used for other languages. Some of these forms may not display correctly, depending on browser and font support. The table also shows some of the many Latin-alphabet characters that have been used to transliterate them. There are at least half a dozen standards for transliterating Arabic characters; the methods have proliferated because they serve conflicting goals. See the article Arabic transliteration for more on this topic and the different transliteration methods.


~ ~ ~ ~ ~ ~ ~ ~ ~ ~

To complicate the entire question still further, there are regional differences in the way Arabic speakers pronounce the various letters, even when speaking the standard, literary language (Fusha). This chart only attempts to set forth the "standard" pronunciation as taught in universities. The phonetic equivalents are given in the Continental version of the International Phonetic Alphabet. For more details concerning the pronunciation of Arabic, consult the article Arabic phonology.


~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Primary letters

Letters lacking an initial or medial version are never tied to the following letter, even within a word. Hamza has only a single form, since it is never tied to a preceding or following letter. However, it is sometimes 'seated' on a waw, ya or alif, and in that case the seat behaves like an ordinary waw, ya or alif.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Technically, hamza is not a letter, but a diacritic.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Other characters

The following are not actual letters, but rather different orthographical shapes for letters, and in the case of the lām ʼalif, a ligature.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Notes

The ʼalif maqṣūra, usually encoded as U+0649 (ى) in Arabic, is sometimes replaced in Persian or Urdu by U+06CC (ی), called "Farsi Yeh", which suits its pronunciation in those languages. The glyphs are identical in isolated and final form (ﻯ ﻰ), but not in initial and medial form, in which the Farsi Yeh gains two dots below (ﯾ ﯿ), while the ʼalif maqṣūra has neither an initial nor a medial form.
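The distinction between the two code points can be checked with Python's standard `unicodedata` module. This is only a small sketch: the helper name `base_letter` is invented for illustration, and it relies on the compatibility (NFKC) mappings that Unicode defines for the presentation forms quoted above.

```python
import unicodedata

ALEF_MAKSURA = "\u0649"  # ى, used in Arabic
FARSI_YEH = "\u06CC"     # ی, used in Persian and Urdu

# The two letters look alike in isolation but are distinct code points.
for ch in (ALEF_MAKSURA, FARSI_YEH):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))


def base_letter(form: str) -> str:
    """Fold a presentation-form code point back to its base letter (NFKC)."""
    return unicodedata.normalize("NFKC", form)


# Isolated alef maksura (U+FEEF) and initial Farsi yeh (U+FBFE),
# two of the presentation forms shown in the text.
print(f"U+{ord(base_letter(chr(0xFEEF))):04X}")
print(f"U+{ord(base_letter(chr(0xFBFE))):04X}")
```

Text-processing code that compares Arabic and Persian strings often has to decide whether to treat these two letters as equivalent; the sketch only shows that Unicode itself keeps them apart.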

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Writing the hamza

Initially, the letter ʼalif indicated an occlusive glottal, or glottal stop, transcribed [ʔ], confirming the alphabet's Phoenician origin. Now it is used in the same manner as in other abjads, with yāʼ and wāw, as a mater lectionis, that is to say, a consonant standing in for a long vowel (see below). In fact, over the course of time its original consonantal value has been obscured, since ʼalif now serves either as a long vowel or as graphic support for certain diacritics (madda or hamza).


~ ~ ~ ~ ~ ~ ~ ~ ~ ~

The Arabic alphabet now uses the hamza to indicate a glottal stop, which can appear anywhere in a word. This letter, however, does not function like the others: it can be written alone or on a support, in which case it becomes a diacritic:

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

  • alone: ء ;
  • with a support: أ, إ (above and below an ʼalif), ؤ (above a wāw), ئ (above a yāʼ without points, or yāʼ hamza).

    The details of writing the hamza are discussed below, after those of the vowels and syllable-division marks, because their functions are related.
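The seat-plus-diacritic behaviour described above is mirrored in Unicode's canonical decompositions, and the following sketch uses Python's standard `unicodedata` module to split each seated hamza into its support and its combining mark. The helper name `split_seat` and the labels in `HAMZA_FORMS` are illustrative only.

```python
import unicodedata

# The five hamza forms listed above, with informal labels.
HAMZA_FORMS = {
    "\u0621": "alone",
    "\u0623": "above alif",
    "\u0625": "below alif",
    "\u0624": "above waw",
    "\u0626": "above ya",
}


def split_seat(ch: str):
    """Return (seat, mark) for a seated hamza, or (None, ch) if it stands alone."""
    decomposed = unicodedata.normalize("NFD", ch)
    if len(decomposed) == 2:
        return decomposed[0], decomposed[1]
    return None, ch


for ch, label in HAMZA_FORMS.items():
    seat, mark = split_seat(ch)
    seat_name = unicodedata.name(seat) if seat else "(no seat)"
    print(f"{label:12} {seat_name} + {unicodedata.name(mark)}")
```

The standalone hamza (U+0621) has no decomposition, which matches the statement that it is written alone, while the other four split into an ordinary ʼalif, wāw or yāʼ plus a combining hamza above or below.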

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Ligatures

The only compulsory ligature is lām + ʼalif. All other ligatures (yāʼ + mīm, etc.) are optional.
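In practice a font forms this ligature from the plain sequence lām + ʼalif during text shaping, but Unicode also keeps dedicated compatibility code points for it. The sketch below, using Python's standard `unicodedata` module, shows that the isolated-form ligature expands back to the two underlying letters; the function name `expand_ligature` is an illustrative choice.

```python
import unicodedata

LAM_ALEF_ISOLATED = "\uFEFB"  # ﻻ, compatibility code point for the ligature


def expand_ligature(text: str) -> str:
    """Replace compatibility ligatures with their underlying letters (NFKC)."""
    return unicodedata.normalize("NFKC", text)


print(unicodedata.name(LAM_ALEF_ISOLATED))
print([f"U+{ord(c):04X}" for c in expand_ligature(LAM_ALEF_ISOLATED)])
```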

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Some fonts include a glyph for the phrase Salla-llahu 'alayhi wasallam, which Muslims normally use after any mention of the prophet Muhammad.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Fonts also include a special glyph for the word "li-llah", which means "to God". Combined with the letter 'alif, it becomes Allah.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

The latter is a work-around for the shortcomings of most text processors, which are incapable of displaying the correct vowel marks for the word "Allah". Compare the display below, which depends on your browser and installed fonts:

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

للّٰه

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Alternatively, some fonts may be designed to replace the sequence lām-lām-hāʼ or alif-lām-lām-hāʼ with the ligature U+FDF2 ARABIC LIGATURE ALLAH ISOLATED FORM, but which sequence is replaced depends on the font: Arial, Bitstream Cyberbit and Times New Roman are examples of the former, and Arial Unicode MS is an example of the latter. This is probably because some font designers interpret U+FDF2 as "li-llah" and others as "Allah", and design the glyph and replacement mapping accordingly.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

lām-lām-hāʼ: لله

alif-lām-lām-hāʼ: الله

U+FDF2: ﷲ

~ ~ ~ ~ ~ ~ ~ ~ ~ ~
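The ambiguity described above concerns font design, but the Unicode Character Database itself records a compatibility mapping for U+FDF2. The following sketch uses Python's standard `unicodedata` module to compare that mapping with the two candidate sequences; the variable and function names are illustrative.

```python
import unicodedata

ALLAH_LIGATURE = "\uFDF2"
ALEF, LAM, HEH = "\u0627", "\u0644", "\u0647"

lillah = LAM + LAM + HEH         # lām-lām-hāʼ
allah = ALEF + LAM + LAM + HEH   # alif-lām-lām-hāʼ


def compat_expansion(ch: str) -> str:
    """Return the compatibility (NFKC) expansion of a code point."""
    return unicodedata.normalize("NFKC", ch)


expansion = compat_expansion(ALLAH_LIGATURE)
print(unicodedata.name(ALLAH_LIGATURE))
print("matches alif-lam-lam-ha:", expansion == allah)
print("matches lam-lam-ha:", expansion == lillah)
```

The compatibility mapping includes the leading alif, but this says nothing about how a given font draws the glyph or which input sequence it chooses to substitute.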

Diacritics

Vowels

Arabic short vowels are generally not written, except sometimes in sacred texts (such as the Qurʼan) and teaching materials; texts that mark them are known as vocalised texts. Occasionally short vowels are marked where a word would otherwise be ambiguous and the ambiguity cannot be resolved from context.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Short vowels may be written with diacritics placed above or below the consonant that precedes them in the syllable. (All Arabic vowels, long and short, follow a consonant; contrary to appearances, there is a consonant at the start of a name like Ali, in Arabic ʻAlī, or a word like ʼalif.)

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Long "a" following a consonant other than hamza is written with a short-"a" mark on the consonant plus an ʼalif after it. Long "i" is the mark for short "i" plus a yāʼ, and long "u" is the mark for short "u" plus a wāw (so aā = ā, iy = ī and uw = ū). Long "a" following a hamza sound may be represented by an ʼalif madda or by a floating hamza followed by an ʼalif. In an unvocalised text (one in which the short vowels are not marked), the long vowels are represented by the consonant in question (ʼalif, yāʼ, wāw).

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Long vowels written in the middle of a word are treated like consonants taking sukūn (see below) in a text that has full diacritics.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

For clarity, vowels will be placed above or below the letter د dāl, so the results should be read da, di, du, etc. Note that د dāl is one of the six letters that do not connect to the left, and is used in this demonstration for clarity. Most other letters connect to ʼalif, wāw and yāʼ.
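The demonstration on dāl can be reproduced in code by appending the combining vowel marks to the base letter. The sketch below is a minimal illustration in Python; the function name `syllable` and the Latin keys "a", "i", "u" are assumptions made for the example, not part of any standard.

```python
import unicodedata

DAL = "\u062F"
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
ALEF, WAW, YEH = "\u0627", "\u0648", "\u064A"

SHORT_MARK = {"a": FATHA, "i": KASRA, "u": DAMMA}
LONG_CARRIER = {"a": ALEF, "i": YEH, "u": WAW}


def syllable(consonant: str, vowel: str, long: bool = False) -> str:
    """Build consonant + short-vowel mark, plus the carrier letter if long."""
    text = consonant + SHORT_MARK[vowel]
    if long:
        text += LONG_CARRIER[vowel]
    return text


for vowel in "aiu":
    short, long_ = syllable(DAL, vowel), syllable(DAL, vowel, long=True)
    names = [unicodedata.name(ch) for ch in long_]
    print(short, long_, names)
```

The marks are stored after the consonant they belong to, even though they are drawn above or below it.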

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Other Symbols and Signs

Shadda

The šadda ( ّ ) marks the gemination (doubling) of a consonant; a kasra, when present, moves to sit between the šadda and the geminated (doubled) consonant.
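To make the doubling explicit, for instance before transliterating, a shadda can be expanded into a vowelless copy of the consonant followed by the consonant itself. The following Python sketch is one way to do this; `expand_shadda` and `is_arabic_letter` are hypothetical helpers, and the letter test is a rough range check rather than a complete definition of an Arabic letter.

```python
SHADDA = "\u0651"
SUKUN = "\u0652"


def is_arabic_letter(ch: str) -> bool:
    """Rough check: basic Arabic letters lie in U+0621..U+064A (assumption)."""
    return "\u0621" <= ch <= "\u064A"


def expand_shadda(text: str) -> str:
    """Rewrite C + shadda as C + sukun + C, keeping any vowel on the second C."""
    out = []
    last_letter = None  # index in `out` of the most recent base letter
    for ch in text:
        if ch == SHADDA and last_letter is not None:
            # Insert a vowelless copy of the consonant before the original.
            out[last_letter:last_letter] = [out[last_letter], SUKUN]
            last_letter += 2
        else:
            if is_arabic_letter(ch):
                last_letter = len(out)
            out.append(ch)
    return "".join(out)


sharr = "\u0634\u064E\u0631\u0651"  # shin, fatha, ra, shadda
print([f"U+{ord(c):04X}" for c in expand_shadda(sharr)])
```

The function gives the same result whether a vowel mark is stored before or after the shadda, since stored order can vary between texts.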

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Sukūn

An Arabic syllable can be open (ended by a vowel) or closed (ended by a consonant).

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

  • open: CV (long or short vowel)
  • closed: CVC (short vowel only)

    When the syllable is closed, we can indicate that the consonant closing it carries no vowel by marking it with a sign called sukūn, which takes the form "°". This removes ambiguity, especially when the text is not vocalised: remember that a standard text consists only of a series of consonants, so the word qalb, "heart", is written qlb. Sukūn tells us where not to place a vowel: qlb could in principle be read /qVlVbV/, but written with a sukūn over the l and the b it can only be interpreted as /qVlb/ (which vowel to use has to be memorised). We write this قلْبْ.
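The relation between the marked form and the bare consonant skeleton can be sketched in Python with the standard `unicodedata` module. The helper names `skeleton` and `vowelless_letters` are illustrative, and the second helper assumes each sukūn is stored immediately after the letter it marks.

```python
import unicodedata

SUKUN = "\u0652"

# qaf, lam + sukun, ba + sukun: the spelling of qalb given in the text
QALB_WITH_SUKUN = "\u0642\u0644\u0652\u0628\u0652"


def skeleton(text: str) -> str:
    """Drop all nonspacing marks, leaving only the consonant skeleton."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def vowelless_letters(text: str) -> list:
    """Return the letters that carry a sukun, in order."""
    return [text[i - 1] for i, ch in enumerate(text) if ch == SUKUN and i > 0]


print(skeleton(QALB_WITH_SUKUN))           # the unvocalised spelling qlb
print(vowelless_letters(QALB_WITH_SUKUN))  # the letters closed off by sukun
```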

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

    You might think that in a vocalised text sukūn is not necessary, because the lack of a vowel after a consonant could be signalled by simply not writing any mark above it, so that the sukūn in قَلْبْ would be redundant. That is not so, because no such convention ("lack of any vowel mark means lack of vowel sound") exists: k + u + t + b may indeed be read "kutib". Such a rule would make sense only if everybody writing a vowel mark were forced to write all vowel marks in the same word, and that is not the case. In fact, you may write as many or as few of the vowel marks as you like.

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

    In the Qurʼan, however, all vowel marks must be written: there, sukūn over a letter (other than the ʼalif indicating long "a") indicates that it is pronounced but not followed by a short vowel, while the lack of any sign over a letter (other than ʼalif) indicates that the consonant is not pronounced.

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

    Outside of the Qurʼan, putting a sukūn above a yāʼ which indicates long ee, or above a wāw which stands for long oo, is extremely rare, to the point that yāʼ with sukūn will be unambiguously read as the diphthong ai (as in English "eye") and wāw with sukūn will be read au (as in English "cow"). So the word zauǧ, "husband", can be written simply zwǧ: زوج (which might also be read "zooj" if such a word existed); or with sukūn: زوْجْ, which is unambiguously "zowj"; or with sukūn and vowels: زَوْج.

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

    The letters mwsyqā (موسيقى, with an ʼalif maqṣūra at the end of the word) will be read most naturally as the word "mooseekaa" ("music"). If you were to write sukūns above the wāw, yāʼ and ʼalif, you would get موْسيْقىْ, which looks like "mowsaykay" (note that an ʼalif maqṣūra is an ʼalif and never takes sukūn).
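The reading convention for wāw and yāʼ described above can be written as a small rule-based sketch in Python. It is a heuristic that follows the "most natural reading" in the text, not a full analysis of Arabic orthography; the function name `read_weak_letters` and the three labels are invented for the example.

```python
SUKUN = "\u0652"
WAW, YEH = "\u0648", "\u064A"
# U+064B..U+0651: the tanwin marks, fatha, damma, kasra and shadda
VOWEL_MARKS = {chr(cp) for cp in range(0x064B, 0x0652)}


def read_weak_letters(text: str) -> list:
    """Classify each waw/ya by the mark stored directly after it."""
    readings = []
    for i, ch in enumerate(text):
        if ch not in (WAW, YEH):
            continue
        following = text[i + 1] if i + 1 < len(text) else ""
        if following == SUKUN:
            kind = "diphthong"    # read au / ai
        elif following in VOWEL_MARKS:
            kind = "consonant"    # carries its own vowel or is doubled
        else:
            kind = "long vowel"   # unmarked: most naturally read as long u / i
        readings.append((ch, kind))
    return readings


zawj = "\u0632\u064E\u0648\u0652\u062C"          # za + fatha, waw + sukun, jim
musiqa = "\u0645\u0648\u0633\u064A\u0642\u0649"  # unvocalised, final alif maqsura
print(read_weak_letters(zawj))
print(read_weak_letters(musiqa))
```

An unmarked wāw or yāʼ can also be a plain consonant in unvocalised text, so the "long vowel" label is only the default reading the passage describes.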

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

    You cannot place a sukūn on the final letter ǧ of "zauǧ" even if you pronounce no vowel there, because fully vocalised texts are always written as if the iʿrāb (case-ending) vowels were in fact pronounced, and this word can never have a sukūn as its iʿrāb. Take the sentence "aḥmad zauǧ šarr", meaning "Ahmed is a bad husband". The theoretical pronunciation with the iʿrāb vowels is "aḥmadun zauǧun šarrun". Although most people say "aḥmad zauǧ šarr", you cannot write the mark for sukūn over that ǧ; you either leave it unmarked or use the mark for "un". By the same token, you can leave the final r of this sentence either completely unmarked or topped with a shadda plus "un", but a sukūn never belongs there, even though the only correct pronunciation of "šarrun" at the end of an utterance is "šarr".

    ~ ~ ~ ~ ~ ~ ~ ~ ~ ~