Bidirectional Text
Middle Eastern languages such as Hebrew and Arabic are written predominantly right-to-left. Numbers, however, are written with the most significant digit left-most, just as in European or other left-to-right text. Text in left-to-right scripts is often mixed in as well, so a complete document is bidirectional in nature: a mix of right-to-left (RTL) and left-to-right (LTR) writing. Text written in Hebrew or Arabic is therefore often referred to as bidirectional, or “bidi” for short.
RTL in HTML, XHTML, CSS2, CSS3
HTML, XHTML
The HTML DIR attribute specifies the base direction (LTR or RTL) of text, or of sections of text. The base direction can influence the display order of runs of text with different directions, and the display of directionally neutral text (i.e., characters or sequences of characters that do not have inherent directionality, as defined in the Unicode Standard).
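The notion of inherent directionality can be inspected directly. The following is a minimal sketch, assuming Python and its standard `unicodedata` module; the helper name `bidi_classes` and the sample characters are illustrative choices, not part of any specification.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Sample characters; the labels are descriptive only.
samples = {
    "Latin letter": "a",
    "Hebrew letter": "\u05e9",       # shin
    "Arabic letter": "\u0633",       # seen
    "European digit": "7",
    "Arabic-Indic digit": "\u0667",
    "space": " ",
    "open parenthesis": "(",
}

for label, ch in samples.items():
    # Strong classes are L, R, and AL; the others are weak or neutral.
    print(f"{label:20} U+{ord(ch):04X} {bidi_classes(ch)[0]}")
```

Letters report a strong class (`L`, `R`, or `AL`), while the space and the parenthesis report neutral classes (`WS` and `ON`), which is why their placement depends on the surrounding text and the base direction.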
Examples Using The DIR Attribute
<p DIR="LTR">
Demonstration of DIR Attribute On Bidirectional Text
Here is a sentence, set first in a paragraph with left-to-right direction by the HTML DIR attribute (<p DIR="LTR">) and then repeated without change in a paragraph with right-to-left direction (<p DIR="RTL">).
Note that the pairing of an opening parenthesis “(” with a closing square bracket “]” is intentional; it demonstrates how directionally neutral characters are handled.
(This example requires a browser that supports bidirectional text rendering, such as Microsoft Internet Explorer. Display this page in different browsers to compare. A graphic image of this example is available for comparison or to see the expected results: Demonstration of DIR Attribute On Bidirectional Text)
This example shows how browsers that support bidirectional text behave with different DIR attribute settings.
| HTML Markup | Resulting Display |
|---|---|
| <p dir="LTR">He said “שלם” (shalom] to me.</p> | He said “שלם” (shalom] to me. |
| <p dir="RTL">He said “שלם” (shalom] to me.</p> | He said “שלם” (shalom] to me. |
| <p dir="LTR">Najib said “السلام عليكم” (as-salaam alaykum] to me.</p> | Najib said “السلام عليكم” (as-salaam alaykum] to me. |
| <p dir="RTL">Najib said “السلام عليكم” (as-salaam alaykum] to me.</p> | Najib said “السلام عليكم” (as-salaam alaykum] to me. |
Changing the DIR attribute from LTR to RTL in this example has the following effects:
- The alignment of the paragraph changes from left-aligned to right-aligned.
- The relative placement of runs of text of different direction is reordered. The first run (“He said”) is placed right-aligned, as expected in an RTL paragraph. The next run, the RTL text (the Hebrew word shalom), is placed to the left of the first run, continuing the right-to-left sequencing. This looks wrong to English speakers but is correct, since moving leftward is advancing in a right-to-left writing system. For the same reason, the period appears left-most.
- Parentheses and square brackets do not have an inherent direction. The open parenthesis sits between an RTL run and an LTR run, so it cannot “inherit” a direction from the surrounding text. It therefore defaults to the RTL base direction of the paragraph and is placed to the left of the Hebrew word shalom. The closing square bracket, by contrast, is embedded in a single run of left-to-right text. It therefore adopts the direction of its surrounding text and is placed to the right of the English word shalom.
- The open parenthesis is rendered “mirror-imaged”; that is, it looks like a closing parenthesis in English. Parentheses, square brackets, and some other characters are displayed as mirror images (i.e., reversed) when the text direction is right-to-left. This is because the characters have the semantics of open and close, not of graphical left and right. An open parenthesis or square bracket in right-to-left text is rendered as a mirror image so that it visually encloses the text to its left. Similarly, closing parentheses and brackets are mirror-imaged so that they face the text to their right.
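The treatment of the parenthesis and the bracket described above can be sketched in a few lines of Python. This is a deliberately simplified model, not the full Unicode Bidirectional Algorithm: it assumes no embeddings or overrides, treats digits and all punctuation as neutral, and ignores the algorithm's paired-bracket rule. The function names `strong_direction` and `resolve_neutral` are hypothetical helpers introduced for illustration.

```python
import unicodedata

def strong_direction(ch):
    """Return 'L' or 'R' for a strongly directional character, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "L"
    if bidi_class in ("R", "AL"):
        return "R"
    return None  # simplification: digits, spaces, punctuation are all neutral

def resolve_neutral(text, index, base):
    """Simplified direction for the neutral character at text[index].

    base is the paragraph's base direction, 'L' or 'R'. A neutral takes the
    direction of its neighbours when they agree, otherwise the base direction.
    """
    before = next((d for d in map(strong_direction, reversed(text[:index])) if d), base)
    after = next((d for d in map(strong_direction, text[index + 1:]) if d), base)
    return before if before == after else base

# The Hebrew example sentence, with the Hebrew letters written as escapes
# so that the logical (typing) order is unambiguous in source form.
sentence = "He said \u201c\u05e9\u05dc\u05dd\u201d (shalom] to me."

for base in ("L", "R"):
    for ch in "(]":
        direction = resolve_neutral(sentence, sentence.index(ch), base)
        # In a right-to-left context, a character with the mirrored property
        # is drawn with its mirror-image glyph.
        mirrored = direction == "R" and unicodedata.mirrored(ch) == 1
        print(f"base={base} char={ch} direction={direction} mirrored={mirrored}")
```

Under this model, with an RTL base the open parenthesis resolves to right-to-left and is mirrored, while the closing bracket stays left-to-right because both of its strong neighbours are Latin letters. The final period also resolves to the RTL base direction, which is consistent with it appearing left-most.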
DIR also specifies the directionality of tables.
Example of TABLE direction using DIR="LTR"
| European Digits | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Arabic Digits | ٠ | ١ | ٢ | ٣ | ٤ | ٥ | ٦ | ٧ | ٨ | ٩ |
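For example, if DIR="RTL" is added to the <TABLE> element in the table above displaying the European and Arabic digits, the columns are laid out from right to left, so the first column appears at the far right. In a browser that supports bidirectional rendering, the table then looks like this:
Example of TABLE direction using DIR="RTL"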
| European Digits | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Arabic Digits | ٠ | ١ | ٢ | ٣ | ٤ | ٥ | ٦ | ٧ | ٨ | ٩ |
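The correspondence between the two digit rows can be checked programmatically. The sketch below assumes Python's standard `unicodedata` module; `to_european` is a hypothetical helper name used only for this illustration.

```python
import unicodedata

european = "0123456789"
# The ten Arabic-Indic digits occupy U+0660 through U+0669.
arabic_indic = "".join(chr(0x0660 + i) for i in range(10))

def to_european(text):
    """Replace every Unicode decimal digit in text with its ASCII equivalent."""
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch
        for ch in text
    )

for e, a in zip(european, arabic_indic):
    # Both rows carry the same numeric value but different bidi classes.
    print(e, unicodedata.bidirectional(e), a, unicodedata.bidirectional(a))

# Digits are stored most significant digit first in both scripts,
# so Python's int() can parse either form.
print(int(arabic_indic), to_european(arabic_indic))
```

European digits report the bidi class `EN` and Arabic-Indic digits report `AN`; neither is a strong right-to-left class, which fits the observation that numbers run left-to-right even inside right-to-left text.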
Authors can override the bidirectional algorithm for selected fragments of text with the BDO element. This is useful in special situations, such as representing part numbers, where the Unicode Bidirectional Algorithm may not give the desired result.
CSS2
CSS2 has a direction property analogous to the HTML DIR attribute. It also introduced a unicode-bidi property for additional control of the Unicode Bidirectional Algorithm (UAX9).
CSS3 Text Module
CSS3 has a direction property for specification of left-to-right and right-to-left “in-line flow”. It also has a unicode-bidi property for control of the Unicode Bidirectional Algorithm (UAX9).
CSS3 has a property for specifying block-progression flow and orientation. The block-progression property can specify vertical flow of top-to-bottom or horizontal flow of left-to-right or right-to-left.
The writing-mode property is a short-hand way to specify both direction and block-progression.
CSS3 also provides a way to turn characters 90, 180, and 270 degrees: Glyph orientation within a text run.
CSS3 has a feature that influences rendering of Arabic text, the text-kashida-space property. Although Kashida is also the name of an Arabic character, in this context “Kashida is a typographic effect used in Arabic writing systems that allows glyph elongation at some carefully chosen points”.
Tips For Representing Right-To-Left Text In Markup Languages
- Set the overall document direction on the HTML element, not the BODY element.
- Use character encodings that employ logical rather than visual ordering, such as Unicode, Windows-1255, Windows-1256, ISO-8859-6-i, and ISO-8859-8-i. Avoid ISO 8859-6 and ISO 8859-8, which are visually ordered, and ISO-8859-6-e and ISO-8859-8-e, which rely on explicit directional controls. See RFC 1555 and RFC 1556.
- Use markup instead of Unicode bidirectional control characters.
| Unicode Character Name | Scalar Value | Function | Equivalent Markup |
|---|---|---|---|
| LRE | U+202A | Left-to-Right Embedding | DIR attribute, e.g. DIR="LTR" |
| RLE | U+202B | Right-to-Left Embedding | DIR attribute, e.g. DIR="RTL" |
| PDF | U+202C | Pop Directional Formatting | No equivalent; </BDO> ends an override |
| LRO | U+202D | Left-to-Right Override | BDO element, e.g. <BDO dir="LTR"> |
| RLO | U+202E | Right-to-Left Override | BDO element, e.g. <BDO dir="RTL"> |

See Unicode in XML and other Markup Languages (UTR 20).
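The correspondence between control characters and markup listed above can be applied mechanically. The following is a rough sketch in Python, assuming that embeddings map to a SPAN element carrying a dir attribute and overrides map to a BDO element; the function name `controls_to_markup` and the choice of SPAN are illustrative assumptions, not something UTR 20 prescribes.

```python
import html

# Control characters that open an embedding or an override, mapped to the
# (element, direction) pair assumed for this sketch.
OPENERS = {
    "\u202a": ("span", "ltr"),  # LRE
    "\u202b": ("span", "rtl"),  # RLE
    "\u202d": ("bdo", "ltr"),   # LRO
    "\u202e": ("bdo", "rtl"),   # RLO
}
PDF = "\u202c"  # closes the most recent embedding or override

def controls_to_markup(text):
    """Replace bidi control characters in text with equivalent HTML markup."""
    out = []
    open_tags = []  # stack of element names still waiting for a PDF
    for ch in text:
        if ch in OPENERS:
            tag, direction = OPENERS[ch]
            out.append(f'<{tag} dir="{direction}">')
            open_tags.append(tag)
        elif ch == PDF:
            if open_tags:  # an unmatched PDF is silently dropped
                out.append(f"</{open_tags.pop()}>")
        else:
            out.append(html.escape(ch))
    # Close anything left open so the result stays well-formed.
    while open_tags:
        out.append(f"</{open_tags.pop()}>")
    return "".join(out)

print(controls_to_markup("Part \u202dABC-123\u202c in stock"))
```

A stack is used because embeddings and overrides can nest, and each PDF closes only the most recently opened one.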
Naming Conventions
The W3C Internationalization Group now recommends that specification writers avoid terms such as “property-left” and “property-right” in favor of terms such as “property-before” and “property-after”. When the writing direction changes, for example from left-to-right to either top-to-bottom or right-to-left, “before” and “after” are still correct and do not need to be modified. (This is true for most W3C specification purposes. Your functionality may vary.)
Links
- User Interfaces for Right-to-Left Languages
- Example Hebrew Web Page – Shema Yisrael
- W3C International WG GEO FAQ: Script Direction and Languages
- Hebrew Characters And Their Unicode Values
- Hebrew Numbers
- HTML 4.01
- CSS3 Text Module
- CSS 2.1
- CSS 2
- Unicode Consortium FAQ on Middle Eastern Scripts and Languages
- Unicode Bidirectional Algorithm (UAX9)
- Unicode in XML and other Markup Languages (UTR 20)








