Xia, when saying 'unexpected', one usually needs to also say what was expected. When discussing mixed direction chars, we need to be especially careful in describing what we see with different terminals, different browsers, and different OSes.
Steven: On Windows, I see the same thing: "Daleth 1" prints in that order in both IDLE's Shell and Python's REPL in Command Prompt (with the Daleth shown as a replacement box in the latter) but is reversed here, 'ܯ1', in Firefox (and the same in Microsoft Edge). But, I just discovered, the two browsers (and Notepad and LibreOffice Writer and likely other text editors) treat runs of Latin digits specially: "Daleth a" pastes in that order, 'ܯa', and "Daleth 1 2" pastes as "1 2 Daleth", 'ܯ12'.
The block, but not the individual digits, is reversed. This allows R2L writers to use what are now the global digits. In Arabic, numbers are written and read R2L, low order to high. So Europeans, used to writing and reading L2R, high to low, kept the same order. Perhaps the bidi property of the digits in the Unicode database is different from that of other Latin chars.
It seems that '=' is also bidirectional, but properly not treated as a digit. "Daleth = 1" is reversed in both browsers and text editors to read 'Daleth' 'equals' 'one' when read right to left.
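The guess about the bidi property can be checked from Python itself. A minimal sketch, assuming the unicodedata module shipped with the running Python (the labels in the samples dict are only for illustration):

```python
import unicodedata

# Bidi class of each kind of character discussed above, as recorded in the
# Unicode database bundled with this Python build.
samples = {
    "latin letter": "a",
    "latin digit": "1",
    "equals sign": "=",
    "space": " ",
    "R2L letter": chr(1839),
}
bidi_classes = {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}
for label, cls in bidi_classes.items():
    print(f"{label:12} {cls}")
```

With current Unicode data this should report L for 'a', EN (European Number) for '1', ON (Other Neutral) for '=', WS for the space, and AL for chr(1839). So the digits do have a different class from the letters, and '=' is a neutral whose displayed direction comes from its neighbors.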
The general rule is that blocks of same-direction chars are written appropriately as encountered. It seems that the classification of some characters depends on the context. The following is as expected,
>>> 'ab'+chr(1837)+chr(1838)+chr(1839)+'cd'
'abܭܮܯcd'
with the R2L triplet reversed.
In any case, Steven is correct that Python correctly stores chars in the order given and that there is no Python bug.
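One way to see the stored order without depending on how any terminal or browser renders the string is to look at the code points. A small sketch reusing the string from the example above:

```python
# Build the same mixed-direction string and list its code points in
# storage (logical) order; no display-time reordering is involved here.
s = 'ab' + chr(1837) + chr(1838) + chr(1839) + 'cd'
code_points = [ord(c) for c in s]
print(code_points)  # [97, 98, 1837, 1838, 1839, 99, 100]
```

The code points come back in exactly the order given, so any reversal is done by the display layer, not by Python.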