
Author christian.heimes
Recipients christian.heimes, larry
Date 2013-11-28.00:07:37
Message-id <1385597257.94.0.868204336657.issue19819@psf.upfronthosting.co.za>
In-reply-to
Content
There is no ligature for "lff", just "ffl". A ligature is treated as one char, so reversing the string moves it as a unit instead of turning "ffl" into "lff". I guess Python would have to grow a str.reverse() method to handle ligatures and combining chars correctly.
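
To make the "one char" point concrete, here is a minimal sketch using only the standard unicodedata module. The variable name `lig` is my own and does not come from the issue.

```python
import unicodedata

lig = "\ufb04"  # the "ffl" ligature as a single code point

print(len(lig))                            # 1
print(unicodedata.name(lig))               # LATIN SMALL LIGATURE FFL
print(unicodedata.normalize("NFKC", lig))  # ffl (three separate letters)
```

Only the compatibility forms (NFKC, NFKD) split the ligature into letters; NFC and NFD leave it as one code point.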

At work I ran into the issue with ligatures and combining chars multiple times in medieval and early modern age scripts. Eventually I started to normalize all incoming data to NFKC. That solves most of the issues.

>>> import unicodedata
>>> s = b'ba\xef\xac\x84e'.decode('utf-8')  # "ba" + U+FB04 (ffl ligature) + "e"
>>> print("".join(reversed(s)))
eﬄab
>>> print("".join(reversed(unicodedata.normalize("NFKC", s))))
elffab
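
No str.reverse() method exists today, so the following is only a rough sketch of what a combining-aware reversal might look like. The function name `reverse_text` is hypothetical, and the approach is a simplification: it glues characters with a nonzero canonical combining class onto the preceding base character. It is not full grapheme cluster segmentation, so marks with combining class 0, ZWJ sequences and similar cases are not handled.

```python
import unicodedata


def reverse_text(text):
    """Reverse text, keeping combining marks attached to their base character.

    Hypothetical helper: an approximation, not a complete grapheme splitter.
    """
    clusters = []
    for ch in text:
        # unicodedata.combining() returns 0 for non-combining characters.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # attach the mark to the previous cluster
        else:
            clusters.append(ch)  # start a new cluster
    return "".join(reversed(clusters))


s = "cafe\u0301"  # "cafe" followed by COMBINING ACUTE ACCENT

# Naive reversal moves the accent to the front, detached from its "e".
print("".join(reversed(s)) == "\u0301efac")  # True

# Cluster-aware reversal keeps "e" + accent together.
print(reverse_text(s) == "e\u0301fac")  # True

# Combined with NFKC normalization, the ligature case also reverses letter by letter.
print(reverse_text(unicodedata.normalize("NFKC", "ba\ufb04e")))  # elffab
```

Normalizing first and then reversing by cluster covers both problems mentioned above, at the cost of losing the original ligature in the output.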