
Author fbacher
Recipients christian.heimes, fbacher, serhiy.storchaka
Date 2022-01-06.23:48:30
Message-id <1641512911.04.0.200930795464.issue46264@roundup.psfhosted.org>
Content
Oh joy. The Kodi media server is already having Unicode issues, and this won't help. I'm trying to see how bad it is.

The main use for case transformations is internal keyword lookup and monocasing: settings, filenames on monocased filesystems, etc. are caseless. In the main, things work okay until you hit a language, such as Turkish, that does not obey the usual rules. In Turkish, ToLower('I') maps to dotless 'ı' rather than 'i'. There are ways to work around this, but they depend on the robustness of the Unicode implementation.
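
For illustration, here is a minimal Python sketch of the behaviour described above. It assumes CPython's locale-independent str.lower() and str.casefold(). The names caseless_key and turkish_lower are hypothetical helpers made up for this example; they are not part of Python or Kodi, and the Turkish mapping is only one possible workaround.

```python
def caseless_key(text: str) -> str:
    """Build a key for caseless lookup (settings names, keywords, ...).

    str.casefold() applies the default Unicode case folding, which does
    not depend on the current locale.
    """
    return text.casefold()


def turkish_lower(text: str) -> str:
    """Hypothetical workaround: lowercase using Turkish dotted/dotless i rules.

    Python's str.lower() always applies the default mapping, so the two
    Turkish special cases are replaced by hand before calling it.
    """
    return (
        text.replace("\u0130", "i")   # capital I with dot above -> 'i'
            .replace("I", "\u0131")   # plain capital I -> dotless 'ı'
            .lower()
    )


# Default (locale-independent) behaviour of str.lower():
print("I".lower() == "i")                 # plain capital I gives plain 'i'
print(len("\u0130".lower()))              # 'i' followed by U+0307 combining dot
print(turkish_lower("DİYARBAKIR"))        # Turkish-aware result
```

With the default mapping, both 'I' and 'İ' end up based on a dotted 'i', which is fine for internal keys but wrong for text shown to Turkish users. That is why lookup keys and display text are best handled by separate functions.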

I've spent the past several days looking into C++ behavior. It seemed to be similarly broken until I discovered that writing to both cout and wcout tends to break things, including Unicode encoding.


It will take a few days to investigate further. Thanks for the info.
History
Date User Action Args
2022-01-06 23:48:31 fbacher set recipients: + fbacher, christian.heimes, serhiy.storchaka
2022-01-06 23:48:31 fbacher set messageid: <1641512911.04.0.200930795464.issue46264@roundup.psfhosted.org>
2022-01-06 23:48:31 fbacher link issue46264 messages
2022-01-06 23:48:30 fbacher create