
Author ezio.melotti
Recipients Arfrever, ezio.melotti, loewis, tchrist, terry.reedy, vstinner
Date 2011-08-15.10:20:25
Message-id <1313403626.16.0.499976455012.issue12737@psf.upfronthosting.co.za>
In-reply-to
Content
So the issue here is that when a string uses combining chars, str.title() fails to titlecase it properly.

The algorithm implemented by str.title() [0] is quite simple: it loops through the code units, and uppercases all the chars that follow a char that is not lower/upper/titlecased.
This means that if Déme doesn't use combining accents, the char before the 'm' is 'é', 'é' is a lowercase char, so 'm' is not capitalized.
If the 'é' is represented as 'e' + '´', the char before the 'm' is '´', '´' is not a lower/upper/titlecase char, so the 'm' is capitalized.
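
To make the two cases above concrete, here is a rough pure-Python sketch of the algorithm as described. It is not the C code from unicodeobject.c; `naive_title` is a hypothetical name, and it assumes the combining accent is U+0301 and uses str.islower()/isupper()/istitle() as a stand-in for the "lower/upper/titlecase" check.

```python
import unicodedata


def naive_title(s):
    """Titlecase every char that follows a char that is not
    lower/upper/titlecase; lowercase the others.
    Illustrative sketch only, not CPython's implementation."""
    out = []
    prev_cased = False  # nothing precedes the first char
    for ch in s:
        out.append(ch.lower() if prev_cased else ch.title())
        # Assumption: these three predicates approximate the
        # "lower/upper/titlecase" test described in the message.
        prev_cased = ch.islower() or ch.isupper() or ch.istitle()
    return "".join(out)


composed = unicodedata.normalize("NFC", "d\u00e9me")   # 'é' is one char
decomposed = unicodedata.normalize("NFD", composed)    # 'e' + U+0301

# Printed as escapes so the difference is visible in any terminal.
print(ascii(naive_title(composed)))
print(ascii(naive_title(decomposed)))
```

With the precomposed 'é' the 'm' stays lowercase; with 'e' + U+0301 the combining mark is not cased, so the 'm' that follows it gets capitalized.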

I guess we could normalize the string before doing the title casing, and then normalize it back.
Also the str methods don't claim to follow Unicode afaik, so unless we decide that they should, we could implement whatever algorithm we want.
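
The normalization idea could look something like the sketch below. `title_normalized` and its `form_back` parameter are hypothetical names, not an existing or proposed API, and the caller is assumed to know which form to normalize back to. It only helps when a precomposed char exists for the sequence, since NFC leaves other base + combining mark pairs as they are.

```python
import unicodedata


def title_normalized(s, form_back="NFD"):
    """Compose the string (NFC), title-case it, then convert the result
    to the requested normalization form.
    Hypothetical helper sketching the idea; not part of the str API."""
    composed = unicodedata.normalize("NFC", s)
    return unicodedata.normalize(form_back, composed.title())


# 'e' + U+0301 is composed to 'é' before title casing, so the 'm'
# is preceded by a lowercase char and is left alone.
print(ascii(title_normalized("de\u0301me")))
print(ascii(title_normalized("de\u0301me", form_back="NFC")))
```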

[0]: Objects/unicodeobject.c:6752