Author senn
Recipients alexs, ezio.melotti, lemburg, loewis, senn
Date 2009-10-13.19:57:00
Message-id <1255463823.3.0.0027328988228.issue4610@psf.upfronthosting.co.za>
In-reply-to
Content
Has there been any action on this? A PEP?

I disagree that using ICU is a good way to simply get proper
Unicode casing. (A heavy hammer for a small task...)

I agree that locales are a different issue. I would prefer
optional arguments to the unicode object's casing methods --
any future sort of locale object could then use them to
handle correct casing, but the methods would not rely on one.
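
To make the optional-argument idea concrete, here is a minimal sketch in
Python 3. The function name `upper_with_locale` and its `lang` parameter
are hypothetical, chosen only for illustration; nothing like them exists
on the built-in string types. It handles just the Turkish/Azeri dotted i
and assumes the default behaviour is whatever `str.upper()` already does.

```python
def upper_with_locale(text, lang=None):
    """Uppercase text, optionally applying a language-specific tailoring.

    Hypothetical helper: 'lang' is an illustrative optional argument,
    not part of any real str/unicode API.
    """
    if lang in ("tr", "az"):
        # Turkish and Azeri: small dotted i uppercases to
        # U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE.
        text = text.replace("i", "\u0130")
    # Everything else falls through to the default mapping
    # (dotless U+0131 already uppercases to plain "I").
    return text.upper()
```

Called without the argument, the function behaves exactly like the plain
method, so code that never thinks about locales is unaffected; a locale
object could pass `lang` through when it needs the tailored result.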

Most of the special casing rules can be accomplished by
a decomposition (or recursive decomposition) of the character
followed by casing the result -- so NO new table is necessary,
only a mark on the characters so implicated. (There are
extra unused bits in the char type table that could be used
for this purpose -- so no additional space is needed there either.)

What remains are a tiny handful of cases that need to be handled
in code.
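
As a rough illustration of the decompose-then-case idea (not the
half-finished implementation mentioned below), here is a Python 3 sketch.
The set `NEEDS_DECOMPOSITION` is a hypothetical stand-in for the flag bit
in the char type table, and `simple_upper` only approximates the
one-to-one case mapping by discarding multi-character results of
`str.upper()`. Both names are assumptions made for this example.

```python
import unicodedata

# Hypothetical stand-in for "marked" characters in the type table.
NEEDS_DECOMPOSITION = {
    "\u01f0",  # j with caron
    "\u0149",  # n preceded by apostrophe
    "\u0390",  # iota with dialytika and tonos
    "\ufb01",  # fi ligature
}

def simple_upper(cp):
    """Approximate the one-to-one uppercase mapping of one code point."""
    up = cp.upper()
    # Modern Python returns the full mapping; keep it only if it is 1:1.
    return up if len(up) == 1 else cp

def upper_char(ch):
    """Uppercase one character, decomposing first if it is marked."""
    if ch in NEEDS_DECOMPOSITION:
        # NFKD decomposes recursively, including compatibility mappings.
        ch = unicodedata.normalize("NFKD", ch)
    return "".join(simple_upper(c) for c in ch)
```

With this sketch, `upper_char("\ufb01")` yields `"FI"` without any
special-casing table. A character such as U+00DF (sharp s) has no
decomposition, so it comes back unchanged here and is one of the cases
that would still have to be handled in code.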

I have a half-finished implementation of this, in case anyone
is interested.
History
Date                 User  Action  Args
2009-10-13 19:57:03  senn  set     recipients: + senn, lemburg, loewis, ezio.melotti, alexs
2009-10-13 19:57:03  senn  set     messageid: <1255463823.3.0.0027328988228.issue4610@psf.upfronthosting.co.za>
2009-10-13 19:57:02  senn  link    issue4610 messages
2009-10-13 19:57:01  senn  create