Message94023
Jeff Senn wrote:
>
> Jeff Senn <senn@users.sourceforge.net> added the comment:
>
> Yikes! I just noticed that u''.title() is really broken!
>
> It doesn't really pay attention to word breaks --
> only characters that "have case".
> Therefore when there are (caseless)
> combining characters in a word it's really broken e.g.
>
> >>> u'n\u0303on\u0303e'.title()
> u'N\u0303On\u0303E'
>
> That is (where '~' is combining-tilde-over)
> n~on~e -title-cases-to-> N~On~E
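For illustration, the behaviour the quoted report seems to expect could be sketched roughly as below. This is only a sketch under assumptions: title_keep_marks is a hypothetical helper name, not part of Python or of any patch, and it simply treats combining marks as belonging to the word they appear in.

```python
import unicodedata

def title_keep_marks(text):
    """Title-case text, treating combining marks as part of the current word.

    Hypothetical helper for illustration only; not the stdlib algorithm.
    """
    out = []
    in_word = False
    for ch in text:
        if unicodedata.combining(ch):
            # Combining mark: copy it through and leave the word state alone,
            # so it does not count as a word break.
            out.append(ch)
        elif ch.isalpha():
            # First letter of a word is title-cased, the rest lower-cased.
            out.append(ch.lower() if in_word else ch.title())
            in_word = True
        else:
            # Anything else (space, digit, punctuation) ends the word.
            out.append(ch)
            in_word = False
    return ''.join(out)

print(ascii(title_keep_marks(u'n\u0303on\u0303e')))
```

With this approach n~on~e becomes N~on~e, since the tilde no longer resets the word state.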
Please have a look at http://bugs.python.org/issue6412 - that patch
addresses many casing issues, at least to the extent that we can
actually fix them without breaking code that relies on:
len(s.upper()) == len(s)
for upper/lower/title.
If we add support for 1-n code point mappings, then we can only
enable this support by using an option to the casing methods (perhaps
not a bad idea: the parameter could be used to signal the locale
to assume).
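A rough sketch of what such an opt-in could look like, written as a standalone function rather than a method. Everything here is an assumption for illustration: the upper() wrapper, its full parameter, and the one-entry FULL_UPPER table are hypothetical and are not taken from the issue6412 patch.

```python
# Illustrative 1-n mapping table (hypothetical; a real one would be built
# from the Unicode SpecialCasing data). U+00DF SHARP S uppercases to "SS".
FULL_UPPER = {u'\u00df': u'SS'}

def upper_1to1(ch):
    """Uppercase one character, but only via a 1-1 mapping."""
    up = ch.upper()
    # If the runtime would expand the character, keep it unchanged so the
    # length of the string is preserved.
    return up if len(up) == 1 else ch

def upper(text, full=False):
    """Uppercase text; 1-n mappings are applied only when full=True."""
    if not full:
        # Default: len(upper(s)) == len(s) always holds.
        return u''.join(upper_1to1(ch) for ch in text)
    # Opt-in: the result may be longer than the input.
    return u''.join(FULL_UPPER.get(ch, upper_1to1(ch)) for ch in text)

s = u'stra\u00dfe'
print(len(upper(s)) == len(s))
print(upper(s, full=True))
```

Callers that depend on the length invariant keep the default, and only callers that pass full=True see expanded results. A locale argument could be threaded through in the same way.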
Date | User | Action | Args
2009-10-14 20:16:29 | lemburg | set | recipients: + lemburg, loewis, senn, ezio.melotti, alexs
2009-10-14 20:16:27 | lemburg | link | issue4610 messages
2009-10-14 20:16:27 | lemburg | create |