Message140780
Indeed, this seems to be a different issue, and it might be worth fixing.
Given this definition:
str.capitalize()
Return a copy of the string with its first character capitalized and the rest lowercased.
we might implement capitalize like:
>>> def mycapitalize(s):
...     return s[0].upper() + s[1:].lower()
...
>>> 'fOoBaR'.capitalize()
'Foobar'
>>> mycapitalize('fOoBaR')
'Foobar'
And this would yield the correct result:
>>> s = u'\u1ff3\u1ff3\u1ffc\u1ffc'
>>> print s
ῳῳῼῼ
>>> print s.capitalize()
ῼῳῼῼ
>>> print mycapitalize(s)
ῼῳῳῳ
>>> s.capitalize().istitle()
False
>>> mycapitalize(s).istitle()
True
This doesn't happen because the actual implementation of str.capitalize checks only whether a character is uppercase (not whether it is titlecase) before converting it to lowercase. This can be fixed with:
diff -r cb44fef5ea1d Objects/unicodeobject.c
--- a/Objects/unicodeobject.c Thu Jul 21 01:11:30 2011 +0200
+++ b/Objects/unicodeobject.c Thu Jul 21 07:57:21 2011 +0300
@@ -6739,7 +6739,7 @@
     }
     s++;
     while (--len > 0) {
-        if (Py_UNICODE_ISUPPER(*s)) {
+        if (Py_UNICODE_ISUPPER(*s) || Py_UNICODE_ISTITLE(*s)) {
             *s = Py_UNICODE_TOLOWER(*s);
             status = 1;
         }
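To illustrate the effect of the one-line change, here is a rough Python model of the loop over the characters after the first one. It is only a sketch: the names lower_rest_old and lower_rest_patched are made up for this example, and the Unicode general categories 'Lu' and 'Lt' are used as stand-ins for Py_UNICODE_ISUPPER and Py_UNICODE_ISTITLE, which is an assumption about how those macros classify characters. It does not call str.capitalize at all, so it behaves the same regardless of how the interpreter running it implements that method.

```python
import unicodedata


def lower_rest_old(rest):
    # Model of the loop before the patch: only characters classified as
    # uppercase (category 'Lu', assumed equivalent to Py_UNICODE_ISUPPER)
    # are converted to lowercase; everything else is left untouched.
    return ''.join(
        c.lower() if unicodedata.category(c) == 'Lu' else c
        for c in rest
    )


def lower_rest_patched(rest):
    # Model of the loop after the patch: titlecase characters
    # (category 'Lt', assumed equivalent to Py_UNICODE_ISTITLE)
    # are lowercased as well.
    return ''.join(
        c.lower() if unicodedata.category(c) in ('Lu', 'Lt') else c
        for c in rest
    )


# Everything after the first character of the example string.
rest = '\u1ff3\u1ffc\u1ffc'

# U+1FFC is a titlecase letter, not an uppercase one.
print([unicodedata.category(c) for c in rest])

# The old loop leaves the titlecase characters in place ...
print(lower_rest_old(rest) == rest)

# ... while the patched loop lowercases them.
print(lower_rest_patched(rest) == '\u1ff3\u1ff3\u1ff3')
```

With plain ASCII input such as 'OoBaR' both functions return 'oobar', so under these assumptions the patch only changes the result for strings that contain titlecase characters.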
Date | User | Action | Args
2011-07-21 04:59:52 | ezio.melotti | set | recipients: + ezio.melotti, lemburg, belopolsky, eric.araujo, r.david.murray, py.user
2011-07-21 04:59:52 | ezio.melotti | set | messageid: <1311224392.62.0.998821852335.issue12266@psf.upfronthosting.co.za>
2011-07-21 04:59:52 | ezio.melotti | link | issue12266 messages
2011-07-21 04:59:51 | ezio.melotti | create |