Message 140780 - Python tracker


Author	ezio.melotti
Recipients	belopolsky, eric.araujo, ezio.melotti, lemburg, py.user, r.david.murray
Date	2011-07-21.04:59:51
Message-id	<1311224392.62.0.998821852335.issue12266@psf.upfronthosting.co.za>
In-reply-to

Content

Indeed, this seems to be a different issue, and it might be worth fixing.
Given this definition:
  str.capitalize()¶
      Return a copy of the string with its first character capitalized and the rest lowercased.
we might implement capitalize like:
>>> def mycapitalize(s):
...     return s[0].upper() + s[1:].lower()
... 
>>> 'fOoBaR'.capitalize()
'Foobar'
>>> mycapitalize('fOoBaR')
'Foobar'

And this would yield the correct result:
>>> s = u'\u1ff3\u1ff3\u1ffc\u1ffc'
>>> print s
ῳῳῼῼ
>>> print s.capitalize()
ῼῳῼῼ
>>> print mycapitalize(s)
ῼῳῳῳ
>>> s.capitalize().istitle()
False
>>> mycapitalize(s).istitle()
True
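
The difference comes down to the category of U+1FFC: it is a titlecase letter (general category Lt), not an uppercase one. A minimal sketch to check this, assuming Python 3 (the session above is Python 2) and using a hypothetical helper named describe that is not part of the original message:

```python
import unicodedata

def describe(ch):
    """Return (general category, isupper(), istitle()) for a single character."""
    return unicodedata.category(ch), ch.isupper(), ch.istitle()

small = '\u1ff3'  # GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI
title = '\u1ffc'  # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI

# The titlecase letter reports category 'Lt' and is not considered uppercase,
# so a check based only on "is uppercase" skips it.
print(describe(small))
print(describe(title))
```

Only the category and predicate results are inspected here; the sketch deliberately avoids calling upper() or capitalize(), whose results for these characters may differ between Python versions.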

str.capitalize doesn't produce this result because its actual implementation only checks whether a char is uppercase (and not whether it's titlecase too) before converting it to lowercase.  This can be fixed with:
diff -r cb44fef5ea1d Objects/unicodeobject.c
--- a/Objects/unicodeobject.c   Thu Jul 21 01:11:30 2011 +0200
+++ b/Objects/unicodeobject.c   Thu Jul 21 07:57:21 2011 +0300
@@ -6739,7 +6739,7 @@
     }
     s++;
     while (--len > 0) {
-        if (Py_UNICODE_ISUPPER(*s)) {
+        if (Py_UNICODE_ISUPPER(*s) || Py_UNICODE_ISTITLE(*s)) {
             *s = Py_UNICODE_TOLOWER(*s);
             status = 1;
         }
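
The effect of the one-line patch above can be modelled in Python. This is only a rough sketch, assuming Python 3: the function name lower_tail and its include_titlecase flag are hypothetical, and the general categories 'Lu' and 'Lt' are used as approximations of Py_UNICODE_ISUPPER and Py_UNICODE_ISTITLE rather than exact equivalents of the C macros.

```python
import unicodedata

def lower_tail(tail, include_titlecase):
    """Model the C loop that runs over every character after the first one."""
    out = []
    for ch in tail:
        cat = unicodedata.category(ch)
        # Approximation: 'Lu' stands in for Py_UNICODE_ISUPPER,
        # 'Lt' stands in for Py_UNICODE_ISTITLE.
        if cat == 'Lu' or (include_titlecase and cat == 'Lt'):
            ch = ch.lower()
        out.append(ch)
    return ''.join(out)

tail = '\u1ff3\u1ffc\u1ffc'  # everything after the first character of s

before = lower_tail(tail, include_titlecase=False)  # unpatched check
after = lower_tail(tail, include_titlecase=True)    # patched check

print(before == tail)            # titlecase characters left untouched
print(after == '\u1ff3' * 3)     # titlecase characters lowercased
```

The sketch only covers the tail of the string, since that is the part the patch touches; handling of the first character is left out.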

History
Date	User	Action	Args
2011-07-21 04:59:52	ezio.melotti	set	recipients: + ezio.melotti, lemburg, belopolsky, eric.araujo, r.david.murray, py.user
2011-07-21 04:59:52	ezio.melotti	set	messageid: <1311224392.62.0.998821852335.issue12266@psf.upfronthosting.co.za>
2011-07-21 04:59:52	ezio.melotti	link	issue12266 messages
2011-07-21 04:59:51	ezio.melotti	create