Message 108513 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	skrah
Recipients	eric.smith, loewis, pitrou, skrah
Date	2010-06-24.14:47:28
SpamBayes Score	5.9450035e-06
Marked as misclassified	No
Message-id	<20100624144713.GA1188@yoda.bytereef.org>
In-reply-to	<1277040424.45.0.292669416337.issue9036@psf.upfronthosting.co.za>

Content
Antoine Pitrou <report@bugs.python.org> wrote: > > Thus, > > ((unsigned char)((c) & 0xff)) and ((unsigned char)(c)) should produce > > the same results. > > If it's the case, it's probably optimized away by the compiler anyway. Yes, the asm output is the same. I was more concerned about readability. Historically, Py_CHARMASK started as ((c) & 0xff), hence the name. In r34974 UCHAR_MAX = 2*CHAR_BIT-1 = 255 was enforced. In r61936 the cast was added. > > There is no reason not to do the cast when __CHAR_UNSIGNED__ is > > defined (it will be a no-op). > > Agreed. It also reduces the opportunity for bugs :) Ok, let's say we use ((unsigned char)((c) & 0xff)) also for __CHAR_UNSIGNED__. What should the comment say about the intended argument? I've examined all occurrences in the tree and almost always Py_CHARMASK is used with either char arguments or int arguments that are directly derived from a char, so no EOF issues arise. Exceptions: =========== Index: Objects/unicodeobject.c =================================================================== --- Objects/unicodeobject.c (revision 82192) +++ Objects/unicodeobject.c (working copy) @@ -8417,6 +8417,7 @@ else if (c >= '0' && c <= '9') { prec = c - '0'; while (--fmtcnt >= 0) { + / XXX: c and fmt are Py_UNICODE / c = Py_CHARMASK(fmt++); if (c < '0' \|\| c > '9') break; Index: Modules/_json.c =================================================================== --- Modules/_json.c (revision 82192) +++ Modules/_json.c (working copy) @@ -603,6 +603,7 @@ } } else { + / XXX: c is Py_UNICODE */ char c_char = Py_CHARMASK(c); chunk = PyString_FromStringAndSize(&c_char, 1); if (chunk == NULL) {

Antoine Pitrou <report@bugs.python.org> wrote:
> > Thus,
> > ((unsigned char)((c) & 0xff)) and ((unsigned char)(c)) should produce
> > the same results.
> 
> If it's the case, it's probably optimized away by the compiler anyway.

Yes, the asm output is the same. I was more concerned about readability.

Historically, Py_CHARMASK started as ((c) & 0xff), hence the name. In r34974
UCHAR_MAX = 2**CHAR_BIT-1 = 255 was enforced. In r61936 the cast was added.

> > There is no reason not to do the cast when __CHAR_UNSIGNED__ is
> > defined (it will be a no-op).
> 
> Agreed. It also reduces the opportunity for bugs :)

Ok, let's say we use ((unsigned char)((c) & 0xff)) also for __CHAR_UNSIGNED__.

What should the comment say about the intended argument? I've examined all
occurrences in the tree and almost always Py_CHARMASK is used with either
char arguments or int arguments that are directly derived from a char, so
no EOF issues arise.

Exceptions:
===========

Index: Objects/unicodeobject.c
===================================================================
--- Objects/unicodeobject.c     (revision 82192)
+++ Objects/unicodeobject.c     (working copy)
@@ -8417,6 +8417,7 @@
                 else if (c >= '0' && c <= '9') {
                     prec = c - '0';
                     while (--fmtcnt >= 0) {
+                        /* XXX: c and *fmt are Py_UNICODE */
                         c = Py_CHARMASK(*fmt++);
                         if (c < '0' || c > '9')
                             break;

Index: Modules/_json.c
===================================================================
--- Modules/_json.c     (revision 82192)
+++ Modules/_json.c     (working copy)
@@ -603,6 +603,7 @@
             }
         }
         else {
+            /* XXX: c is Py_UNICODE */
             char c_char = Py_CHARMASK(c);
             chunk = PyString_FromStringAndSize(&c_char, 1);
             if (chunk == NULL) {

History
Date	User	Action	Args
2010-06-24 14:47:34	skrah	set	recipients: + skrah, loewis, pitrou, eric.smith
2010-06-24 14:47:29	skrah	link	issue9036 messages
2010-06-24 14:47:28	skrah	create