Message108521
Stefan Krah wrote:
>
> Stefan Krah <stefan-usenet@bytereef.org> added the comment:
>
> Antoine Pitrou <report@bugs.python.org> wrote:
>>> Thus,
>>> ((unsigned char)((c) & 0xff)) and ((unsigned char)(c)) should produce
>>> the same results.
>>
>> If it's the case, it's probably optimized away by the compiler anyway.
>
> Yes, the asm output is the same. I was more concerned about readability.
>
> Historically, Py_CHARMASK started as ((c) & 0xff), hence the name. In r34974
> UCHAR_MAX = 2**CHAR_BIT-1 = 255 was enforced. In r61936 the cast was added.
>
>>> There is no reason not to do the cast when __CHAR_UNSIGNED__ is
>>> defined (it will be a no-op).
>>
>> Agreed. It also reduces the opportunity for bugs :)
>
> Ok, let's say we use ((unsigned char)((c) & 0xff)) also for __CHAR_UNSIGNED__.
>
> What should the comment say about the intended argument? I've examined all
> occurrences in the tree and almost always Py_CHARMASK is used with either
> char arguments or int arguments that are directly derived from a char, so
> no EOF issues arise.
Why not just make the casts in those cases explicit instead of
using Py_CHARMASK ?
> Exceptions:
> ===========
>
> Index: Objects/unicodeobject.c
> ===================================================================
> --- Objects/unicodeobject.c (revision 82192)
> +++ Objects/unicodeobject.c (working copy)
> @@ -8417,6 +8417,7 @@
> else if (c >= '0' && c <= '9') {
> prec = c - '0';
> while (--fmtcnt >= 0) {
> + /* XXX: c and *fmt are Py_UNICODE */
> c = Py_CHARMASK(*fmt++);
If both are Py_UNICODE, why do you need the cast at all ?
Note that the above use also appears to be wrong, since if *fmt
is Py_UNICODE, the following if() will also match if the higher
order bytes of the Py_UNICODE value are non-0.
> if (c < '0' || c > '9')
> break;
>
> Index: Modules/_json.c
> ===================================================================
> --- Modules/_json.c (revision 82192)
> +++ Modules/_json.c (working copy)
> @@ -603,6 +603,7 @@
> }
> }
> else {
> + /* XXX: c is Py_UNICODE */
> char c_char = Py_CHARMASK(c);
In this case you should get a compiler warning due to the
different signedness of the arguments.
Same comment as above: the high order bytes of c are lost
in this conversion.
Why can't _json.c use one of the existing Unicode conversion
APIs ?
> chunk = PyString_FromStringAndSize(&c_char, 1);
> if (chunk == NULL) {
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue9036>
> _______________________________________
> _______________________________________________
> Python-bugs-list mailing list
> Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/mal%40egenix.com |
|
Date |
User |
Action |
Args |
2010-06-24 15:25:05 | lemburg | set | recipients:
+ lemburg, loewis, pitrou, eric.smith, skrah |
2010-06-24 15:25:04 | lemburg | link | issue9036 messages |
2010-06-24 15:25:02 | lemburg | create | |
|