msg158803 - (view) |
Author: Brecht Machiels (brechtm) |
Date: 2012-04-20 07:51 |
I have subclassed int to add an extra attribute:
class Integer(int):
    def __new__(cls, value, base=10, indirect=False):
        try:
            obj = int.__new__(cls, value, base)
        except TypeError:
            obj = int.__new__(cls, value)
        return obj
    def __init__(self, value, base=10, indirect=False):
        self.indirect = indirect
Using this class in my application, int(Integer(b'0')) sometimes returns a value of 48 (= ord('0')!) or 192, instead of the correct value 0. str(Integer(b'0')) always returns '0'. This seems to only occur for the value 0. First decoding b'0' to a string, or passing int(b'0') to Integer makes no difference. The problem lies with converting an Integer(0) to an int with int().
Furthermore, this occurs in a random way. Subsequent runs will produce 48 or 192 at different points in the application (a parser). Both Python 3.2.2 and 3.2.3 behave the same (32-bit, Windows XP). Apparently, the 64-bit Windows Python 3.2.3 does not show this behavior [1]. I haven't tested on other operating systems.
I cannot seem to reproduce this in a simple test program. The following produces no output:
for i in range(100000):
    integer = int(Integer(b'0'))
    if integer > 0:
        print(integer)
Checking for the condition int(Integer()) > 0 in my application (when I know the argument to Integer is b'0') and conditionally printing int(Integer(b'0')) a number of times, the results 48 and 192 do show up now and then.
As I can't reproduce the problem in a short test program, I have attached the relevant code. It is basically a PDF parser. The output for this [2] PDF file is, for example:
b'0' 0 Integer(0) 192 0 b'0' 16853712
b'0' 0 Integer(0) 48 0 b'0' 16938088
b'0' 0 Integer(0) 192 0 b'0' 17421696
b'0' 0 Integer(0) 48 0 b'0' 23144888
b'0' 0 Integer(0) 48 0 b'0' 23185408
b'0' 0 Integer(0) 48 0 b'0' 23323272
Search for print function calls in the code to see what this represents.
[1] http://stackoverflow.com/questions/10230604/non-deterministic-behavior-of-int-subclass#comment13156508_10230604
[2] http://www.gust.org.pl/projects/e-foundry/math-support/vieth2008.pdf
|
msg158812 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 10:01 |
I can reproduce this on a 32-bit OS X build of the default branch, so it doesn't seem to be Windows specific (though it may be 32-bit specific).
Brecht, if you can find a way to reduce the size of your example at all that would be really helpful.
|
msg158814 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-04-20 10:53 |
Reproduced under 32-bit Linux.
The problem seems to be that Py_SIZE(x) == 0 when x is Integer(0), but ob_digit[0] is still supposed to be significant. There's probably some overwriting with the trailing attributes.
By forcing Py_SIZE(x) == 1, the bug disappears, but it probably breaks lots of other stuff in longobject.c.
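To make the suspected overlap concrete, here is a toy model in Python. It is only a sketch: the byte layout, the 2-byte digit width, the 4-byte pointer width, the helper names and the pointer value are all assumptions chosen for illustration, and none of this is CPython's actual allocator behaviour.

```python
import struct


def build_body(ob_size, digits, trailing_pointer):
    """Pack a toy object body: abs(ob_size) digits, then a trailing pointer.

    Hypothetical layout: 2-byte little-endian digits followed directly by a
    4-byte pointer standing in for the subclass's trailing attributes.
    """
    packed_digits = b"".join(
        struct.pack("<H", d) for d in digits[:abs(ob_size)]
    )
    return packed_digits + struct.pack("<I", trailing_pointer)


def read_digit0(body):
    """Read the first digit slot unconditionally, ignoring ob_size."""
    return struct.unpack_from("<H", body, 0)[0]


# Made-up pointer value; only its low 16 bits matter in this model.
POINTER = 0x01020030

zero_body = build_body(0, [], POINTER)   # no digit slot allocated
one_body = build_body(1, [7], POINTER)   # one real digit

print(read_digit0(one_body))   # the genuine digit
print(read_digit0(zero_body))  # bytes that belong to the trailing pointer
```

In this model the zero-sized object has no digit slot at all, so an unconditional read of the first digit returns whatever follows it in memory.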
|
msg158815 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 10:56 |
If we're accessing ob_digit[0] when Py_SIZE(x) == 0, that sounds like a bug to me.
|
msg158816 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-04-20 11:07 |
> If we're accessing ob_digit[0] when Py_SIZE(x) == 0, that sounds like a
> bug to me.
_PyLong_Copy does.
It's ok as long as the object is int(0), because it's part of the small ints and its allocated size is one digit.
The following hack seems to fix the issue here. Perhaps we can simply fix _PyLong_Copy, but I wonder how many other parts of longobject.c rely on accessing ob_digit[0].
diff --git a/Objects/longobject.c b/Objects/longobject.c
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -4194,6 +4194,8 @@ long_subtype_new(PyTypeObject *type, PyO
     n = Py_SIZE(tmp);
     if (n < 0)
         n = -n;
+    if (n == 0)
+        n = 1;
     newobj = (PyLongObject *)type->tp_alloc(type, n);
     if (newobj == NULL) {
         Py_DECREF(tmp);
diff --git a/Objects/object.c b/Objects/object.c
--- a/Objects/object.c
+++ b/Objects/object.c
@@ -1010,6 +1010,8 @@ PyObject **
         tsize = ((PyVarObject *)obj)->ob_size;
         if (tsize < 0)
             tsize = -tsize;
+        if (tsize == 0 && PyLong_Check(obj))
+            tsize = 1;
         size = _PyObject_VAR_SIZE(tp, tsize);
         dictoffset += (long)size;
@@ -1090,6 +1092,8 @@ PyObject *
         tsize = ((PyVarObject *)obj)->ob_size;
         if (tsize < 0)
             tsize = -tsize;
+        if (tsize == 0 && PyLong_Check(obj))
+            tsize = 1;
         size = _PyObject_VAR_SIZE(tp, tsize);
         dictoffset += (long)size;
|
msg158817 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 11:35 |
> _PyLong_Copy does.
Grr. So it does. That at least should be fixed, but I agree that it would be good to have the added protection of ensuring that we always allocate space for at least one limb.
We should also check whether 2.7 is susceptible.
|
msg158819 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 11:53 |
Self-contained example that fails for me on 32-bit OS X.
class Integer(int):
    def __new__(cls, value, base=10, indirect=False):
        try:
            obj = int.__new__(cls, value, base)
        except TypeError:
            obj = int.__new__(cls, value)
        return obj
    def __init__(self, value, base=10, indirect=False):
        self.indirect = indirect
integers = []
for i in range(1000):
    integer = Integer(b'0')
    integers.append(integer)
for integer in integers:
    assert int(integer) == 0
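The same reproduction can be wrapped in a helper that reports the offending values instead of stopping at the first failed assertion. This is a minimal sketch; the function name find_bad_zero_conversions is hypothetical and is not taken from the patch or the CPython test suite.

```python
class Integer(int):
    """int subclass carrying one extra attribute, as in the report."""

    def __new__(cls, value, base=10, indirect=False):
        try:
            return int.__new__(cls, value, base)
        except TypeError:
            # int() rejects an explicit base for non-string values.
            return int.__new__(cls, value)

    def __init__(self, value, base=10, indirect=False):
        self.indirect = indirect


def find_bad_zero_conversions(count=1000):
    """Return every int(Integer(b'0')) result that is not 0."""
    # Keep all instances alive at once, as the example above does.
    instances = [Integer(b'0') for _ in range(count)]
    return [int(obj) for obj in instances if int(obj) != 0]


if __name__ == "__main__":
    bad = find_bad_zero_conversions()
    print(bad if bad else "no bad conversions")
```

On an affected build the returned list would be expected to contain values such as 48 or 192; on a fixed build it should be empty.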
|
msg158822 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-04-20 12:06 |
The fix for _PyLong_Copy is the following:
diff --git a/Objects/longobject.c b/Objects/longobject.c
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -156,7 +156,7 @@ PyObject *
     if (i < 0)
         i = -(i);
     if (i < 2) {
-        sdigit ival = src->ob_digit[0];
+        sdigit ival = (i == 0) ? 0 : src->ob_digit[0];
         if (Py_SIZE(src) < 0)
             ival = -ival;
         CHECK_SMALL_INT(ival);
|
msg158823 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 12:18 |
Using MEDIUM_VALUE also works.
I'll cook up a patch tonight, after work.
diff -r 6762b943ee59 Objects/longobject.c
--- a/Objects/longobject.c Tue Apr 17 21:42:07 2012 -0400
+++ b/Objects/longobject.c Fri Apr 20 13:18:01 2012 +0100
@@ -156,9 +156,7 @@
     if (i < 0)
         i = -(i);
     if (i < 2) {
-        sdigit ival = src->ob_digit[0];
-        if (Py_SIZE(src) < 0)
-            ival = -ival;
+        sdigit ival = MEDIUM_VALUE(src);
         CHECK_SMALL_INT(ival);
     }
     result = _PyLong_New(i);
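For readers unfamiliar with MEDIUM_VALUE, the property that matters here is that it yields 0 for a zero-sized object without reading ob_digit. The Python function below is a rough model of that behaviour, not the macro's actual text (which is not quoted in this thread); the function name and the list-based stand-in for ob_digit are illustrative assumptions.

```python
def medium_value(ob_size, ob_digit):
    """Model the value of an int that has at most one digit.

    ob_size plays the role of Py_SIZE(x): its sign is the sign of the
    number and its magnitude is the number of digits in use.
    """
    assert -1 <= ob_size <= 1, "only valid for zero- or one-digit ints"
    if ob_size < 0:
        return -ob_digit[0]
    if ob_size == 0:
        return 0  # the digit storage is never touched in this branch
    return ob_digit[0]


# An empty list stands in for a zero instance with no digit allocated.
print(medium_value(0, []))    # 0
print(medium_value(1, [5]))   # 5
print(medium_value(-1, [5]))  # -5
```

Because the zero case returns before any digit is read, it does not matter whether a digit slot was allocated for the object.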
|
msg158854 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 17:23 |
Here's the patch. I searched through the rest of Objects/longobject.c for other occurrences of [0], and found nothing else that looked suspicious, so I'm reasonably confident that this was an isolated case.
|
msg158861 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 17:52 |
Also, Python 2.7 looks safe here.
|
msg158863 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2012-04-20 17:57 |
The patch works fine here, and the test exercises the issue correctly.
|
msg158877 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2012-04-20 19:49 |
The patch looks good to me.
|
msg158886 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2012-04-20 20:44 |
New changeset cdcc6b489862 by Mark Dickinson in branch '3.2':
Issue #14630: Fix an incorrect access of ob_digit[0] for a zero instance of an int subclass.
http://hg.python.org/cpython/rev/cdcc6b489862
New changeset c7b0f711dc15 by Mark Dickinson in branch 'default':
Issue #14630: Merge fix from 3.2.
http://hg.python.org/cpython/rev/c7b0f711dc15
|
msg158888 - (view) |
Author: Mark Dickinson (mark.dickinson) * |
Date: 2012-04-20 20:45 |
Fixed. Thanks Brecht for the report (and Antoine for diagnosing the problem).
|
Date | User | Action | Args
2022-04-11 14:57:29 | admin | set | github: 58835
2012-04-20 21:02:16 | mark.dickinson | set | status: open -> closed
2012-04-20 20:45:28 | mark.dickinson | set | resolution: fixed; messages: + msg158888
2012-04-20 20:44:30 | python-dev | set | nosy: + python-dev; messages: + msg158886
2012-04-20 19:49:51 | skrah | set | messages: + msg158877
2012-04-20 17:57:51 | pitrou | set | messages: + msg158863
2012-04-20 17:52:15 | mark.dickinson | set | stage: needs patch -> commit review
2012-04-20 17:52:08 | mark.dickinson | set | messages: + msg158861; versions: - Python 2.7
2012-04-20 17:24:06 | mark.dickinson | set | files: + issue14630.patch; keywords: + patch
2012-04-20 17:23:55 | mark.dickinson | set | messages: + msg158854
2012-04-20 12:18:37 | mark.dickinson | set | assignee: mark.dickinson; messages: + msg158823
2012-04-20 12:06:37 | pitrou | set | components: + Interpreter Core, - None; stage: needs patch
2012-04-20 12:06:28 | pitrou | set | assignee: mark.dickinson -> (no value); messages: + msg158822
2012-04-20 11:53:11 | mark.dickinson | set | messages: + msg158819
2012-04-20 11:45:03 | mark.dickinson | set | assignee: mark.dickinson
2012-04-20 11:35:59 | mark.dickinson | set | messages: + msg158817; versions: + Python 2.7
2012-04-20 11:07:07 | pitrou | set | messages: + msg158816
2012-04-20 10:56:45 | mark.dickinson | set | messages: + msg158815
2012-04-20 10:53:32 | pitrou | set | nosy: + skrah, pitrou; messages: + msg158814
2012-04-20 10:01:56 | mark.dickinson | set | priority: normal -> high; messages: + msg158812; versions: + Python 3.3
2012-04-20 08:07:05 | mark.dickinson | set | nosy: + mark.dickinson
2012-04-20 07:51:22 | brechtm | create |