Issue 5377: Strange behavior when performing int on a Decimal made from -sys.maxint-1

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/49627

classification

Title:	Strange behavior when performing int on a Decimal made from -sys.maxint-1
Type:	enhancement	Stage:	test needed
Components:	Interpreter Core	Versions:	Python 2.7, Python 2.6

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:	mark.dickinson	Nosy List:	Carl.Friedrich.Bolz, debedb, mark.dickinson, rhettinger, terry.reedy, vstinner
Priority:	normal	Keywords:	patch

Created on 2009-02-26 20:24 by debedb, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
negmaxintbug.py	debedb, 2009-02-26 20:24	Test Case
force_int-4.patch	vstinner, 2009-06-08 23:42

Messages (21)
msg82773 - (view)	Author: Gregory Golberg (debedb)	Date: 2009-02-26 20:24
On some Python builds (2.5.2 and 2.6.1) the following program:

import sys
from decimal import Decimal

def show(n):
    print type(n)
    d = Decimal(str(n))
    i = int(d)
    t = type(i)
    print t
    i2 = int(i)
    t2 = type(i2)
    print t2

n = - sys.maxint - 1
show(n)

prints

<type 'int'>
<type 'long'>
<type 'int'>

While on 2.4 and 2.5.1 it prints:

<type 'int'>
<type 'int'>
<type 'int'>

This seems to happen only with the number -sys.maxint-1!

This has been tested with the following builds:

* "Strange" result (with long):
  * 2.6.1 (r261:67515, Feb 26 2009, 12:21:28) [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)]
  * 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
  * 2.5.2 (r252:60911, Jul 31 2008, 17:28:52) [GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)]
  * 2.5.2 and 2.6.1 on Windows Server 2003

* "Expected" result (all int):
  * 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
  * 2.5.1 (r251:54863, Oct 15 2007, 13:50:22) [GCC 3.4.6 20060404 (Red Hat 3.4.6-3)]
  * 2.5.1 (r251:54863, Jul 31 2008, 23:17:40) [GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)]
  * 2.4.5 (#2, Aug 1 2008, 02:20:59) [GCC 4.3.1]
  * 2.4.5 (#1, Jul 22 2008, 08:30:02) [GCC 3.4.3 (csl-sol210-3_4-20050802)]
  * 2.4.3 (#1, Sep 21 2007, 20:05:43) [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)]
msg82787 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-02-26 23:32
For a Decimal object (d), int(d) calls d.__int__(). In your example, d has the attributes:

* _sign=1 (negative)
* _exp=0 (10^0=1)
* _int='2147483648'

d.__int__() uses s*int(self._int)*10**self._exp <=> -(int('2147483648')). Since int('2147483648') creates a long, you finally get a long instead of an integer.

Workaround to get a small integer even with -2147483648: int(int(d)) ;-)

For me, it's not a bug because __int__() can return a long! The following code works in Python 2.5 and 2.6:

class A:
    def __int__(self):
        return 10**20
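The arithmetic Victor describes can be retraced on a current interpreter. What follows is only a sketch, not the decimal module's own code: it reads the sign, digits and exponent through the public Decimal.as_tuple() method rather than the private _sign, _int and _exp attributes, and the helper name decimal_to_int is made up for this illustration. It is written for Python 3, where int and long are one type, so it shows the computation but cannot reproduce the int/long difference reported here.

```python
from decimal import Decimal


def decimal_to_int(d):
    """Rebuild an integer from a Decimal's sign, digits and exponent.

    Hypothetical helper for illustration; it only handles finite
    values whose exponent is >= 0.
    """
    sign, digits, exponent = d.as_tuple()
    # NaN and infinity report a string exponent, so reject those too.
    if not isinstance(exponent, int) or exponent < 0:
        raise ValueError("only finite Decimals with exponent >= 0 are handled")
    s = (-1) ** sign  # sign is 0 for positive, 1 for negative
    coefficient = int("".join(str(digit) for digit in digits))
    return s * coefficient * 10 ** exponent


d = Decimal("-2147483648")
print(d.as_tuple())
print(decimal_to_int(d))  # -2147483648
```

The key point is the order of operations: the coefficient 2147483648 is built first as a positive number, which is one more than a 32-bit sys.maxint, and only then negated.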
msg82788 - (view)	Author: Gregory Golberg (debedb)	Date: 2009-02-26 23:38
Well, yes, the workaround works, but the question is why would the second int() return an int, if it's indeed a long? And why the difference in this behavior between 2.5.1 and 2.5.2.
msg82800 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-02-27 01:09
> the question is why would the second int() return an int,
> if it's indeed a long?

Python doesn't convert long to int even if the long can fit in an int. Example:

>>> type(1)
<type 'int'>
>>> type(1L)
<type 'long'>
>>> type(1L+1)
<type 'long'>
>>> type(2)
<type 'int'>

Even if 1L and 2L can fit in an int, Python keeps the long type.

> why the difference in this behavior between 2.5.1 and 2.5.2

No idea.

You can simplify your test script with:

# example with python 2.5.1 (32 bits CPU)
>>> type(-int('2147483648'))
<type 'long'>
>>> sys.maxint
2147483647

On a 64 bits CPU, sys.maxint is much bigger, so you don't have the problem with -2147483648 but with -9223372036854775808:

# example with python 2.5.2 (64 bits CPU)
>>> sys.maxint + 1
9223372036854775808L
>>> -int('9223372036854775808')
-9223372036854775808L
>>> int(-int('9223372036854775808'))
-9223372036854775808
msg82803 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-02-27 01:13
Anyway, the behaviour is correct. But ok, it's "strange" because unexpected. You have to understand the fact the long=>int conversion is manual :-/ Decimal.__int__ might force return int(result) at the end to avoid problem with -sys.maxint, but is it really important? I don't think so. Python3 doesn't have this problem ;-)
msg82829 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-02-27 11:38
Why do you care whether the result is an int or a long in this case? Does it affect any code that you know of in a meaningful way?

> And why the difference in this behavior between 2.5.1 and 2.5.2.

There were some fairly major changes (many bugfixes, new functions to comply with an updated specification, for example, pow, log and log10) to the decimal module between 2.5 and 2.6, and the majority of those changes were also backported to 2.5.2. This particular change was part of a set of changes that changed the internal representation of the coefficient of a Decimal instance from a tuple to a string, for speed reasons. See r59144.

As Victor says, this is trivial to fix; I'm not convinced that it's actually worth fixing, though. In Python 2.5, the difference between ints and longs should be almost invisible anyway. It's nice (for performance reasons) if small integers are represented as ints rather than longs. Since this one's only just a small integer, it's difficult to care much. :-)
msg82830 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-02-27 11:50
For anyone who does care about this, it should be noted that the Fraction type has similar issues. The following comes from Python 2.7 on a 64-bit machine:

>>> int(Fraction(2**63-1))
9223372036854775807L
>>> int(2**63-1)
9223372036854775807
msg82869 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2009-02-27 20:47
Unless there is a discrepancy between doc and behavior, this strikes me as an unspecified implementation detail. If so, it should be either closed or changed to a specific feature request.
msg82911 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-02-28 14:42
@tjreedy: Do you expect conversion to small int if __int__() result fits in a small int?

----
class A:
    def __int__(self):
        return 1L

x=int(A())
print repr(x), type(x)
----

Result with Python 2.5.1:

1L <type 'long'>
msg82913 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-02-28 14:55
The behaviour doesn't contradict the documentation, as far as I can tell, so I agree with Terry that this is not a bug.

If we want the result from the built-in int function to have type int whenever possible (that is, whenever the result is in the closed interval [-sys.maxint-1, sys.maxint]), it doesn't seem right that the burden for ensuring this should lie with individual __int__ methods: instead, the general machinery for implementing the built-in int function should check any result of type long to see if it fits in an int, and convert if so.

Is this desirable?
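The proposal amounts to a range check applied to whatever __int__ returns. Here is a rough model of that check, under stated assumptions: it is written for Python 3, which has no separate long type, so it only reports which Python 2 type the result would get. The function name result_type_name and the default maxint (a 32-bit C long) are assumptions for illustration; the real change would live in the interpreter's C code and is not shown here.

```python
def result_type_name(value, maxint=2**31 - 1):
    """Name the Python 2 type an int() result would get under the proposal.

    Hypothetical model: values in [-maxint-1, maxint] become 'int',
    everything else stays 'long'.
    """
    if -maxint - 1 <= value <= maxint:
        return "int"
    return "long"


print(result_type_name(-2**31))      # int  (this is -sys.maxint-1 on 32 bits)
print(result_type_name(2**31))       # long (one past sys.maxint)
print(result_type_name(-2**31 - 1))  # long (one below the interval)
```

Placing the check in the shared int() machinery, rather than in each __int__ method, would cover Decimal and any user-defined class at once.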
msg84231 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-03-26 23:36
> The general machinery for implementing the built-in int function
> should check any result of type long to see if it fits in an int,
> and convert if so.

The attached patch tries to convert long to int, and so it fixes the initial problem: assert isinstance(int(Decimal(-sys.maxint-1)), int).

I used benchmark tools dedicated to testing integers:

Unpatched:
pidigits.py: 4612.0 ms
bench_int.py: 2743.5 ms

Patched:
pidigits.py: 4623.8 ms (0.26% slower)
bench_int.py: 2754.5 ms (0.40% slower)

So for intensive integer operations, the overhead is low. Using a more generic benchmark tool (pybench?), you might not be able to see the difference ;-)

I'm +0 for this patch because it fixes a very rare case: 1 case out of (sys.maxint + 1) × 2, about 0.00000002% with maxint = 2^31 - 1.
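The "very rare case" figure can be checked with a few lines of arithmetic. This sketch assumes a 32-bit C long, so the largest int is 2^31 - 1; the variable names are chosen here for illustration.

```python
maxint = 2**31 - 1                # largest Python 2 int on a 32-bit C long
total_values = (maxint + 1) * 2   # count of values in [-maxint-1, maxint]
share_percent = 100.0 / total_values

print(total_values)               # 4294967296
print("%.8f%%" % share_percent)   # 0.00000002%
```

Only one value in the whole int range, -maxint-1, is affected, because it is the only one whose absolute value does not itself fit in an int.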
msg84233 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-03-26 23:39
I added the two benchmark tools to my own public SVN:

http://haypo.hachoir.org/trac/browser/misc/bench_int.py (improved version of the script attached to issue #4294)
http://haypo.hachoir.org/trac/browser/misc/pidigits.py (improved version of the script attached to issue #5512)

If you know a better place for these benchmarks, feel free to reupload them somewhere else.
msg84297 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-03-28 04:22
Thanks for the patch, Victor. I think this is the right thing to do, though I'm still not sure why anyone would care about getting longs instead of ints back from int(x).

Comments and questions:

(0) Please could you add some tests!

(1) Shouldn't the first line you added include a check for res == NULL?

(2) It looks as though the patched code ends up calling PyLong_Check twice when __int__ returns a long. Can you find a clear rewrite that avoids this duplication?

By the way, I realized after posting my last comment that the issue with Fraction has nothing to do with extreme int values. For example, with the current trunk (not including Victor's patch):

>>> int(Fraction(2L))
2L
>>> int(int(Fraction(2L)))
2

I don't think this should be considered a bug in Fraction; I think Victor's solution of making the int() machinery always return int when possible is the right one here. The need to call int(int(x)) if you really want an int seems a little ugly.
msg84376 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-03-29 11:38
> I'm still not sure why anyone would care about getting longs
> instead of ints back from int(x)

It's strange that sometimes we need to write int(int(obj)) to get an integer :-/ I usually use int(x) to convert x to an integer (type 'int' and not 'long').

> (0) Please could you add some tests!

done

> (1) Shouldn't the first line you added include a check
> for res == NULL?

segfault... ooops :-) fixed

> (2) It looks as though the patched code ends up calling
> PyLong_Check twice (...)

done

See updated patch.
msg84377 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-03-29 11:42
(oops, my patch v2 includes an unrelated change)
msg84424 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-03-29 18:48
Thanks, Victor.

A couple of things:

- I'm getting a test failure in test_class

- you should probably be using sys.maxint rather than sys.maxsize: the two aren't necessarily the same. (E.g., on 64-bit Windows, I believe that sys.maxint is 2**31-1 while sys.maxsize is 2**63-1.)

- This still doesn't fix the case of int(Fraction(2L)).
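The sys.maxint versus sys.maxsize distinction can still be observed on a current interpreter, with one caveat: Python 3 has no sys.maxint. The sketch below therefore derives the largest C long from ctypes, on the understanding that Python 2's sys.maxint was the largest value of a C long; the variable names are chosen here for illustration.

```python
import ctypes
import sys

# Largest value of a C long on this platform; this is what
# Python 2 exposed as sys.maxint.
c_long_bits = 8 * ctypes.sizeof(ctypes.c_long)
c_long_max = 2 ** (c_long_bits - 1) - 1

print(c_long_max)
print(sys.maxsize)
# May print False on platforms where a C long is narrower than Py_ssize_t,
# such as 64-bit Windows.
print(c_long_max == sys.maxsize)
```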
msg89125 - (view)	Author: STINNER Victor (vstinner) *	Date: 2009-06-08 23:42
> Thanks, Victor

You're welcome :-)

> - I'm getting a test failure in test_class

fixed

> - you should probably be using sys.maxint rather than sys.maxsize

done

> This still doesn't fix the case of int(Fraction(2L))

fixed: Fraction uses __trunc__ rather than __int__.

See updated patch: force_int-4.patch
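Fraction's separate truncation hook is still visible today. This short Python 3 sketch calls it through math.trunc(), the documented way to invoke __trunc__; it deliberately avoids relying on int() falling back to __trunc__, since that fallback is not guaranteed across Python 3 versions.

```python
import math
from fractions import Fraction

print(hasattr(Fraction, "__trunc__"))  # True
print(math.trunc(Fraction(7, 2)))      # 3
print(math.trunc(Fraction(-7, 2)))     # -3 (truncation is toward zero)
```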
msg93866 - (view)	Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) *	Date: 2009-10-11 18:04
PyPy is a bit of a special case, because it cares about the distinction of int and long in the translation toolchain. Nevertheless, this behavior has been annoying to us.
msg93867 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2009-10-11 18:07
Carl, thanks for that. I was just thinking about abandoning this issue as not worth fixing. I need to look at Victor's patch again, but I recall that there were still some issues: e.g., if the __int__ method of some class returns a bool, that still ends up getting returned as a bool rather than an int. Getting everything exactly right seemed fiddly enough to make it not worth the effort. Would the bool/int distinction matter to PyPy?
msg93869 - (view)	Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) *	Date: 2009-10-11 18:13
[...]
> Would the bool/int distinction matter to PyPy?

No, it's really mostly about longs and ints, because RPython does not have automatic overflowing of ints to longs (the goal is really to translate ints to C longs with normal C overflow behaviour). I would understand if you decide for wontfix, because you are not supposed to care about int/long and as I said, PyPy is a special case.

Thanks,

Carl Friedrich
msg103026 - (view)	Author: Mark Dickinson (mark.dickinson) *	Date: 2010-04-13 09:19
Closing: it's too late for Python 2.x.

History
Date	User	Action	Args
2022-04-11 14:56:46	admin	set	github: 49627
2010-04-13 09:19:52	mark.dickinson	set	status: open -> closed resolution: out of date messages: + msg103026
2010-01-10 13:27:30	mark.dickinson	set	priority: low -> normal
2009-10-11 18:13:09	Carl.Friedrich.Bolz	set	messages: + msg93869 title: Strange behavior when performing int on a Decimal made from -sys.maxint-1 -> Strange behavior when performing int on a Decimal made from -sys.maxint-1
2009-10-11 18:08:00	mark.dickinson	set	messages: + msg93867
2009-10-11 18:04:28	Carl.Friedrich.Bolz	set	nosy: + Carl.Friedrich.Bolz messages: + msg93866
2009-06-08 23:42:23	vstinner	set	files: - force_int-3.patch
2009-06-08 23:42:16	vstinner	set	files: + force_int-4.patch messages: + msg89125
2009-03-29 18:48:58	mark.dickinson	set	messages: + msg84424
2009-03-29 11:42:19	vstinner	set	files: - force_int.patch
2009-03-29 11:42:14	vstinner	set	files: + force_int-3.patch messages: + msg84377
2009-03-29 11:39:17	vstinner	set	files: - force_int-2.patch
2009-03-29 11:38:42	vstinner	set	files: + force_int-2.patch messages: + msg84376
2009-03-28 12:22:59	mark.dickinson	set	components: + Interpreter Core, - Library (Lib)
2009-03-28 12:22:48	mark.dickinson	set	priority: low assignee: mark.dickinson
2009-03-28 04:22:05	mark.dickinson	set	messages: + msg84297 stage: test needed
2009-03-26 23:39:11	vstinner	set	messages: + msg84233
2009-03-26 23:36:31	vstinner	set	files: + force_int.patch keywords: + patch messages: + msg84231
2009-02-28 14:56:34	mark.dickinson	set	type: behavior -> enhancement
2009-02-28 14:56:00	mark.dickinson	set	messages: + msg82913
2009-02-28 14:42:01	vstinner	set	messages: + msg82911
2009-02-27 20:47:27	terry.reedy	set	nosy: + terry.reedy messages: + msg82869
2009-02-27 11:50:57	mark.dickinson	set	messages: + msg82830
2009-02-27 11:38:27	mark.dickinson	set	nosy: + rhettinger, mark.dickinson messages: + msg82829 components: + Library (Lib), - Interpreter Core versions: + Python 2.7, - Python 2.5
2009-02-27 07:15:51	theller	set	assignee: theller -> (no value)
2009-02-27 07:15:34	theller	set	nosy: - theller components: - ctypes
2009-02-27 01:13:39	vstinner	set	messages: + msg82803
2009-02-27 01:13:22	vstinner	set	messages: - msg82802
2009-02-27 01:12:47	vstinner	set	messages: + msg82802
2009-02-27 01:09:36	vstinner	set	messages: + msg82800
2009-02-26 23:38:39	debedb	set	messages: + msg82788
2009-02-26 23:32:55	vstinner	set	nosy: + vstinner messages: + msg82787
2009-02-26 20:24:27	debedb	create