This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Strange behavior when performing int on a Decimal made from -sys.maxint-1
Type: enhancement
Stage: test needed
Components: Interpreter Core
Versions: Python 2.7, Python 2.6
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: Carl.Friedrich.Bolz, debedb, mark.dickinson, rhettinger, terry.reedy, vstinner
Priority: normal
Keywords: patch

Created on 2009-02-26 20:24 by debedb, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name | Uploaded | Description
negmaxintbug.py | debedb, 2009-02-26 20:24 | Test Case
force_int-4.patch | vstinner, 2009-06-08 23:42 |
Messages (21)
msg82773 - (view) Author: Gregory Golberg (debedb) Date: 2009-02-26 20:24
On some Python builds (2.5.2 and 2.6.1) the following program:

import sys
from decimal import Decimal

def show(n):
    print type(n)
    d = Decimal(str(n))
    i = int(d)
    t = type(i)
    print t
    i2 = int(i)
    t2 = type(i2)
    print t2

n = - sys.maxint - 1
show(n)

prints

<type 'int'>
<type 'long'>
<type 'int'>

While on 2.4 and 2.5.1 it prints:

<type 'int'>
<type 'int'>
<type 'int'>

This seems to happen only with the number -sys.maxint-1!

This has been tested with the following builds:

*** "Strange" result (with long): ***

2.6.1 (r261:67515, Feb 26 2009, 12:21:28) [GCC 4.2.4 (Ubuntu
4.2.4-1ubuntu3)]

2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]

2.5.2 (r252:60911, Jul 31 2008, 17:28:52) [GCC 4.2.3 (Ubuntu
4.2.3-2ubuntu7)]

2.5.2 and 2.6.1 on Windows Server 2003

*** "Expected" result (all int): ***

2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] 

2.5.1 (r251:54863, Oct 15 2007, 13:50:22) [GCC 3.4.6 20060404 (Red Hat
3.4.6-3)]

2.5.1 (r251:54863, Jul 31 2008, 23:17:40) [GCC 4.1.3 20070929
(prerelease) (Ubuntu 4.1.2-16ubuntu2)] 

2.4.5 (#2, Aug  1 2008, 02:20:59) [GCC 4.3.1] 

2.4.5 (#1, Jul 22 2008, 08:30:02) [GCC 3.4.3 (csl-sol210-3_4-20050802)]

2.4.3 (#1, Sep 21 2007, 20:05:43) [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)]
msg82787 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-26 23:32
For a Decimal object (d), int(d) calls d.__int__(). In your example, d 
has the attributes:
* _sign=1 (negative)
* _exp=0 (10^0=1)
* _int='2147483648'

d.__int__() uses s*int(self._int)*10**self._exp 
<=> -(int('2147483648')). Since int('2147483648') creates a long, you 
finally get a long instead of an integer.
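
The computation described above can be sketched with the public Decimal.as_tuple() API instead of the private _sign/_int/_exp attributes. This is only an illustration, written so that it also runs on Python 3 (where int and long are unified, so only the arithmetic carries over). The helper name decimal_to_int is made up for this sketch and is not part of the decimal module.

```python
from decimal import Decimal


def decimal_to_int(d):
    """Rebuild the integer value of an integral Decimal.

    Hypothetical helper for illustration only; it assumes a finite
    Decimal with a non-negative exponent, like the one in the report.
    """
    sign, digits, exponent = d.as_tuple()
    # The coefficient is converted first, e.g. int('2147483648') ...
    magnitude = int("".join(str(digit) for digit in digits))
    value = magnitude * 10 ** exponent
    # ... and the sign is applied only afterwards.
    return -value if sign else value


d = Decimal(str(-2**31))
print(d.as_tuple())
print(decimal_to_int(d))
```

On a 32-bit Python 2 build the intermediate magnitude 2147483648 does not fit in an int, so it is a long, and negating a long gives a long even though the final value would fit.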

Workaround to get a small integer even with -2147483648: 
int(int(d)) ;-)

For me, it's not a bug because __int__() can return a long! The 
following code works in Python 2.5 and 2.6:
   class A:
       def __int__(self):
           return 10**20
msg82788 - (view) Author: Gregory Golberg (debedb) Date: 2009-02-26 23:38
Well, yes, the workaround works, but the question is why would the
second int() return an int, if it's indeed a long? And why the
difference in this behavior between 2.5.1 and 2.5.2?
msg82800 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-27 01:09
> the question is why would the second int() return an int, 
> if it's indeed a long?

Python doesn't convert long to int even if the long can fit in an int. 
Example:

>>> type(1)
<type 'int'>
>>> type(1L)
<type 'long'>
>>> type(1L+1)
<type 'long'>
>>> type(2)
<type 'int'>

Even if 1L and 2L can fit in an int, Python keeps the long type.

> why the difference in this behavior between 2.5.1 and 2.5.2

No idea. You can simplify your test script with:

# example with python 2.5.1 (32 bits CPU)
>>> type(-int('2147483648'))
<type 'long'>
>>> sys.maxint

On a 64-bit CPU, sys.maxint is much bigger, so you don't have the problem
with -2147483648 but with -9223372036854775808:

# example with python 2.5.2 (*64 bits CPU*)
>>> sys.maxint + 1
9223372036854775808L
>>> -int('9223372036854775808')
-9223372036854775808L
>>> int(-int('9223372036854775808'))
-9223372036854775808
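
Why only this one value is affected can be checked with a small range test. The sketch below runs on Python 3 and only models the fixed-width range of the Python 2 int type; fits_in_signed is a hypothetical helper name, and the widths 32 and 64 are assumptions matching the two builds shown above.

```python
def fits_in_signed(value, bits):
    """Return True if value fits in a signed two's-complement integer of that width."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return lowest <= value <= highest


for bits in (32, 64):
    magnitude = 1 << (bits - 1)  # sys.maxint + 1 on a build of that width
    # The positive magnitude overflows, but its negation is still in range.
    print(bits, fits_in_signed(magnitude, bits), fits_in_signed(-magnitude, bits))
```

The magnitude sys.maxint + 1 is the only one whose negation fits while the magnitude itself does not, which is why parsing the digits first and negating afterwards leaves a long behind for exactly this value.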
msg82803 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-27 01:13
Anyway, the behaviour is correct. But ok, it's "strange" because it is
unexpected. You have to understand that the long=>int conversion
is manual :-/ Decimal.__int__ could force a "return int(result)" at the
end to avoid the problem with -sys.maxint-1, but is it really important? I
don't think so. Python3 doesn't have this problem ;-)
msg82829 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-02-27 11:38
Why do you care whether the result is an int or a long in this case? 
Does it affect any code that you know of in a meaningful way?

> And why the difference in this behavior between 2.5.1 and 2.5.2.

There were some fairly major changes (many bugfixes, new functions to
comply with an updated specification, for example, pow, log and log10)
to the decimal module between 2.5 and 2.6, and the majority of those
changes were also backported to 2.5.2.  This particular change was part
of a set of changes that changed the internal representation of the
coefficient of a Decimal instance from a tuple to a string, for speed
reasons.  See r59144.

As Victor says, this is trivial to fix;  I'm not convinced that it's
actually worth fixing, though.  In Python 2.5, the difference between
ints and longs should be almost invisible anyway.  It's nice (for
performance reasons) if small integers are represented as ints rather
than longs.  Since this one's only just a small integer, it's difficult
to care much.  :-)
msg82830 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-02-27 11:50
For anyone who does care about this, it should be noted that
the Fraction type has similar issues.  The following comes from Python
2.7 on a 64-bit machine:

>>> int(Fraction(2**63-1))
9223372036854775807L
>>> int(2**63-1)
9223372036854775807
msg82869 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2009-02-27 20:47
Unless there is a discrepancy between doc and behavior, this strikes me
as an unspecified implementation detail.  If so, it should be either
closed or changed to a specific feature request.
msg82911 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-02-28 14:42
@tjreedy: Do you expect conversion to small int if __int__() result 
fits in a small int?

----
class A:
    def __int__(self):
        return 1L

x=int(A())
print repr(x), type(x)
----

Result with Python 2.5.1: 1L <type 'long'>
msg82913 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-02-28 14:55
The behaviour doesn't contradict the documentation, as far as I can
tell, so I agree with Terry that this is not a bug.

If we want the result from the built-in int function to have type int
whenever possible (that is, whenever the result is in the closed
interval [-sys.maxint-1, sys.maxint]), it doesn't seem right that the
burden for ensuring this should lie with individual __int__ methods:
instead, the general machinery for implementing the built-in int
function should check any result of type long to see if it fits in an
int, and convert if so.

Is this desirable?
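
The check proposed above can be modelled roughly as follows. Python 3 has no separate long type, so this sketch stands in for the Python 2 types with the strings 'int' and 'long'. The name demote_if_possible and the 32-bit value of sys.maxint are assumptions for illustration; this is not the CPython implementation.

```python
MAXINT_32 = 2**31 - 1  # assumed sys.maxint of a 32-bit Python 2 build


def demote_if_possible(kind, value, maxint=MAXINT_32):
    """Model of a post-processing step applied to the result of __int__.

    `kind` is 'int' or 'long' and stands in for the Python 2 type of the
    value that __int__ returned. Hypothetical helper, not CPython code.
    """
    if kind == "long" and -maxint - 1 <= value <= maxint:
        return ("int", value)
    return (kind, value)


print(demote_if_possible("long", -2**31))  # in range: becomes an int
print(demote_if_possible("long", 2**31))   # out of range: stays a long
print(demote_if_possible("long", 1))       # in range: becomes an int
```

Placing the check in the shared int() machinery means that Decimal, Fraction and user-defined classes would all benefit without changing their own conversion methods.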
msg84231 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-26 23:36
> The general machinery for implementing the built-in int function
> should check any result of type long to see if it fits in an int,
> and convert if so.

The attached patch tries to convert long to int, and so it fixes the initial
problem:
  assert isinstance(int(Decimal(-sys.maxint-1)), int)

I used benchmark tools dedicated to test integers:

Unpatched:
  pidigits.py: 4612.0 ms
  bench_int.py: 2743.5 ms

Patched:
  pidigits.py: 4623.8 ms (0.26% slower)
  bench_int.py: 2754.5 ms (0.40% slower)

So for intensive integer operations, the overhead is low. Using a more 
generic benchmark tool (pybench?), you might not be able to see the 
difference ;-)

I'm +0 for this patch because it fixes a very rare case:
   1 case out of (sys.maxint + 1) × 2
   0.00000002% with maxint=2^31
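
The figure above can be reproduced with a few lines of arithmetic. This sketch assumes sys.maxint = 2**31 - 1, the usual value on a 32-bit build, so that the number of representable int values is exactly 2**32.

```python
maxint = 2**31 - 1                 # assumed 32-bit sys.maxint
total_values = (maxint + 1) * 2    # every value in [-maxint-1, maxint]
share_percent = 100.0 / total_values

print(total_values)
print("%.8f%%" % share_percent)
```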
msg84233 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-26 23:39
I added the two benchmark tools to my own public SVN:

http://haypo.hachoir.org/trac/browser/misc/bench_int.py
(improved version of the script attached to issue #4294)

http://haypo.hachoir.org/trac/browser/misc/pidigits.py
(improved version of the script attached to issue #5512)

If you know a better place for these benchmarks, feel free to reupload
them somewhere else.
msg84297 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-03-28 04:22
Thanks for the patch, Victor.  I think this is the right thing to do, 
though I'm still not sure why anyone would care about getting longs 
instead of ints back from int(x).

Comments and questions:

(0) Please could you add some tests!
(1) Shouldn't the first line you added include a check for res == NULL?  
(2) It looks as though the patched code ends up calling PyLong_Check twice 
when __int__ returns a long.  Can you find a clear rewrite that avoids 
this duplication?

By the way, I realized after posting my last comment that the issue with 
Fraction has nothing to do with extreme int values.  For example, with the 
current trunk (not including Victor's patch):

>>> int(Fraction(2L))
2L
>>> int(int(Fraction(2L)))
2

I don't think this should be considered a bug in Fraction---I think Victor's
solution of making the int() machinery always return int when possible is 
the right one here.  The need to call int(int(x)) if you *really* want an 
int seems a little ugly.
msg84376 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-29 11:38
> I'm still not sure why anyone would care about getting longs
> instead of ints back from int(x)

It's strange that sometimes we need to write int(int(obj)) to get an
integer :-/ I usually use int(x) to convert x to an integer (type 'int'
and not 'long').

> (0) Please could you add some tests!

done

> (1) Shouldn't the first line you added include a check 
> for res == NULL?  

segfault... ooops :-) fixed

> (2) It looks as though the patched code ends up calling 
> PyLong_Check twice (...)

done

See updated patch.
msg84377 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-03-29 11:42
(oops, my patch v2 includes an unrelated change)
msg84424 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-03-29 18:48
Thanks, Victor

A couple of things:

- I'm getting a test failure in test_class
- you should probably be using sys.maxint rather than sys.maxsize:  the 
two aren't necessarily the same.  (E.g., on 64-bit Windows, I believe that
sys.maxint is 2**31-1 while sys.maxsize is 2**63-1).
- This still doesn't fix the case of int(Fraction(2L)).
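
The difference between the two limits can be inspected on any platform. In Python 2, sys.maxint is the largest value of a C long, while sys.maxsize is tied to the platform's pointer-sized Py_ssize_t. The sketch below runs on Python 3, where sys.maxint no longer exists, so it derives the C long limit through ctypes; the variable names are chosen for this example only.

```python
import ctypes
import sys

c_long_bits = 8 * ctypes.sizeof(ctypes.c_long)
c_long_max = 2 ** (c_long_bits - 1) - 1  # what Python 2 exposes as sys.maxint

print("C long max :", c_long_max)
print("sys.maxsize:", sys.maxsize)
print("same limit :", c_long_max == sys.maxsize)
```

The last line is expected to report a mismatch on platforms where a C long is narrower than a pointer, which is the 64-bit Windows situation described above.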
msg89125 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2009-06-08 23:42
> Thanks, Victor

You're welcome :-)

> - I'm getting a test failure in test_class

fixed

> - you should probably be using sys.maxint rather than sys.maxsize

done

> This still doesn't fix the case of int(Fraction(2L))

fixed: Fraction uses __trunc__ rather than __int__.

See updated patch: force_int-4.patch
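
A small illustration of the __trunc__ hook mentioned above, runnable on current Python versions as well. It only shows that Fraction truncates toward zero through __trunc__; how int() dispatches to that method differs between Python versions and is not modelled here.

```python
import math
from fractions import Fraction

positive = Fraction(7, 2).__trunc__()    # direct call to the hook
negative = math.trunc(Fraction(-7, 2))   # math.trunc() goes through __trunc__

print(positive, negative)
```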
msg93866 - (view) Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) * Date: 2009-10-11 18:04
PyPy is a bit of a special case, because it cares about the distinction
of int and long in the translation toolchain. Nevertheless, this
behavior has been annoying to us.
msg93867 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-11 18:07
Carl, thanks for that.  I was just thinking about abandoning this issue 
as not worth fixing.

I need to look at Victor's patch again, but I recall that there were 
still some issues:  e.g., if the __int__ method of some class returns a 
bool, that still ends up getting returned as a bool rather than an int.  
Getting everything exactly right seemed fiddly enough to make it not 
worth the effort.

Would the bool/int distinction matter to PyPy?
msg93869 - (view) Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) * Date: 2009-10-11 18:13
[...]
> Would the bool/int distinction matter to PyPy?

No, it's really mostly about longs and ints, because RPython does not
have automatic overflowing of ints to longs (the goal is really to
translate ints to C longs with normal C overflow behaviour). I
would understand if you decide for wontfix, because you are not supposed 
to care about int/long and as I said, PyPy is a special case.

Thanks,

Carl Friedrich
msg103026 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-13 09:19
Closing: it's too late for Python 2.x.
History
Date | User | Action | Args
2022-04-11 14:56:46 | admin | set | github: 49627
2010-04-13 09:19:52 | mark.dickinson | set | status: open -> closed; resolution: out of date; messages: + msg103026
2010-01-10 13:27:30 | mark.dickinson | set | priority: low -> normal
2009-10-11 18:13:09 | Carl.Friedrich.Bolz | set | messages: + msg93869; title: Strange behavior when performing int on a Decimal made from -sys.maxint-1 -> Strange behavior when performing int on a Decimal made from -sys.maxint-1
2009-10-11 18:08:00 | mark.dickinson | set | messages: + msg93867
2009-10-11 18:04:28 | Carl.Friedrich.Bolz | set | nosy: + Carl.Friedrich.Bolz; messages: + msg93866
2009-06-08 23:42:23 | vstinner | set | files: - force_int-3.patch
2009-06-08 23:42:16 | vstinner | set | files: + force_int-4.patch; messages: + msg89125
2009-03-29 18:48:58 | mark.dickinson | set | messages: + msg84424
2009-03-29 11:42:19 | vstinner | set | files: - force_int.patch
2009-03-29 11:42:14 | vstinner | set | files: + force_int-3.patch; messages: + msg84377
2009-03-29 11:39:17 | vstinner | set | files: - force_int-2.patch
2009-03-29 11:38:42 | vstinner | set | files: + force_int-2.patch; messages: + msg84376
2009-03-28 12:22:59 | mark.dickinson | set | components: + Interpreter Core, - Library (Lib)
2009-03-28 12:22:48 | mark.dickinson | set | priority: low; assignee: mark.dickinson
2009-03-28 04:22:05 | mark.dickinson | set | messages: + msg84297; stage: test needed
2009-03-26 23:39:11 | vstinner | set | messages: + msg84233
2009-03-26 23:36:31 | vstinner | set | files: + force_int.patch; keywords: + patch; messages: + msg84231
2009-02-28 14:56:34 | mark.dickinson | set | type: behavior -> enhancement
2009-02-28 14:56:00 | mark.dickinson | set | messages: + msg82913
2009-02-28 14:42:01 | vstinner | set | messages: + msg82911
2009-02-27 20:47:27 | terry.reedy | set | nosy: + terry.reedy; messages: + msg82869
2009-02-27 11:50:57 | mark.dickinson | set | messages: + msg82830
2009-02-27 11:38:27 | mark.dickinson | set | nosy: + rhettinger, mark.dickinson; messages: + msg82829; components: + Library (Lib), - Interpreter Core; versions: + Python 2.7, - Python 2.5
2009-02-27 07:15:51 | theller | set | assignee: theller -> (no value)
2009-02-27 07:15:34 | theller | set | nosy: - theller; components: - ctypes
2009-02-27 01:13:39 | vstinner | set | messages: + msg82803
2009-02-27 01:13:22 | vstinner | set | messages: - msg82802
2009-02-27 01:12:47 | vstinner | set | messages: + msg82802
2009-02-27 01:09:36 | vstinner | set | messages: + msg82800
2009-02-26 23:38:39 | debedb | set | messages: + msg82788
2009-02-26 23:32:55 | vstinner | set | nosy: + vstinner; messages: + msg82787
2009-02-26 20:24:27 | debedb | create |