round(25, 1) should return an integer, not a float #48957

dingo · 2008-12-20T16:42:36Z

BPO	4707
Nosy	@gvanrossum, @loewis, @rhettinger, @mdickinson, @orsenthil, @vstinner, @giampaolo, @devdanzin
Files	round_int_int5.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/mdickinson'
closed_at = <Date 2009-01-28.21:47:27.401>
created_at = <Date 2008-12-20.16:42:35.503>
labels = ['interpreter-core', 'type-bug', 'release-blocker']
title = 'round(25, 1) should return an integer, not a float'
updated_at = <Date 2009-01-28.21:47:27.358>
user = 'https://bugs.python.org/dingo'

bugs.python.org fields:

activity = <Date 2009-01-28.21:47:27.358>
actor = 'mark.dickinson'
assignee = 'mark.dickinson'
closed = True
closed_date = <Date 2009-01-28.21:47:27.401>
closer = 'mark.dickinson'
components = ['Interpreter Core']
creation = <Date 2008-12-20.16:42:35.503>
creator = 'dingo'
dependencies = []
files = ['12865']
hgrepos = []
issue_num = 4707
keywords = ['patch']
message_count = 39.0
messages = ['78113', '78114', '78115', '78117', '78118', '78119', '78120', '78121', '78123', '78124', '78125', '78126', '78132', '78133', '78134', '78135', '78141', '78142', '78145', '78146', '78147', '78148', '78149', '78150', '79400', '80565', '80566', '80673', '80684', '80693', '80694', '80695', '80696', '80697', '80700', '80701', '80702', '80703', '80727']
nosy_count = 11.0
nosy_names = ['gvanrossum', 'loewis', 'rhettinger', 'mark.dickinson', 'orsenthil', 'vstinner', 'giampaolo.rodola', 'ajaksu2', 'jyasskin', 'LambertDW', 'dingo']
pr_nums = []
priority = 'release blocker'
resolution = 'accepted'
stage = 'commit review'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue4707'
versions = ['Python 3.0', 'Python 3.1']

dingo · 2008-12-20T16:42:34Z

I've been playing around with the newly released Python 3.0, and I'm a
bit confused about the built-in round()-function. To sum it up in a
single example:

Python 3.0 (r30:67503, Dec  7 2008, 04:54:04)
[GCC 4.3.2] on linux2
>>> round(25, -1)
30.0

I had expected the result to be the integer 20, because:

The documentation on built-in functions says: "values are rounded to
the closest multiple of 10 to the power minus n; if two multiples are
equally close, rounding is done toward the even choice"
Both help(round) and the documentation on built-in functions claim
that, if two arguments are given, the return value will be of the same
type as the first argument.

Is this unintended behaviour, or am I missing something?

mdickinson · 2008-12-20T17:32:59Z

Looks like a bug to me. I get the expected behaviour on my machine.

Python 3.0+ (release30-maint:67878, Dec 20 2008, 17:31:44) 
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> round(25, -1)
20.0
>>> round(25.0, -1)
20.0

What system are you on?

mdickinson · 2008-12-20T18:41:04Z

Correction: it looks like two bugs to me.

I think the wrong-return-type bug is rather serious, possibly a release
blocker for 3.0.1. I'll see if I can produce a patch.

The incorrect rounding (returning 30.0 instead of 20.0) is less serious,
but still needs fixing. For a start, we should eliminate the use of
pow() in float_round, and when the second argument to round is negative
there should be a division by a power of 10 rather than a multiplication
by the reciprocal of a power of 10.

loewis · 2008-12-20T18:43:51Z

Why do you say the return type is incorrect? It should, and does, return
a float.

mdickinson · 2008-12-20T19:11:44Z

Why do you say the return type is incorrect? It should, and does,
return a float.

The documentation at:

http://docs.python.org/dev/3.0/library/functions.html#round

says, of round(x[, n]):

"""The return value is an integer if called with one argument, otherwise
of the same type as x."""

On the other hand, PEP-3141 (which I didn't check before) seems to say
that you're right: the return value should be an instance of Real.

So maybe this is just a doc bug?

I have to admit that I don't understand the motivation for round(int, n)
returning a float, given that the value will always be integral. It seems
that this makes the two-argument version of round less useful to someone
who's trying to avoid floats, perhaps working with ints and Decimals, or
ints and Rationals, or just implementing fixed-point arithmetic with
scaled ints.

But given that a behaviour change would be backwards incompatible, and the
current behaviour is supported by at least one documentation source, it
seems easiest to call this a doc bug.

mdickinson · 2008-12-20T19:14:13Z

Jeffrey, any opinion on this?

dingo · 2008-12-20T19:23:30Z

I'm using Arch Linux (32 bit) with kernel 2.6.25 and glibc 2.8 on a core
2 duo. The described behaviour occurs in a precompiled binary of Python
3.0, but also in versions I've compiled just now, which includes Python
3.0+ (release30-maint:67879)

As far as the rounding itself is concerned, round(x, n) seems to work as
documented with n>=0 on my system, while giving the same results as the
Python-2.6-version of round() for n<0.

jyasskin · 2008-12-20T19:43:59Z

When PEP-3141 says something should be Real, that includes both int and
float as possibilities (since Real is a supertype of Integral), so it's
consistent with the PEP for round(int, int) to return an int, and I
agree with Mark that round(25, -1) ought to return an int. Making that
change would be slightly backwards-incompatible, but IIRC, int is
supposed to be usable everywhere float is, so there should be very few
programs it would break. So my vote's to change it, for 3.0.1 if possible.

The documentation is a little too strict on __round__ implementations by
requiring round(x, y) to return the same type as x, but I think it
should still encourage __round__ to return the same type.

And, uh, oops for writing those bugs. Is your guess for round(25.0,
-1)==30.0 that 25.0*(1/10.0) is slightly more than 2.5 on some systems?

rhettinger · 2008-12-20T19:57:42Z

The code that hands-off long.__round__ to float.__round__ is a train
wreck. The intended result can be computed directly and exactly for all
sizes of long.

mdickinson · 2008-12-20T20:07:03Z

Is your guess for round(25.0,
-1)==30.0 that 25.0*(1/10.0) is slightly more than 2.5 on some
systems?

Yes, something like that. I don't think it's feasible to make round
perfectly correct (for floats) in all cases without implementing some
amount of multiple precision code. But I think we can and should
rewrite the code in such a way that it has a pretty good chance of
returning correct results in common cases. Small powers of 10 can be
computed exactly (up to 10**22, I think), in contrast to reciprocals of
powers of 10.

I'll work up a patch for round(int, int) -> int, so that at least we
have
the option of fixing this for 3.0.1 *if* there's general agreement that
that's the right way to go. Seems like a python-dev discussion might be
necessary to determine that.

loewis · 2008-12-20T20:41:56Z

When PEP-3141 says something should be Real, that includes both int and
float as possibilities (since Real is a supertype of Integral), so it's
consistent with the PEP for round(int, int) to return an int, and I
agree with Mark that round(25, -1) ought to return an int.

In that case, the doc string should be fixed:

round(number[, ndigits]) -> floating point number

unless "floating point number" is supposed to include type int
(which I would find fairly confusing).

Wrt. this issue: PEP-3141 specified that round() rounds to even
for floats, leaving it explicitly unspecified how ints get rounded.

If the result is to be an int, the implementation must not go
through floats. It's a problem not only in the border cases,
but also for large numbers (which can't get represented correctly
even remotely):

py> int(round(10**20+324678,-3))
100000000000000327680
py> int(round(324678,-3))
325000

rhettinger · 2008-12-20T20:52:58Z

IMO, having long.__round__ return a float is a landmine, guaranteed to
cause someone problems someday (problems that are hard to find).

mdickinson · 2008-12-20T23:39:18Z

Here's a first attempt at a patch for round(int, int) -> int.

loewis · 2008-12-21T00:45:11Z

Correct me if I'm wrong, but I think computing the quotient is not
necessary; the remainder is sufficient:

def round(n, i):
    if i >= 0: return n
    i = 10**(-i)
    r = n%(2*i)
    if r < i:
        return n-r
    return n-r+2*i

rhettinger · 2008-12-21T01:09:06Z

Martin, that gives some answers like round(51, -2) --> 0 instead of 100.

rhettinger · 2008-12-21T01:27:49Z

Will review the patch later this evening. Thanks for the submission.

loewis · 2008-12-21T09:30:54Z

Martin, that gives some answers like round(51, -2) --> 0 instead of 100.

I see. Here is a version that fixes that.

def round(n, i):
    i = 10**(-i)
    r = n%(2*i)
    o = i/2
    n -= r
    if r <= o:
        return n
    elif r < 3*o:
        return n+i
    else:
        return n+2*i

However, I now see that it is pointless not to use divrem, since
% computes the quotient as a side effect.

mdickinson · 2008-12-21T11:05:52Z

Updated patch: fix test_builtin. (Rest of the patch unchanged.)

devdanzin · 2008-12-21T13:47:35Z

Hi Mark,
I think there's an overflow for ndigits that predates your patch:
>>> round(2, -2**31 +1)
2
>>> round(2, -2**31 +2)
nan
(it looks like these lines above make 2.6 hang :/)

Now, I'm getting a segfault in 3.0 when Ctrl + C-ing during a long
running round:

Python 3.1a0 (py3k:67893M, Dec 21 2008, 10:38:30)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> round(2, -2**31 + 1)
2
>>> round(2, -2**31 + 2) # Press Ctrl + C
Segmentation fault
(backtrace below)

Also, maybe "round(2, -2**31 + 2)" taking long is a bug of its own?

The crash with backtrace:
Starting program: /home/ajaksu/py3k/python
[Thread debugging using libthread_db enabled]
Python 3.1a0 (py3k:67893M, Dec 21 2008, 11:08:29)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
[New Thread 0xb7d2e8c0 (LWP 14925)]
>>> round(2, -2**31 + 1)
2
>>>
>>> round(2, -2**31 + 2) # Press Ctrl + C

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb7d2e8c0 (LWP 14925)]
_PyUnicode_New (length=10) at Objects/unicodeobject.c:359
359 unicode->str[0] = 0;
(gdb) bt
#0 _PyUnicode_New (length=10) at Objects/unicodeobject.c:359
#1 0x080708a5 in PyUnicodeUCS2_DecodeUTF8Stateful (s=0x813d8dc
"last_value", size=10, errors=0x0, consumed=0x0)
at Objects/unicodeobject.c:2022
#2 0x08072e22 in PyUnicodeUCS2_FromStringAndSize (u=0x813d8dc
"last_value", size=10) at Objects/unicodeobject.c:2000
#3 0x08072f82 in PyUnicodeUCS2_FromString (u=0x813d8dc "last_value") at
Objects/unicodeobject.c:557
#4 0x0810ddf7 in PyDict_SetItemString (v=0xb7b21714, key=0x813d8dc
"last_value", item=0xb7a4e43c)
at Objects/dictobject.c:2088
#5 0x080b5fb1 in PySys_SetObject (name=0x813d8dc "last_value", v=0xa)
at Python/sysmodule.c:67
#6 0x080afedb in PyErr_PrintEx (set_sys_last_vars=1) at
Python/pythonrun.c:1294
#7 0x080b063c in PyRun_InteractiveOneFlags (fp=0xb7e7a440,
filename=0x813f509 "<stdin>", flags=0xbf84bd34)
at Python/pythonrun.c:1189
#8 0x080b0816 in PyRun_InteractiveLoopFlags (fp=0xb7e7a440,
filename=0x813f509 "<stdin>", flags=0xbf84bd34)
at Python/pythonrun.c:909
#9 0x080b0fa2 in PyRun_AnyFileExFlags (fp=0xb7e7a440,
filename=0x813f509 "<stdin>", closeit=0, flags=0xbf84bd34)
at Python/pythonrun.c:878
#10 0x080bc49a in Py_Main (argc=0, argv=0x8192008) at Modules/main.c:611
#11 0x0805a1dc in main (argc=1, argv=0xbf84de24) at ./Modules/python.c:70

I hope this helps :)
Daniel

mdickinson · 2008-12-21T14:37:11Z

>>> round(2, -2**31 + 2) # Press Ctrl + C
Segmentation fault
(backtrace below)

Thanks, Daniel. It looks like I messed up the refcounting in the error-
handling section of the code. I'll fix this.

I don't think the hang itself should be considered a bug, any more
than the hang from "10**(2**31-1)" is a bug.

mdickinson · 2008-12-21T14:55:50Z

Cause of segfault was doing Py_XDECREF on a pointer that hadn't been
initialised to NULL.

Here's a fixed patch.

I still get the instant result:

>>> round(2, -2**31+1)
2

which is a little odd. It's the correct result, but I can't see how
it gets there: under the current algorithm, there should be a
10**(2**31-1) happening somewhere, and that would take a *lot* of time
and memory. Will investigate.

mdickinson · 2008-12-21T15:28:13Z

Aha. The special result for round(x, 1-2**31) has nothing to do with this
particular patch. It's a consequence of:

#define UNDEF_NDIGITS (-0x7fffffff) /* Unlikely ndigits value */

in bltinmodule.c. The same behaviour results for all types, not just
ints.

devdanzin · 2008-12-21T15:44:05Z

Mark Dickinson <report@bugs.python.org> wrote:

I don't think the hang itself should be considered a bug, any more
than the hang from "10**(2**31-1)" is a bug.

Well, besides the fact that you can stop "10**(2**31-1)" with Ctrl+C but
not round(2, -2**31+1), the round case may special case ndigits > number
to avoid the slow pow(10, x).

>>> round(2, -2**31+1)
2

which is a little odd. It's the correct result, but I can't see how

Is it correct? The answer for 0 > ndigits > -2**301+1 was nan before the
patch, 0 after. Given that "round(2, 2**31)" throws an OverflowError,
iff this is wrong, should it OverflowError too?

it gets there: under the current algorithm, there should be a
10**(2**31-1) happening somewhere, and that would take a *lot* of time
and memory. Will investigate.

That should be optimizable for ndigits > number, and perhaps
log10(number) < k * ndigits (for large ndigits), right? But I don't
think it's a realworld usecase, so dropping this idea for 2.6.

Aha. The special result for round(x, 1-2**31) has nothing to do
with this particular patch. It's a consequence of:

Yep, "predates your patch" as I said :)

mdickinson · 2008-12-21T16:14:19Z

[Me]
> which is a little odd. It's the correct result, but I can't see how
[Daniel]
Is it correct?

No. :-) It should be 0, as you say.

Given that "round(2, 2**31)" throws an OverflowError

I think this is wrong, too. It should be 2. It's another consequence of
the code in bltinmodule.c. The builtin_round function seems
unnecessarily complicated: it converts the second argument from a
Python object to a C int, then converts it back again before calling the
appropriate __round__ method. Then the first thing the __round__ method
typically does for built-in types is to convert to a C int again. As
far as I can tell the first two conversions are unnecessary.

Here's an updated version of the patch that does away with the
unnecessary conversions, along with the UNDEF_NDIGITS hack. All tests
still pass, on my machine, and with this patch I get the results I'd
expect:

>>> round(2, 2**31)
2
>>> round(2, 2**100)
2
>>> round(2, -2**100)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt
>> round(2, 1-2**31)
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt

That should be optimizable for ndigits > number, and perhaps
log10(number) < k * ndigits (for large ndigits), right? But I don't
think it's a realworld usecase, so dropping this idea for 2.6.

Agreed. I don't think this optimization is worth it.

orsenthil · 2009-01-08T06:17:44Z

I also think that documentation should be improved for
round(number, [ndigits]). The doc/help says, ndigits can be negative,
but does not really say what it means to be negative.

It can be confusing to anyione. Take the following example:
>>> round(26.5,1)
26.5
>>> round(26.5,-1)
30.0

mdickinson · 2009-01-26T10:25:05Z

Clearer title.

mdickinson · 2009-01-26T11:08:13Z

Some minor modifications to the last patch:

fix round docstring: it now reads "round(number[, ndigits]) -> number"
instead of "round(number[, ndigits]) -> floating-point number"
add Misc/NEWS entry
add extra tests for round(x, n) with n huge and positive

rhettinger · 2009-01-27T22:31:51Z

Someone ought to ping Guido on this. He may have some strong thoughts
on the signature (having it sometimes return ints and sometimes floats).
But in this case, accuracy may be the more important consideration.

rhettinger · 2009-01-28T00:52:01Z

Guido, do you have any thoughts on this?

Basically, it's a choice between giving round() a weird signature
(sometimes returning floats and sometimes int/longs) versus having
accurate roundings of integers (which become unrepresentable when
divided by powers of 10.0).

gvanrossum · 2009-01-28T03:39:26Z

I think it's fine if it returns an int iff the first arg is an int. In
other languages this would be overloaded as follows:

round(int)int
round(float)float
round(int, int)int
round(float, int)float

lambertdw · 2009-01-28T03:45:22Z

I'd prefer round(x,positive_integer) return float. Returning int is a
bit of a lie, except that the decimal module is available to avoid this
sort of lie.

For non-positive integer roundings I'd like an integer return.

In my opinion, we'd benefit from this definition of round:

import numbers

def round(a,p=0,base=10):
    '''
        >>> round(147,-1,5)
        145.0
        >>> round(143,-1,5)
        145.0
        >>> round(142,-1,5)
        140.0
        >>> round(12.345,1,2)
        12.5
        >>> round(12.345,2,2)
        12.25
        >>> round(12.345,-2,2)
        12
        >>> round(12.345,-3,2)
        16
    '''
    # consider using sign transfer for negative a

    if base < 1:
        raise ValueError('base too confusing')

    require_integral_output = (
        (p < 1) and
        isinstance(base, numbers.Integral) and
        isinstance(p, numbers.Integral))

    b = base**p
    result = int(a*b+0.5)/b
    if require_integral_output:
        result = int(0.5+result)
    return result

gvanrossum · 2009-01-28T03:56:14Z

Well, that would leave a function whose return *type* depends on the
*value* of one of the arguments, which I find very ugly.

I don't know why you think rounding an int to X (X>0) digits after the
decimal point should return a float -- it's not like the int has any
digits behind the point, so nothing is gained. Why would it be a lie?

lambertdw · 2009-01-28T04:19:52Z

The value of one of the arguments controls how many digits there are.

Certainly if you rounded $10 to the nearest cents you'd expect $10.00.
Thus round(10,2) should be 10.00. Without using decimal module, the
best we can do is produce 10.0.

I'd apply a similar argument to convince you that the return value
should be integral for negative "number of digits".

Hark! This is python. We can take this correct and beautiful approach.
We are not constrainded by function signatures of c++ or FORTRAN90.

gvanrossum · 2009-01-28T04:26:23Z

I'm sorry, but I don't see a significant difference between $10 and
$10.00. If you want to display a certain number of digits, use "%.2f" %
x. And trust me, I'm not doing this for Fortran's sake.

mdickinson · 2009-01-28T09:26:32Z

For myself, I strongly prefer that round(int, int) return an integer in
Python 3.x. Here ar\
e three reasons:

(1) Interoperability with Decimal (and, to a lesser extent, Fraction):
the decimal module is\
carefully designed so that Decimals interact well with integers. If
round(int, int) return\
s an integer then someone working with Decimals can use round freely,
without worrying about\
whether his or her numbers are actually integers or Decimals. If
round(int, int) returns a\
float then that someone is likely to get a nasty shock doing
round(Decimal, int) when it tu\
rns out that the Decimal was actually just a plain integer.

(2) Accuracy: currently, for example, we have

>>> round(10**16+1, 0)
10000000000000000.0

If the return type is changed to int, there's no need for this loss of
accuracy.

(3) Consistency: in 3.x, round(my_decimal, n) returns a Decimal;
round(my_fraction, n) ret\
urns a Fraction. That is, for all Python's numeric types other than
int, the value returned\
from two-argument round has the same type as the first argument.

vstinner · 2009-01-28T09:35:45Z

marketdickinson> (2) Accuracy

I see int/float like bytes/characters: it's not a good idea to mix
them. If you use float, you know that you loose precision (digits)
after some operations. Whereas I suppose the operations on int are
always exact.

I like round(int,int)->int (same input/output types). To get a float,
use round(float(int), int)->float.

mdickinson · 2009-01-28T09:40:19Z

Apologies for the poor formatting in the last comment. Bad
cut-and-paste job.

One more reason:

(4) "In the face of ambiguity, refuse the temptation to guess."
Why should round(int, int) be float, rather than Decimal, or Fraction?
This was the one argument against the integer division change that I
found somewhat compelling. But of course there's a big difference: 1/2
had to have *some* type, and it couldn't be an integer. In contrast,
given that round(34, n) is always integral, there's an obvious choice
for the return type.

I'll shut up now.

rhettinger · 2009-01-28T09:56:08Z

It's settled then. Input type dictates output type. No dependency on
the actual values.

mdickinson · 2009-01-28T21:47:26Z

Committed in r69068 (py3k), r69069 (release30-maint).

The original bug with round(float, n) (loss of accuracy arising from
intermediate floating-point rounding errors) is still present; I think
further discussion for that can go into bpo-1869 (which should probably
have its priority upgraded).

dingo mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Dec 20, 2008

rhettinger added the release-blocker label Dec 20, 2008

rhettinger self-assigned this Dec 21, 2008

rhettinger removed their assignment Dec 22, 2008

mdickinson changed the title ~~round() shows undocumented behaviour~~ round(25, 1) should return an integer, not a float Jan 26, 2009

rhettinger assigned gvanrossum Jan 28, 2009

gvanrossum removed their assignment Jan 28, 2009

rhettinger assigned mdickinson Jan 28, 2009

mdickinson closed this as completed Jan 28, 2009

ezio-melotti transferred this issue from another repository Apr 10, 2022

round(25, 1) should return an integer, not a float #48957

round(25, 1) should return an integer, not a float #48957

Comments

dingo mannequin commented Dec 20, 2008

dingo mannequin commented Dec 20, 2008

mdickinson commented Dec 20, 2008

mdickinson commented Dec 20, 2008

loewis mannequin commented Dec 20, 2008

mdickinson commented Dec 20, 2008

mdickinson commented Dec 20, 2008

dingo mannequin commented Dec 20, 2008

jyasskin mannequin commented Dec 20, 2008

rhettinger commented Dec 20, 2008

mdickinson commented Dec 20, 2008

loewis mannequin commented Dec 20, 2008

rhettinger commented Dec 20, 2008

mdickinson commented Dec 20, 2008

loewis mannequin commented Dec 21, 2008

rhettinger commented Dec 21, 2008

rhettinger commented Dec 21, 2008

loewis mannequin commented Dec 21, 2008

mdickinson commented Dec 21, 2008

devdanzin mannequin commented Dec 21, 2008

mdickinson commented Dec 21, 2008

mdickinson commented Dec 21, 2008

mdickinson commented Dec 21, 2008

devdanzin mannequin commented Dec 21, 2008

mdickinson commented Dec 21, 2008

orsenthil commented Jan 8, 2009

mdickinson commented Jan 26, 2009

mdickinson commented Jan 26, 2009

rhettinger commented Jan 27, 2009

rhettinger commented Jan 28, 2009

gvanrossum commented Jan 28, 2009

lambertdw mannequin commented Jan 28, 2009

gvanrossum commented Jan 28, 2009

lambertdw mannequin commented Jan 28, 2009

gvanrossum commented Jan 28, 2009

mdickinson commented Jan 28, 2009

vstinner commented Jan 28, 2009

mdickinson commented Jan 28, 2009

rhettinger commented Jan 28, 2009

mdickinson commented Jan 28, 2009