Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

round(25, 1) should return an integer, not a float #48957

Closed
dingo mannequin opened this issue Dec 20, 2008 · 39 comments
Closed

round(25, 1) should return an integer, not a float #48957

dingo mannequin opened this issue Dec 20, 2008 · 39 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error

Comments

@dingo
Copy link
Mannequin

dingo mannequin commented Dec 20, 2008

BPO 4707
Nosy @gvanrossum, @loewis, @rhettinger, @mdickinson, @orsenthil, @vstinner, @giampaolo, @devdanzin
Files
  • round_int_int5.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/mdickinson'
    closed_at = <Date 2009-01-28.21:47:27.401>
    created_at = <Date 2008-12-20.16:42:35.503>
    labels = ['interpreter-core', 'type-bug', 'release-blocker']
    title = 'round(25, 1) should return an integer, not a float'
    updated_at = <Date 2009-01-28.21:47:27.358>
    user = 'https://bugs.python.org/dingo'

    bugs.python.org fields:

    activity = <Date 2009-01-28.21:47:27.358>
    actor = 'mark.dickinson'
    assignee = 'mark.dickinson'
    closed = True
    closed_date = <Date 2009-01-28.21:47:27.401>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2008-12-20.16:42:35.503>
    creator = 'dingo'
    dependencies = []
    files = ['12865']
    hgrepos = []
    issue_num = 4707
    keywords = ['patch']
    message_count = 39.0
    messages = ['78113', '78114', '78115', '78117', '78118', '78119', '78120', '78121', '78123', '78124', '78125', '78126', '78132', '78133', '78134', '78135', '78141', '78142', '78145', '78146', '78147', '78148', '78149', '78150', '79400', '80565', '80566', '80673', '80684', '80693', '80694', '80695', '80696', '80697', '80700', '80701', '80702', '80703', '80727']
    nosy_count = 11.0
    nosy_names = ['gvanrossum', 'loewis', 'rhettinger', 'mark.dickinson', 'orsenthil', 'vstinner', 'giampaolo.rodola', 'ajaksu2', 'jyasskin', 'LambertDW', 'dingo']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'accepted'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue4707'
    versions = ['Python 3.0', 'Python 3.1']

    @dingo
    Copy link
    Mannequin Author

    dingo mannequin commented Dec 20, 2008

    I've been playing around with the newly released Python 3.0, and I'm a
    bit confused about the built-in round()-function. To sum it up in a
    single example:

    Python 3.0 (r30:67503, Dec  7 2008, 04:54:04)
    [GCC 4.3.2] on linux2
    >>> round(25, -1)
    30.0

    I had expected the result to be the integer 20, because:

    1. The documentation on built-in functions says: "values are rounded to
      the closest multiple of 10 to the power minus n; if two multiples are
      equally close, rounding is done toward the even choice"

    2. Both help(round) and the documentation on built-in functions claim
      that, if two arguments are given, the return value will be of the same
      type as the first argument.

    Is this unintended behaviour, or am I missing something?

    @dingo dingo mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Dec 20, 2008
    @mdickinson
    Copy link
    Member

    Looks like a bug to me. I get the expected behaviour on my machine.

    Python 3.0+ (release30-maint:67878, Dec 20 2008, 17:31:44) 
    [GCC 4.0.1 (Apple Inc. build 5490)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> round(25, -1)
    20.0
    >>> round(25.0, -1)
    20.0

    What system are you on?

    @mdickinson
    Copy link
    Member

    Correction: it looks like two bugs to me.

    I think the wrong-return-type bug is rather serious, possibly a release
    blocker for 3.0.1. I'll see if I can produce a patch.

    The incorrect rounding (returning 30.0 instead of 20.0) is less serious,
    but still needs fixing. For a start, we should eliminate the use of
    pow() in float_round, and when the second argument to round is negative
    there should be a division by a power of 10 rather than a multiplication
    by the reciprocal of a power of 10.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Dec 20, 2008

    Why do you say the return type is incorrect? It should, and does, return
    a float.

    @mdickinson
    Copy link
    Member

    Why do you say the return type is incorrect? It should, and does,
    return a float.

    The documentation at:

    http://docs.python.org/dev/3.0/library/functions.html#round

    says, of round(x[, n]):

    """The return value is an integer if called with one argument, otherwise
    of the same type as x."""

    On the other hand, PEP-3141 (which I didn't check before) seems to say
    that you're right: the return value should be an instance of Real.

    So maybe this is just a doc bug?

    I have to admit that I don't understand the motivation for round(int, n)
    returning a float, given that the value will always be integral. It seems
    that this makes the two-argument version of round less useful to someone
    who's trying to avoid floats, perhaps working with ints and Decimals, or
    ints and Rationals, or just implementing fixed-point arithmetic with
    scaled ints.

    But given that a behaviour change would be backwards incompatible, and the
    current behaviour is supported by at least one documentation source, it
    seems easiest to call this a doc bug.

    @mdickinson
    Copy link
    Member

    Jeffrey, any opinion on this?

    @dingo
    Copy link
    Mannequin Author

    dingo mannequin commented Dec 20, 2008

    I'm using Arch Linux (32 bit) with kernel 2.6.25 and glibc 2.8 on a core
    2 duo. The described behaviour occurs in a precompiled binary of Python
    3.0, but also in versions I've compiled just now, which includes Python
    3.0+ (release30-maint:67879)

    As far as the rounding itself is concerned, round(x, n) seems to work as
    documented with n>=0 on my system, while giving the same results as the
    Python-2.6-version of round() for n<0.

    @jyasskin
    Copy link
    Mannequin

    jyasskin mannequin commented Dec 20, 2008

    When PEP-3141 says something should be Real, that includes both int and
    float as possibilities (since Real is a supertype of Integral), so it's
    consistent with the PEP for round(int, int) to return an int, and I
    agree with Mark that round(25, -1) ought to return an int. Making that
    change would be slightly backwards-incompatible, but IIRC, int is
    supposed to be usable everywhere float is, so there should be very few
    programs it would break. So my vote's to change it, for 3.0.1 if possible.

    The documentation is a little too strict on __round__ implementations by
    requiring round(x, y) to return the same type as x, but I think it
    should still encourage __round__ to return the same type.

    And, uh, oops for writing those bugs. Is your guess for round(25.0,
    -1)==30.0 that 25.0*(1/10.0) is slightly more than 2.5 on some systems?

    @rhettinger
    Copy link
    Contributor

    The code that hands-off long.__round__ to float.__round__ is a train
    wreck. The intended result can be computed directly and exactly for all
    sizes of long.

    @mdickinson
    Copy link
    Member

    Is your guess for round(25.0,
    -1)==30.0 that 25.0*(1/10.0) is slightly more than 2.5 on some
    systems?

    Yes, something like that. I don't think it's feasible to make round
    perfectly correct (for floats) in all cases without implementing some
    amount of multiple precision code. But I think we can and should
    rewrite the code in such a way that it has a pretty good chance of
    returning correct results in common cases. Small powers of 10 can be
    computed exactly (up to 10**22, I think), in contrast to reciprocals of
    powers of 10.

    I'll work up a patch for round(int, int) -> int, so that at least we
    have
    the option of fixing this for 3.0.1 *if* there's general agreement that
    that's the right way to go. Seems like a python-dev discussion might be
    necessary to determine that.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Dec 20, 2008

    When PEP-3141 says something should be Real, that includes both int and
    float as possibilities (since Real is a supertype of Integral), so it's
    consistent with the PEP for round(int, int) to return an int, and I
    agree with Mark that round(25, -1) ought to return an int.

    In that case, the doc string should be fixed:

    round(number[, ndigits]) -> floating point number

    unless "floating point number" is supposed to include type int
    (which I would find fairly confusing).

    Wrt. this issue: PEP-3141 specified that round() rounds to even
    for floats, leaving it explicitly unspecified how ints get rounded.

    If the result is to be an int, the implementation must not go
    through floats. It's a problem not only in the border cases,
    but also for large numbers (which can't get represented correctly
    even remotely):

    py> int(round(10**20+324678,-3))
    100000000000000327680
    py> int(round(324678,-3))
    325000

    @rhettinger
    Copy link
    Contributor

    IMO, having long.__round__ return a float is a landmine, guaranteed to
    cause someone problems someday (problems that are hard to find).

    @mdickinson
    Copy link
    Member

    Here's a first attempt at a patch for round(int, int) -> int.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Dec 21, 2008

    Correct me if I'm wrong, but I think computing the quotient is not
    necessary; the remainder is sufficient:

    def round(n, i):
        if i >= 0: return n
        i = 10**(-i)
        r = n%(2*i)
        if r < i:
            return n-r
        return n-r+2*i

    @rhettinger
    Copy link
    Contributor

    Martin, that gives some answers like round(51, -2) --> 0 instead of 100.

    @rhettinger rhettinger self-assigned this Dec 21, 2008
    @rhettinger
    Copy link
    Contributor

    Will review the patch later this evening. Thanks for the submission.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Dec 21, 2008

    Martin, that gives some answers like round(51, -2) --> 0 instead of 100.

    I see. Here is a version that fixes that.

    def round(n, i):
        i = 10**(-i)
        r = n%(2*i)
        o = i/2
        n -= r
        if r <= o:
            return n
        elif r < 3*o:
            return n+i
        else:
            return n+2*i

    However, I now see that it is pointless not to use divrem, since
    % computes the quotient as a side effect.

    @mdickinson
    Copy link
    Member

    Updated patch: fix test_builtin. (Rest of the patch unchanged.)

    @devdanzin
    Copy link
    Mannequin

    devdanzin mannequin commented Dec 21, 2008

    Hi Mark,
    I think there's an overflow for ndigits that predates your patch:
    >>> round(2, -2**31 +1)
    2
    >>> round(2, -2**31 +2)
    nan
    (it looks like these lines above make 2.6 hang :/)

    Now, I'm getting a segfault in 3.0 when Ctrl + C-ing during a long
    running round:

    Python 3.1a0 (py3k:67893M, Dec 21 2008, 10:38:30)
    [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> round(2, -2**31 + 1)
    2
    >>> round(2, -2**31 + 2) # Press Ctrl + C
    Segmentation fault
    (backtrace below)

    Also, maybe "round(2, -2**31 + 2)" taking long is a bug of its own?

    The crash with backtrace:
    Starting program: /home/ajaksu/py3k/python
    [Thread debugging using libthread_db enabled]
    Python 3.1a0 (py3k:67893M, Dec 21 2008, 11:08:29)
    [GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    [New Thread 0xb7d2e8c0 (LWP 14925)]
    >>> round(2, -2**31 + 1)
    2
    >>>
    >>> round(2, -2**31 + 2) # Press Ctrl + C

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0xb7d2e8c0 (LWP 14925)]
    _PyUnicode_New (length=10) at Objects/unicodeobject.c:359
    359 unicode->str[0] = 0;
    (gdb) bt
    #0 _PyUnicode_New (length=10) at Objects/unicodeobject.c:359
    #1 0x080708a5 in PyUnicodeUCS2_DecodeUTF8Stateful (s=0x813d8dc
    "last_value", size=10, errors=0x0, consumed=0x0)
    at Objects/unicodeobject.c:2022
    #2 0x08072e22 in PyUnicodeUCS2_FromStringAndSize (u=0x813d8dc
    "last_value", size=10) at Objects/unicodeobject.c:2000
    #3 0x08072f82 in PyUnicodeUCS2_FromString (u=0x813d8dc "last_value") at
    Objects/unicodeobject.c:557
    #4 0x0810ddf7 in PyDict_SetItemString (v=0xb7b21714, key=0x813d8dc
    "last_value", item=0xb7a4e43c)
    at Objects/dictobject.c:2088
    #5 0x080b5fb1 in PySys_SetObject (name=0x813d8dc "last_value", v=0xa)
    at Python/sysmodule.c:67
    #6 0x080afedb in PyErr_PrintEx (set_sys_last_vars=1) at
    Python/pythonrun.c:1294
    #7 0x080b063c in PyRun_InteractiveOneFlags (fp=0xb7e7a440,
    filename=0x813f509 "<stdin>", flags=0xbf84bd34)
    at Python/pythonrun.c:1189
    #8 0x080b0816 in PyRun_InteractiveLoopFlags (fp=0xb7e7a440,
    filename=0x813f509 "<stdin>", flags=0xbf84bd34)
    at Python/pythonrun.c:909
    #9 0x080b0fa2 in PyRun_AnyFileExFlags (fp=0xb7e7a440,
    filename=0x813f509 "<stdin>", closeit=0, flags=0xbf84bd34)
    at Python/pythonrun.c:878
    #10 0x080bc49a in Py_Main (argc=0, argv=0x8192008) at Modules/main.c:611
    #11 0x0805a1dc in main (argc=1, argv=0xbf84de24) at ./Modules/python.c:70

    I hope this helps :)
    Daniel

    @mdickinson
    Copy link
    Member

    >>> round(2, -2**31 + 2) # Press Ctrl + C
    Segmentation fault
    (backtrace below)

    Thanks, Daniel. It looks like I messed up the refcounting in the error-
    handling section of the code. I'll fix this.

    I don't think the hang itself should be considered a bug, any more
    than the hang from "10**(2**31-1)" is a bug.

    @mdickinson
    Copy link
    Member

    Cause of segfault was doing Py_XDECREF on a pointer that hadn't been
    initialised to NULL.

    Here's a fixed patch.

    I still get the instant result:

    >>> round(2, -2**31+1)
    2

    which is a little odd. It's the correct result, but I can't see how
    it gets there: under the current algorithm, there should be a
    10**(2**31-1) happening somewhere, and that would take a *lot* of time
    and memory. Will investigate.

    @mdickinson
    Copy link
    Member

    Aha. The special result for round(x, 1-2**31) has nothing to do with this
    particular patch. It's a consequence of:

    #define UNDEF_NDIGITS (-0x7fffffff) /* Unlikely ndigits value */

    in bltinmodule.c. The same behaviour results for all types, not just
    ints.

    @devdanzin
    Copy link
    Mannequin

    devdanzin mannequin commented Dec 21, 2008

    Mark Dickinson <report@bugs.python.org> wrote:

    I don't think the hang itself should be considered a bug, any more
    than the hang from "10**(2**31-1)" is a bug.

    Well, besides the fact that you can stop "10**(2**31-1)" with Ctrl+C but
    not round(2, -2**31+1), the round case may special case ndigits > number
    to avoid the slow pow(10, x).

    >>> round(2, -2**31+1)
    2

    which is a little odd. It's the correct result, but I can't see how

    Is it correct? The answer for 0 > ndigits > -2**301+1 was nan before the
    patch, 0 after. Given that "round(2, 2**31)" throws an OverflowError,
    iff this is wrong, should it OverflowError too?

    it gets there: under the current algorithm, there should be a
    10**(2**31-1) happening somewhere, and that would take a *lot* of time
    and memory. Will investigate.

    That should be optimizable for ndigits > number, and perhaps
    log10(number) < k * ndigits (for large ndigits), right? But I don't
    think it's a realworld usecase, so dropping this idea for 2.6.

    Aha. The special result for round(x, 1-2**31) has nothing to do
    with this particular patch. It's a consequence of:

    Yep, "predates your patch" as I said :)

    @mdickinson
    Copy link
    Member

    [Me]
    > which is a little odd. It's the correct result, but I can't see how
    [Daniel]
    Is it correct?

    No. :-) It should be 0, as you say.

    Given that "round(2, 2**31)" throws an OverflowError

    I think this is wrong, too. It should be 2. It's another consequence of
    the code in bltinmodule.c. The builtin_round function seems
    unnecessarily complicated: it converts the second argument from a
    Python object to a C int, then converts it back again before calling the
    appropriate __round__ method. Then the first thing the __round__ method
    typically does for built-in types is to convert to a C int again. As
    far as I can tell the first two conversions are unnecessary.

    Here's an updated version of the patch that does away with the
    unnecessary conversions, along with the UNDEF_NDIGITS hack. All tests
    still pass, on my machine, and with this patch I get the results I'd
    expect:

    >>> round(2, 2**31)
    2
    >>> round(2, 2**100)
    2
    >>> round(2, -2**100)
    ^CTraceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyboardInterrupt
    >> round(2, 1-2**31)
    ^CTraceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyboardInterrupt

    That should be optimizable for ndigits > number, and perhaps
    log10(number) < k * ndigits (for large ndigits), right? But I don't
    think it's a realworld usecase, so dropping this idea for 2.6.

    Agreed. I don't think this optimization is worth it.

    @rhettinger rhettinger removed their assignment Dec 22, 2008
    @orsenthil
    Copy link
    Member

    I also think that documentation should be improved for
    round(number, [ndigits]). The doc/help says, ndigits can be negative,
    but does not really say what it means to be negative.

    It can be confusing to anyione. Take the following example:
    >>> round(26.5,1)
    26.5
    >>> round(26.5,-1)
    30.0

    @mdickinson
    Copy link
    Member

    Clearer title.

    @mdickinson mdickinson changed the title round() shows undocumented behaviour round(25, 1) should return an integer, not a float Jan 26, 2009
    @mdickinson
    Copy link
    Member

    Some minor modifications to the last patch:

    • fix round docstring: it now reads "round(number[, ndigits]) -> number"
      instead of "round(number[, ndigits]) -> floating-point number"
    • add Misc/NEWS entry
    • add extra tests for round(x, n) with n huge and positive

    @rhettinger
    Copy link
    Contributor

    Someone ought to ping Guido on this. He may have some strong thoughts
    on the signature (having it sometimes return ints and sometimes floats).
    But in this case, accuracy may be the more important consideration.

    @rhettinger
    Copy link
    Contributor

    Guido, do you have any thoughts on this?

    Basically, it's a choice between giving round() a weird signature
    (sometimes returning floats and sometimes int/longs) versus having
    accurate roundings of integers (which become unrepresentable when
    divided by powers of 10.0).

    @gvanrossum
    Copy link
    Member

    I think it's fine if it returns an int iff the first arg is an int. In
    other languages this would be overloaded as follows:

    round(int)int
    round(float)float
    round(int, int)int
    round(float, int)float

    @gvanrossum gvanrossum removed their assignment Jan 28, 2009
    @lambertdw
    Copy link
    Mannequin

    lambertdw mannequin commented Jan 28, 2009

    I'd prefer round(x,positive_integer) return float. Returning int is a
    bit of a lie, except that the decimal module is available to avoid this
    sort of lie.

    For non-positive integer roundings I'd like an integer return.

    In my opinion, we'd benefit from this definition of round:

    import numbers
    
    def round(a,p=0,base=10):
        '''
            >>> round(147,-1,5)
            145.0
            >>> round(143,-1,5)
            145.0
            >>> round(142,-1,5)
            140.0
            >>> round(12.345,1,2)
            12.5
            >>> round(12.345,2,2)
            12.25
            >>> round(12.345,-2,2)
            12
            >>> round(12.345,-3,2)
            16
        '''
        # consider using sign transfer for negative a
    
        if base < 1:
            raise ValueError('base too confusing')
    
        require_integral_output = (
            (p < 1) and
            isinstance(base, numbers.Integral) and
            isinstance(p, numbers.Integral))
    
        b = base**p
        result = int(a*b+0.5)/b
        if require_integral_output:
            result = int(0.5+result)
        return result

    @gvanrossum
    Copy link
    Member

    Well, that would leave a function whose return *type* depends on the
    *value* of one of the arguments, which I find very ugly.

    I don't know why you think rounding an int to X (X>0) digits after the
    decimal point should return a float -- it's not like the int has any
    digits behind the point, so nothing is gained. Why would it be a lie?

    @lambertdw
    Copy link
    Mannequin

    lambertdw mannequin commented Jan 28, 2009

    The value of one of the arguments controls how many digits there are.

    Certainly if you rounded $10 to the nearest cents you'd expect $10.00.
    Thus round(10,2) should be 10.00. Without using decimal module, the
    best we can do is produce 10.0.

    I'd apply a similar argument to convince you that the return value
    should be integral for negative "number of digits".

    Hark! This is python. We can take this correct and beautiful approach.
    We are not constrainded by function signatures of c++ or FORTRAN90.

    @gvanrossum
    Copy link
    Member

    I'm sorry, but I don't see a significant difference between $10 and
    $10.00. If you want to display a certain number of digits, use "%.2f" %
    x. And trust me, I'm not doing this for Fortran's sake.

    @mdickinson
    Copy link
    Member

    For myself, I strongly prefer that round(int, int) return an integer in
    Python 3.x. Here ar\
    e three reasons:

    (1) Interoperability with Decimal (and, to a lesser extent, Fraction):
    the decimal module is\
    carefully designed so that Decimals interact well with integers. If
    round(int, int) return\
    s an integer then someone working with Decimals can use round freely,
    without worrying about\
    whether his or her numbers are actually integers or Decimals. If
    round(int, int) returns a\
    float then that someone is likely to get a nasty shock doing
    round(Decimal, int) when it tu\
    rns out that the Decimal was actually just a plain integer.

    (2) Accuracy: currently, for example, we have

    >>> round(10**16+1, 0)
    10000000000000000.0

    If the return type is changed to int, there's no need for this loss of
    accuracy.

    (3) Consistency: in 3.x, round(my_decimal, n) returns a Decimal;
    round(my_fraction, n) ret\
    urns a Fraction. That is, for all Python's numeric types other than
    int, the value returned\
    from two-argument round has the same type as the first argument.

    @vstinner
    Copy link
    Member

    marketdickinson> (2) Accuracy

    I see int/float like bytes/characters: it's not a good idea to mix
    them. If you use float, you know that you loose precision (digits)
    after some operations. Whereas I suppose the operations on int are
    always exact.

    I like round(int,int)->int (same input/output types). To get a float,
    use round(float(int), int)->float.

    @mdickinson
    Copy link
    Member

    Apologies for the poor formatting in the last comment. Bad
    cut-and-paste job.

    One more reason:

    (4) "In the face of ambiguity, refuse the temptation to guess."
    Why should round(int, int) be float, rather than Decimal, or Fraction?
    This was the one argument against the integer division change that I
    found somewhat compelling. But of course there's a big difference: 1/2
    had to have *some* type, and it couldn't be an integer. In contrast,
    given that round(34, n) is always integral, there's an obvious choice
    for the return type.

    I'll shut up now.

    @rhettinger
    Copy link
    Contributor

    It's settled then. Input type dictates output type. No dependency on
    the actual values.

    @mdickinson
    Copy link
    Member

    Committed in r69068 (py3k), r69069 (release30-maint).

    The original bug with round(float, n) (loss of accuracy arising from
    intermediate floating-point rounding errors) is still present; I think
    further discussion for that can go into bpo-1869 (which should probably
    have its priority upgraded).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants