Backport py3k float repr to trunk #51366

Closed
mdickinson opened this issue Oct 13, 2009 · 33 comments
Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs); type-feature (a feature request or enhancement)

Comments

@mdickinson
Member

BPO 7117
Nosy @tim-one, @rhettinger, @amauryfa, @mdickinson, @vstinner, @ericvsmith, @voidspace
Files
  • round_fixup.patch: Backport of py3k round to trunk.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/mdickinson'
    closed_at = <Date 2009-11-25.10:41:11.777>
    created_at = <Date 2009-10-13.08:30:24.826>
    labels = ['interpreter-core', 'type-feature']
    title = 'Backport py3k float repr to trunk'
    updated_at = <Date 2011-06-30.11:17:01.710>
    user = 'https://github.com/mdickinson'

    bugs.python.org fields:

    activity = <Date 2011-06-30.11:17:01.710>
    actor = 'vstinner'
    assignee = 'mark.dickinson'
    closed = True
    closed_date = <Date 2009-11-25.10:41:11.777>
    closer = 'mark.dickinson'
    components = ['Interpreter Core']
    creation = <Date 2009-10-13.08:30:24.826>
    creator = 'mark.dickinson'
    dependencies = []
    files = ['15254']
    hgrepos = []
    issue_num = 7117
    keywords = ['patch']
    message_count = 33.0
    messages = ['93918', '94414', '94417', '94428', '94430', '94431', '94433', '94436', '94447', '94491', '94494', '94495', '94510', '94528', '94550', '94551', '94572', '94575', '94615', '94655', '94747', '94862', '95444', '95507', '95644', '95658', '95700', '95703', '95714', '139402', '139428', '139444', '139467']
    nosy_count = 7.0
    nosy_names = ['tim.peters', 'rhettinger', 'amaury.forgeotdarc', 'mark.dickinson', 'vstinner', 'eric.smith', 'michael.foord']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue7117'
    versions = ['Python 2.7']

    @mdickinson
    Member Author

    See the thread starting at:

    http://mail.python.org/pipermail/python-dev/2009-October/092958.html

    Eric suggested that we don't need a separate branch for this; sounds
    fine to me. It should still be possible to do the backport in stages,
    though. Something like the following?

    (1) Check in David Gay's code plus necessary build changes,
    configuration steps, etc; conversions still use the old code.

    (2) Switch to using the new code for float -> string (str, repr, float
    formatting) and string -> float conversions (float, complex
    constructors, numeric literals in Python code). [Substeps?]

    (3) Fix up builtin round function to use the new code.

    (4) Make any necessary fixes to the documentation. (Raymond, I assume
    you'll take care of the whatsnew changes when the time comes?)

    (1), (3) and (4) should be straightforward. (2) is where most of the
    work is, I think. I think it should be possible to do the stage (2)
    work in pieces without breaking too much.

    @mdickinson mdickinson self-assigned this Oct 13, 2009
    @mdickinson mdickinson added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Oct 13, 2009
    @mdickinson
    Member Author

    Some key revision numbers from the py3k short float repr, for reference:

    r71663: include Gay's code, build and configure fixes
    r71723: backout SSE2 detection code added in r71663
    r71665: rewrite of float formatting code to use Gay's code

    Backported most of r71663 and r71723 to trunk in:

    r75651: Add dtoa.c, dtoa.h, update license.
    r75658: configuration changes - detect float endianness,
    add functions to get and set x87 control word, and
    determine when short float repr can be used.

    Significant changes from r71663 not yet included:

    @mdickinson
    Member Author

    r75666: Add sys.float_repr_style attribute.
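For readers unfamiliar with the attribute, here is a minimal sketch of how it might be inspected from Python code. It assumes an interpreter that has the attribute (2.7 or 3.1 and later); the sample value 0.1 is chosen only for illustration and does not come from this thread.

```python
import sys

# 'short' means repr(float) gives the shortest string that round-trips;
# 'legacy' means the older 17-significant-digit repr is in use.
style = sys.float_repr_style
print(style)

x = 0.1  # illustrative value only
if style == 'short':
    # With the short repr this prints 0.1 rather than 0.10000000000000001.
    print(repr(x))

# Under either style, converting the repr back to a float recovers x.
round_trips = float(repr(x)) == x
print(round_trips)
```

Checking the attribute rather than the Python version is probably the safer test, since the short repr can be unavailable on some platforms even in versions that support it.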

    @mdickinson
    Member Author

    r75672: temporarily disable the short float repr while we're putting
    the pieces in place.

    When testing, the disablement can be disabled (hah) by defining the
    PY_SHORT_FLOAT_REPR preprocessor symbol, e.g. (on Unix) with

    CC='gcc -DPY_SHORT_FLOAT_REPR' ./configure && make

    @ericvsmith
    Member

    I think the next step on my side is to remove _PyOS_double_to_string,
    and make all of the internal code call PyOS_double_to_string. The
    distinction is that _PyOS_double_to_string gets passed a buffer and
    length, but PyOS_double_to_string returns allocated memory that must be
    freed. David Gay's code (_Py_dg_dtoa) returns allocated memory, so
    that's the easiest interface to propagate internally.

    That's the approach we used in the py3k branch. I'll start work on it.
    So Mark's work should be mostly config stuff and hooking up Gay's code
    to PyOS_double_to_string. I think it will basically match the py3k version.

    The existing _PyOS_double_to_string will become the basis for the
    fallback code for use when PY_NO_SHORT_FLOAT_REPR is defined (and it
    will then be renamed PyOS_double_to_string and have its signature
    changed to match).

    @mdickinson
    Member Author

    One issue occurs to me: should the backport change the behaviour of the
    round function?

    In py3k, round consistently uses round-half-to-even for halfway cases.
    In trunk, round semi-consistently uses round-half-away-from-zero (and
    this is documented). E.g., round(1.25, 1) will give 1.2 in py3k and
    (usually) 1.3 in trunk.

    I definitely want to use Gay's code for round in 2.7, since having round
    work sensibly is part of the motivation for the backport in the first
    place. But this naturally leads to a round-half-to-even version of
    round, since the Python-adapted version of Gay's code isn't capable of
    doing anything other than round-half-to-even at the moment.

    Options:

    (1) change round in 2.7 to do round-half-to-even. This is easy,
    natural, and means that round will agree with float formatting
    (which does round-half-to-even in both py3k and trunk). But it
    may break existing applications. However: (a) those applications
    would need fixing anyway to work with py3k, and (b) I have little
    sympathy for people depending on behaviour of rounding of
    *binary* floats for halfway *decimal* cases. (Decimal is another
    matter, of course: there it's perfectly reasonable to expect
    guaranteed rounding behaviour.)

    It's more complicated than that, though, since if rounding
    becomes round-half-to-even for floats, it should also change
    for integers, Fractions, etc.
    

    (2) have round stick with round-half-away-from-zero. This may be
    awkward to implement (though I have some half-formed ideas about
    how to make it work), and would lead to round occasionally not
    agreeing with float formatting. For example:

        >>> '{0:.1f}'.format(1.25)
        '1.2'
        >>> round(1.25, 1)
        1.3

    @ericvsmith
    Member

    Adding tim_one as nosy. He'll no doubt have an opinion on rounding. And
    hopefully he'll share it!

    @ericvsmith
    Member

    Another thing to consider is that in py3k we removed all conversions of
    'f' to 'g', such as this one from Objects/unicodeobject.c:

    if (type == 'f' && fabs(x) >= 1e50)
        type = 'g';
    

    Should we also do that as part of this exercise? Or should it be another
    issue, or not done at all?

    @rhettinger
    Contributor

    +1 on backporting the 'f' and 'g' work also.
    We will be well served by getting the two
    code bases in sync with one another.

    Eliminating obscure differences makes it easier
    to port code from 2.x to 3.x.

    @mdickinson
    Member Author

    r75720: Backport py3k version of pystrtod.c to trunk. There are still
    some (necessary) differences between the two versions, which
    should become unnecessary once everything else is hooked up.
    The differences should be re-examined later.

    @mdickinson
    Member Author

    [Eric, on removing f to g conversions]

    > Should we also do that as part of this exercise? Or should it be another
    > issue, or not done at all?

    I'd definitely like to remove the f to g conversion in trunk. I don't
    see any great need to open a separate issue for that. (Was there one
    already for the py3k removal?)

    @mdickinson
    Member Author

    Found it: issue bpo-5859 was opened for the removal of the f -> g conversion
    in py3k. We could just add a note to that issue.

    @mdickinson
    Member Author

    r75730: backport pystrtod.h
    r75731: Fix floatobject.c to use PyOS_string_to_double.

    @mdickinson
    Member Author

    r75739: Fix complexobject.c to use PyOS_string_to_double.

    @ericvsmith
    Member

    r75743: Fix cPickle.c to use PyOS_string_to_double.

    @ericvsmith
    Member

    r75745: Fix stropmodule.c to use PyOS_string_to_double.

    @ericvsmith
    Member

    r75824: Fix ast.c to use PyOS_string_to_double.

    @ericvsmith
    Member

    r75846: Fix marshal.c to use PyOS_string_to_double.

    @ericvsmith
    Member

    r75913: Fix _json.c to use PyOS_string_to_double. Change made after
    consulting with Bob Ippolito.

    This completes the removal of calls to PyOS_ascii_strtod.

    @mdickinson
    Member Author

    The next job is to deprecate PyOS_ascii_atof and PyOS_ascii_strtod, I
    think. I'll get to work on that.

    @mdickinson
    Member Author

    r75979: Deprecate PyOS_ascii_atof and PyOS_ascii_strtod; document
    PyOS_double_to_string.

    @mdickinson
    Member Author

    Here's a patch for correctly-rounded round in trunk. This patch doesn't
    change the rounding behaviour between 2.6 and 2.7: it's still doing
    round-half-away-from-zero instead of round-half-to-even. It was necessary to
    detect and treat halfway cases specially to make this work. Removing this
    special case code would be easy, so we can decide later whether it's worth
    changing round to do round-half-to-even for 2.7.
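As a rough illustration of what "correctly rounded" means for a case that is not an exact halfway case, consider the literal 2.675 (an example chosen here, not taken from the patch). The stored binary value lies slightly below 2.675, so a correctly rounded round should give 2.67 under either tie-breaking rule. The sketch assumes Python 2.7 or later, where Decimal accepts a float directly.

```python
from decimal import Decimal

# Decimal(float) shows the exact binary value that the literal stores.
exact = Decimal(2.675)
print(exact)

# The stored value is below the decimal halfway point 2.675 ...
below_half = exact < Decimal('2.675')

# ... so rounding the true value to two places goes down, not up.
rounded = round(2.675, 2)

print(below_half, rounded)
```

Only values that are exactly representable, such as 1.25, are true halfway cases, which is why they are the ones that need the special handling described above.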

    I want to let this sit for a couple of days before I apply it.

    @mdickinson
    Member Author

    r76373: Backport round.

    @mdickinson
    Member Author

    Short float repr is now enabled in r76379.

    Misc/NEWS entries added/updated in r76411.

    @mdickinson
    Member Author

    r76465 removes the fixed-length buffer for formatting floats, hence
    removes the restriction on the precision. This should make removal of
    the %f -> %g switch straightforward.

    @mdickinson
    Member Author

    r76474: Remove %f -> %g switch.
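A small sketch of the visible effect of the last two revisions, assuming an interpreter that includes them (2.7 or 3.x). The values 1e50 and 0.1 and the precision 60 are illustrative choices, not taken from the commits.

```python
# With the %f -> %g switch removed, a large value formatted with %f is
# written out in full positional notation instead of using an exponent.
large = '%f' % 1e50
print(large)

# With the fixed-length buffer gone, a precision well beyond the 17
# significant digits of a double is accepted and padded with zeros.
precise = '%.60f' % 0.1
print(precise)
```

The digits printed for 1e50 past the first 17 or so reflect the exact binary value of the float, not the decimal literal.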

    @mdickinson
    Member Author

    I think we're pretty much done here.

    I'd still like to produce a more complete set of float formatting test
    cases at some point (for both trunk and py3k), but that's a separate
    activity.

    Eric, Raymond: can you spot anything we've missed?

    @ericvsmith
    Member

    Thanks for tackling the last few bits, Mark. I think we're done,
    although I admit I haven't verified what state the documentation is in.

    I suggest we close this issue and if any problems occur open them as new
    issues.

    @mdickinson
    Member Author

    Thanks, Eric.

    The only remaining documentation issues I'm aware of are in
    Doc/tutorial/floatingpoint.rst. I think Raymond is going to update this
    to match the py3k version.

    I'll call this done, then! Thanks for all your help.

    @voidspace
    Contributor

    Wondered if you guys had heard of some recent advances in the state of the art in this field. I'm sure you have, but thought I'd link it here anyway.

    Quote taken from this article (which links to relevant papers):

    http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/

    In 2010, Florian Loitsch published a wonderful paper in PLDI, "Printing floating-point numbers quickly and accurately with integers", which represents the biggest step in this field in 20 years: he mostly figured out how to use machine integers to perform accurate rendering! Why do I say "mostly"? Because although Loitsch's "Grisu3" algorithm is very fast, it gives up on about 0.5% of numbers, in which case you have to fall back to Dragon4 or a derivative.

    If you're a language runtime author, the Grisu algorithms are a big deal: Grisu3 is about 5 times faster than the algorithm used by printf in GNU libc, for instance. A few language implementors have already taken note: Google hired Loitsch, and the Grisu family acts as the default rendering algorithms in both the V8 and Mozilla Javascript engines (replacing David Gay's 17-year-old dtoa code). Loitsch has kindly released implementations of his Grisu algorithms as a library named double-conversion.

    @mdickinson
    Member Author

    Hadn't seen that. Interesting!

    @rhettinger
    Contributor

    Thanks for the link :-)

    @amauryfa
    Member

    I've filed bpo-12450 to track this last idea.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022