classification
Title: Backport py3k float repr to trunk
Type: enhancement
Stage: resolved
Components: Interpreter Core
Versions: Python 2.7
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: mark.dickinson
Nosy List: amaury.forgeotdarc, eric.smith, haypo, mark.dickinson, michael.foord, rhettinger, tim.peters
Priority: normal
Keywords: patch

Created on 2009-10-13 08:30 by mark.dickinson, last changed 2011-06-30 11:17 by haypo. This issue is now closed.

Files
File name Uploaded Description
round_fixup.patch mark.dickinson, 2009-11-03 16:15 Backport of py3k round to trunk.
Messages (33)
msg93918 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-13 08:30
See the thread starting at:

http://mail.python.org/pipermail/python-dev/2009-October/092958.html

Eric suggested that we don't need a separate branch for this; sounds
fine to me.  It should still be possible to do the backport in stages,
though.  Something like the following?

(1) Check in David Gay's code plus necessary build changes,
configuration steps, etc;  conversions still use the old code.

(2) Switch to using the new code for float -> string (str, repr, float
formatting) and string -> float conversions (float, complex
constructors, numeric literals in Python code).  [Substeps?]

(3) Fix up builtin round function to use the new code.

(4) Make any necessary fixes to the documentation.  (Raymond, I assume
you'll take care of the whatsnew changes when the time comes?)

(1), (3) and (4) should be straightforward.  (2) is where most of the
work is, I think.  I think it should be possible to do the stage (2)
work in pieces without breaking too much.
msg94414 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 13:41
Some key revision numbers from the py3k short float repr, for reference:

r71663:  include Gay's code, build and configure fixes
r71723:  backout SSE2 detection code added in r71663
r71665:  rewrite of float formatting code to use Gay's code

Backported most of r71663 and r71723 to trunk in:

r75651: Add dtoa.c, dtoa.h, update license.
r75658: configuration changes - detect float endianness,
        add functions to get and set x87 control word, and
        determine when short float repr can be used.

Significant changes from r71663 not yet included:

* Misc/NEWS update
* Lib/test/formatfloat_testcases.txt needs updating to match py3k.
msg94417 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 14:06
r75666: Add sys.float_repr_style attribute.
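For illustration only, the attribute can be inspected from Python as below. This is a small sketch, not part of the commit; the helper name `current_float_repr_style` is made up for the example, and the fallback to 'legacy' for interpreters that lack the attribute is an assumption of the sketch.

```python
import sys

def current_float_repr_style():
    """Return the float repr style reported by the running interpreter."""
    # 'short'  -> repr(float) uses the new short, round-tripping output.
    # 'legacy' -> repr(float) keeps the older behaviour.
    # Interpreters without the attribute are treated as 'legacy' here
    # (an assumption made for this sketch).
    return getattr(sys, "float_repr_style", "legacy")

print(current_float_repr_style())
```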
msg94428 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 16:02
r75672:  temporarily disable the short float repr while we're putting
         the pieces in place.

When testing, the disablement can be disabled (hah) by defining the 
PY_SHORT_FLOAT_REPR preprocessor symbol, e.g. (on Unix) with

CC='gcc -DPY_SHORT_FLOAT_REPR' ./configure && make
msg94430 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-24 17:04
I think the next step on my side is to remove _PyOS_double_to_string,
and make all of the internal code call PyOS_double_to_string. The
distinction is that _PyOS_double_to_string gets passed a buffer and
length, but PyOS_double_to_string returns allocated memory that must be
freed. David Gay's code (_Py_dg_dtoa) returns allocated memory, so
that's the easiest interface to propagate internally.

That's the approach we used in the py3k branch. I'll start work on it.
So Mark's work should be mostly config stuff and hooking up Gay's code
to PyOS_double_to_string. I think it will basically match the py3k version.

The existing _PyOS_double_to_string will become the basis for the
fallback code for use when PY_NO_SHORT_FLOAT_REPR is defined (and it
will then be renamed PyOS_double_to_string and have its signature
changed to match).
msg94431 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-24 18:15
One issue occurs to me: should the backport change the behaviour of the 
round function?

In py3k, round consistently uses round-half-to-even for halfway cases.
In trunk, round semi-consistently uses round-half-away-from-zero (and 
this is documented).  E.g., round(1.25, 1) will give 1.2 in py3k and 
(usually) 1.3 in trunk.

I definitely want to use Gay's code for round in 2.7, since having round 
work sensibly is part of the motivation for the backport in the first 
place.  But this naturally leads to a round-half-to-even version of 
round, since the Python-adapted version of Gay's code isn't capable of 
doing anything other than round-half-to-even at the moment.

Options:

(1) change round in 2.7 to do round-half-to-even.  This is easy,
    natural, and means that round will agree with float formatting
    (which does round-half-to-even in both py3k and trunk).  But it
    may break existing applications.  However:  (a) those applications
    would need fixing anyway to work with py3k, and (b) I have little
    sympathy for people depending on behaviour of rounding of
    *binary* floats for halfway *decimal* cases.  (Decimal is another
    matter, of course:  there it's perfectly reasonable to expect
    guaranteed rounding behaviour.)

    It's more complicated than that, though, since if rounding
    becomes round-half-to-even for floats, it should also change
    for integers, Fractions, etc.

(2) have round stick with round-half-away-from-zero.  This may be
    awkward to implement (though I have some half-formed ideas about
    how to make it work), and would lead to round occasionally not
    agreeing with float formatting.  For example:

    >>> '{0:.1f}'.format(1.25)
    '1.2'
    >>> round(1.25, 1)
    1.3
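The two tie-breaking rules under discussion can be compared side by side with the decimal module. This is only a sketch to make the options concrete: it is not how round is implemented in either branch, and the helper name `round_with_mode` is invented for the example.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_with_mode(x, ndigits, mode):
    """Round float x to ndigits decimal places using a decimal rounding mode."""
    # Decimal(x) converts the binary float exactly, so a tie is detected on
    # the value actually stored, not on the literal that was typed.
    quantum = Decimal(1).scaleb(-ndigits)  # ndigits=1 gives Decimal('0.1')
    return float(Decimal(x).quantize(quantum, rounding=mode))

# 1.25 is exactly representable in binary, so it is a genuine halfway case.
print(round_with_mode(1.25, 1, ROUND_HALF_EVEN))   # 1.2
print(round_with_mode(1.25, 1, ROUND_HALF_UP))     # 1.3 (ties go away from zero)

# 2.675 is stored as a value slightly below 2.675, so it is not a tie and
# both modes agree.
print(round_with_mode(2.675, 2, ROUND_HALF_EVEN))  # 2.67
print(round_with_mode(2.675, 2, ROUND_HALF_UP))    # 2.67
```

The second pair shows why depending on the rounding of binary floats for apparent decimal halfway cases is fragile: most such literals are not stored as exact ties.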
msg94433 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-24 18:29
Adding tim_one as nosy. He'll no doubt have an opinion on rounding. And
hopefully he'll share it!
msg94436 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-24 21:17
Another thing to consider is that in py3k we removed all conversions of
'f' to 'g', such as this one from Objects/unicodeobject.c:

    if (type == 'f' && fabs(x) >= 1e50)
        type = 'g';

Should we also do that as part of this exercise? Or should it be another
issue, or not done at all?
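To make the effect concrete, here is a small Python sketch of what 'f' formatting looks like once the switch is gone, as on interpreters that no longer do the conversion (Python 2.7 and 3.1 or later). The helper name `format_fixed` is made up for the example.

```python
def format_fixed(x, precision=0):
    """Format x in fixed-point ('f') notation with the given precision."""
    # "%.*f" takes the precision from the argument tuple.
    return "%.*f" % (precision, x)

# Without the 'f' -> 'g' switch, a value at or above 1e50 is still written
# out digit by digit rather than in exponent notation.
text = format_fixed(1e50)
print(text)
print(len(text))  # 51 digits, no exponent
```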
msg94447 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-10-25 05:56
+1 on backporting the 'f' and 'g' work also.
We will be well served by getting the two
code bases in sync with one another.

Eliminating obscure differences makes it easier
to port code from 2.x to 3.x
msg94491 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 16:03
r75720: Backport py3k version of pystrtod.c to trunk.  There are still
        some (necessary) differences between the two versions, which
        should become unnecessary once everything else is hooked up.
        The differences should be re-examined later.
msg94494 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 17:01
[Eric, on removing f to g conversions]
> Should we also do that as part of this exercise? Or should it be another
> issue, or not done at all?

I'd definitely like to remove the f to g conversion in trunk.  I don't 
see any great need to open a separate issue for that.  (Was there one 
already for the py3k removal?)
msg94495 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 17:03
Found it: issue #5859 was opened for the removal of the f -> g conversion 
in py3k.  We could just add a note to that issue.
msg94510 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 21:19
r75730: backport pystrtod.h
r75731: Fix floatobject.c to use PyOS_string_to_double.
msg94528 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-26 22:30
r75739: Fix complexobject.c to use PyOS_string_to_double.
msg94550 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 11:37
r75743: Fix cPickle.c to use PyOS_string_to_double.
msg94551 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 12:13
r75745: Fix stropmodule.c to use PyOS_string_to_double.
msg94572 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 18:37
r75824: Fix ast.c to use PyOS_string_to_double.
msg94575 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-27 19:43
r75846: Fix marshal.c to use PyOS_string_to_double.
msg94615 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-10-28 08:46
r75913: Fix _json.c to use PyOS_string_to_double. Change made after
consulting with Bob Ippolito.

This completes the removal of calls to PyOS_ascii_strtod.
msg94655 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-29 10:17
The next job is to deprecate PyOS_ascii_atof and PyOS_ascii_strtod, I 
think.  I'll get to work on that.
msg94747 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-10-31 09:44
r75979:  Deprecate PyOS_ascii_atof and PyOS_ascii_strtod;  document
         PyOS_double_to_string.
msg94862 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-03 16:15
Here's a patch for correctly-rounded round in trunk.  This patch doesn't 
change the rounding behaviour between 2.6 and 2.7:  it's still doing 
round-half-away-from-zero instead of round-half-even.  It was necessary to 
detect and treat halfway cases specially to make this work.  Removing this 
special case code would be easy, so we can decide later whether it's worth 
changing round to do round-half-to-even for 2.7.

I want to let this sit for a couple of days before I apply it.
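The idea of detecting halfway cases can be sketched in Python with exact rational arithmetic. This is an illustration of the concept only, not the C code in round_fixup.patch; the helper name `is_decimal_halfway` is invented for the example, and it assumes a finite float.

```python
from fractions import Fraction

def is_decimal_halfway(x, ndigits):
    """Return True if the float x lies exactly halfway between two
    adjacent multiples of 10**-ndigits."""
    # Fraction(x) is the exact value of the binary float, so no rounding
    # error is introduced by the scaling below.
    scaled = Fraction(x) * Fraction(10) ** ndigits
    # A tie has a fractional part of exactly 1/2 after scaling.
    return scaled % 1 == Fraction(1, 2)

print(is_decimal_halfway(1.25, 1))    # True: 1.25 is stored exactly
print(is_decimal_halfway(2.675, 2))   # False: stored value is not exactly 2.675
print(is_decimal_halfway(150.0, -2))  # True: halfway between 100 and 200
```

Only inputs for which this test is true need special treatment; for everything else the two rounding rules give the same result.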
msg95444 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-18 19:35
r76373:  Backport round.
msg95507 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-19 18:44
Short float repr is now enabled in r76379.

Misc/NEWS entries added/updated in r76411.
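For reference, the user-visible effect can be checked with a few lines of Python. This is a sketch for finite floats; the helper name `repr_round_trips` is made up for the example, and the short output for 0.1 is only expected when sys.float_repr_style is 'short'.

```python
import sys

def repr_round_trips(x):
    """Return True if converting x to its repr and back recovers x exactly."""
    return float(repr(x)) == x

# With the short repr, 0.1 prints as '0.1' instead of 17 significant digits.
if getattr(sys, "float_repr_style", "legacy") == "short":
    print(repr(0.1))        # 0.1
print("%.17g" % 0.1)        # 0.10000000000000001

# Both strings denote the same double; the short one is simply the
# shortest string that still round-trips.
print(repr_round_trips(0.1))
print(float("0.1") == float("0.10000000000000001"))
```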
msg95644 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-23 18:48
r76465 removes the fixed-length buffer for formatting floats, hence
removes the restriction on the precision.   This should make removal of 
the %f -> %g switch straightforward.
msg95658 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-23 20:56
r76474: Remove %f -> %g switch.
msg95700 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-24 21:46
I think we're pretty much done here.

I'd still like to produce a more complete set of float formatting test 
cases at some point (for both trunk and py3k), but that's a separate 
activity.

Eric, Raymond:  can you spot anything we've missed?
msg95703 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-11-24 22:46
Thanks for tackling the last few bits, Mark. I think we're done,
although I admit I haven't verified what state the documentation is in.

I suggest we close this issue and if any problems occur open them as new
issues.
msg95714 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2009-11-25 10:41
Thanks, Eric.

The only remaining documentation issues I'm aware of are in 
Doc/tutorial/floatingpoint.rst.  I think Raymond is going to update this 
to match the py3k version.

I'll call this done, then!  Thanks for all your help.
msg139402 - (view) Author: Michael Foord (michael.foord) * (Python committer) Date: 2011-06-29 09:48
Wondered if you guys had heard of some recent advances in the state of the art in this field. I'm sure you have, but thought I'd link it here anyway.

Quote taken from this article (which links to relevant papers):

http://www.serpentine.com/blog/2011/06/29/here-be-dragons-advances-in-problems-you-didnt-even-know-you-had/

In 2010, Florian Loitsch published a wonderful paper in PLDI, "Printing floating-point numbers quickly and accurately with integers", which represents the biggest step in this field in 20 years: he mostly figured out how to use machine integers to perform accurate rendering! Why do I say "mostly"? Because although Loitsch's "Grisu3" algorithm is very fast, it gives up on about 0.5% of numbers, in which case you have to fall back to Dragon4 or a derivative.

If you're a language runtime author, the Grisu algorithms are a big deal: Grisu3 is about 5 times faster than the algorithm used by printf in GNU libc, for instance. A few language implementors have already taken note: Google hired Loitsch, and the Grisu family acts as the default rendering algorithms in both the V8 and Mozilla Javascript engines (replacing David Gay's 17-year-old dtoa code). Loitsch has kindly released implementations of his Grisu algorithms as a library named double-conversion.
msg139428 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-06-29 15:23
Hadn't seen that.  Interesting!
msg139444 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2011-06-29 18:22
Thanks for the link :-)
msg139467 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2011-06-30 08:19
I've filed issue12450 to track this last idea.
History
Date User Action Args
2011-06-30 11:17:01 haypo set nosy: + haypo
2011-06-30 08:19:33 amaury.forgeotdarc set nosy: + amaury.forgeotdarc; messages: + msg139467
2011-06-29 18:22:50 rhettinger set messages: + msg139444
2011-06-29 15:23:24 mark.dickinson set messages: + msg139428
2011-06-29 09:48:42 michael.foord set nosy: + michael.foord; messages: + msg139402
2009-11-25 10:41:11 mark.dickinson set status: open -> closed; resolution: accepted; messages: + msg95714; stage: resolved
2009-11-24 22:46:41 eric.smith set messages: + msg95703
2009-11-24 21:46:20 mark.dickinson set messages: + msg95700
2009-11-23 20:56:26 mark.dickinson set messages: + msg95658
2009-11-23 18:48:26 mark.dickinson set messages: + msg95644
2009-11-19 18:44:41 mark.dickinson set messages: + msg95507
2009-11-18 19:35:41 mark.dickinson set messages: + msg95444
2009-11-03 16:15:16 mark.dickinson set files: + round_fixup.patch; keywords: + patch; messages: + msg94862
2009-10-31 09:44:47 mark.dickinson set messages: + msg94747
2009-10-29 10:17:42 mark.dickinson set messages: + msg94655
2009-10-28 08:46:29 eric.smith set messages: + msg94615
2009-10-27 19:43:22 eric.smith set messages: + msg94575
2009-10-27 18:37:04 eric.smith set messages: + msg94572
2009-10-27 12:13:34 eric.smith set messages: + msg94551
2009-10-27 11:37:44 eric.smith set messages: + msg94550
2009-10-26 22:30:02 mark.dickinson set messages: + msg94528
2009-10-26 21:19:17 mark.dickinson set messages: + msg94510
2009-10-26 17:03:15 mark.dickinson set messages: + msg94495
2009-10-26 17:01:09 mark.dickinson set messages: + msg94494
2009-10-26 16:03:25 mark.dickinson set messages: + msg94491
2009-10-25 05:56:18 rhettinger set messages: + msg94447
2009-10-24 21:17:23 eric.smith set messages: + msg94436
2009-10-24 18:29:43 eric.smith set nosy: + tim.peters; messages: + msg94433
2009-10-24 18:15:40 mark.dickinson set messages: + msg94431
2009-10-24 17:04:39 eric.smith set messages: + msg94430
2009-10-24 16:02:51 mark.dickinson set messages: + msg94428
2009-10-24 14:06:27 mark.dickinson set messages: + msg94417
2009-10-24 13:41:04 mark.dickinson set messages: + msg94414
2009-10-13 08:30:25 mark.dickinson create