Message 86048 - Python tracker

Author	mark.dickinson
Recipients	alexandre.vassalotti, amaury.forgeotdarc, christian.heimes, eric.smith, gvanrossum, jaredgrubb, mark.dickinson, nascheme, noam, preston, rhettinger, tim.peters
Date	2009-04-16.21:52:35
Message-id	<1239918759.55.0.086597935097.issue1580@psf.upfronthosting.co.za>

Content

The py3k-short-float-repr branch has been merged to py3k in two parts:

r71663 is mostly concerned with the inclusion of David Gay's code into the 
core, and the necessary floating-point fixups to allow Gay's code to be 
used (SSE2 detection, x87 control word manipulation, etc.).

r71665 contains Eric's *mammoth* rewrite and upgrade of all the float 
formatting code to use the new _Py_dg_dtoa and _Py_dg_strtod functions.

Note: the new code doesn't give short float repr on *all* platforms, 
though it's close.  The sticking point is that Gay's code needs 53-bit 
rounding precision, and the x87 FPU uses 64-bit rounding precision by 
default (though some operating systems---e.g., FreeBSD, Windows---change 
that default).  For the record, here's the strategy that I used:  feedback 
(esp. from experts) would be appreciated.
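
To make the short-versus-long distinction concrete, here is a small Python sketch.  It relies on sys.float_repr_style, which reports 'short' or 'legacy' on Python 3.1 and later, and it uses a 17-significant-digit rendering as a stand-in for the old long repr.  The helper name describe_repr is made up for illustration.

```python
import sys


def describe_repr(x: float) -> dict:
    """Compare the current repr of x with a 17-significant-digit rendering."""
    return {
        "style": sys.float_repr_style,           # 'short' or 'legacy'
        "repr": repr(x),                          # what the interpreter shows
        "seventeen_digits": format(x, ".17g"),    # stand-in for the old long repr
        "round_trips": float(repr(x)) == x,       # repr must always round-trip
    }


if __name__ == "__main__":
    for value in (0.1, 1.1, 2.675):
        print(describe_repr(value))
```

On a build where the short repr is in use, repr(0.1) is '0.1', while the 17-digit rendering is '0.10000000000000001'.  Both strings convert back to the same double.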

- If the float format is not IEEE 754, don't use Gay's code.

- Otherwise, if we're not on x86, we're probably fine.
  (Historically, there are other FPUs that have similar
  problems---e.g., the Motorola 68881/2 FPUs---but I think they're
  all too old to be worth worrying about by now.)

- x86-64 in 64-bit mode is also fine: there, SSE2 is the default.
  x86-64 in 32-bit mode (e.g., 32-bit Linux on Core 2 Duo) has
  the same problems as plain x86.  (OS X is fine: its gcc has
  SSE2 enabled by default even for 32-bit mode.)

- Windows/x86 appears to set rounding precision to 53 bits by default,
  so we're okay there too.

So:

- On gcc/x86, detect the availability of SSE2 (by examining
  the result of the cpuid instruction) and add the appropriate
  flags (-msse2 -mfpmath=sse) to BASECFLAGS if SSE2 is available.

- On gcc/x86, if SSE2 is *not* available, so that we're using the
  x87 FPU, use inline assembler to set the rounding precision
  (and rounding mode) before calling Gay's code, and restore
  the FPU state directly afterwards.   Use of inline assembler is pretty
  horrible, but it seems to be *more* portable than any of the
  alternatives.  The official C99 way is to use fegetenv/fesetenv to get
  and set the floating-point environment, but the fenv_t type used to
  store the environment can (and will) vary from platform to platform.

- There's an autoconf test for double-rounding.  If there's no
  evidence of double rounding then it's likely to be safe to use
  Gay's code: double rounding is an almost unavoidable symptom of
  64-bit precision on x87.
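
For readers unfamiliar with the x87 control word, the following Python sketch shows only the bit arithmetic behind "set the rounding precision (and rounding mode)".  It cannot touch the real FPU: reading and writing the control word needs the fnstcw/fldcw instructions, which is what the inline assembler is for.  The field layout follows the Intel architecture manuals; the function and constant names are my own and are not the ones used in the branch.

```python
# x87 control word fields (layout per the Intel architecture manuals).
PRECISION_MASK = 0x0300   # bits 8-9: precision control
PRECISION_53 = 0x0200     # 10b: double precision (53-bit significand)
ROUNDING_MASK = 0x0C00    # bits 10-11: rounding control, 00b = round to nearest

_PRECISION_WIDTHS = {0x0000: 24, 0x0200: 53, 0x0300: 64}


def with_53bit_round_nearest(control_word: int) -> int:
    """Return control_word with 53-bit precision and round-to-nearest selected."""
    cleared = control_word & ~(PRECISION_MASK | ROUNDING_MASK) & 0xFFFF
    return cleared | PRECISION_53


def precision_bits(control_word: int) -> int:
    """Decode the precision-control field into a significand width in bits."""
    field = control_word & PRECISION_MASK
    if field not in _PRECISION_WIDTHS:
        raise ValueError("reserved precision-control encoding")
    return _PRECISION_WIDTHS[field]


if __name__ == "__main__":
    saved = 0x037F                        # x87 power-on default: 64-bit precision
    temporary = with_53bit_round_nearest(saved)
    print(hex(temporary), precision_bits(temporary))
    # ... the conversion code would run here with `temporary` loaded ...
    print(hex(saved), precision_bits(saved))   # then the saved word is restored
```

The save, modify, restore pattern is the important part: the caller's FPU state must come back unchanged after the conversion.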

So on non-Windows x86 platforms that *aren't* using gcc and *do* exhibit
double rounding (implying that they're not using SSE2, and that the OS 
doesn't change the FPU rounding precision to 53 bits), we're out of luck.  
In this case the old long float repr is used.
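
As an illustration of what double rounding means here, the sketch below simulates it with exact rational arithmetic instead of relying on the host FPU.  The sum 1e16 + 2.99999 is an example chosen for this note (an assumption, not something quoted from the message), and round_to_bits is a hypothetical helper.

```python
from fractions import Fraction


def round_to_bits(value: Fraction, bits: int) -> Fraction:
    """Round a positive rational to `bits` significant bits, ties to even."""
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    # Guess e so that 2**(bits-1) <= value / 2**e < 2**bits, then adjust.
    e = value.numerator.bit_length() - value.denominator.bit_length() - bits
    scaled = value / Fraction(2) ** e
    while scaled >= 2 ** bits:
        e += 1
        scaled /= 2
    while scaled < 2 ** (bits - 1):
        e -= 1
        scaled *= 2
    # round() on a Fraction rounds halfway cases to the even integer.
    return round(scaled) * Fraction(2) ** e


# Exact sum of the two doubles 1e16 and 2.99999.
exact_sum = Fraction(10 ** 16) + Fraction(2.99999)

# One rounding straight to 53 bits, as with SSE2 or a 53-bit x87 setting.
once = round_to_bits(exact_sum, 53)

# Rounding to 64 bits first (x87 default), then to 53 bits on store.
twice = round_to_bits(round_to_bits(exact_sum, 64), 53)

print(once - 10 ** 16)    # 2
print(twice - 10 ** 16)   # 4
```

The first rounding to 64 bits turns 2.99999 into exactly 3, which creates a tie at 53 bits that round-half-even resolves upwards.  A single rounding never sees that tie.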

The most prominent platform that I can think of that's affected by this 
would be something like Solaris/x86 with Sun's own compiler, or more 
generally any Unix/x86 combination where the compiler isn't gcc.  Those 
platforms need to be dealt with on a case-by-case basis, by figuring out 
for each such platform how to detect and use SSE2, and how to get and set 
the x87 control word if SSE2 instructions aren't available.

Note that if any of the above heuristics is wrong and we end up using 
Gay's code inappropriately, then there will be *loud* failure:  we'll know 
about it.
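
The heuristics described in this message can be summarised as a small decision function.  This is only my reading of the strategy, written as a Python sketch; the real logic lives in the configure script and C headers, and its ordering of checks may differ.  BuildInfo and choose_repr are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BuildInfo:
    """Hypothetical summary of what a configure step would discover."""
    ieee754_double: bool
    is_32bit_x86: bool
    is_windows: bool = False
    is_gcc: bool = False
    has_sse2: bool = False
    shows_double_rounding: bool = False


def choose_repr(info: BuildInfo) -> Tuple[str, str]:
    """Return ('short' or 'long', reason) following the strategy in the message."""
    if not info.ieee754_double:
        return "long", "float format is not IEEE 754"
    if not info.is_32bit_x86:
        return "short", "not 32-bit x86, so no x87 precision problem expected"
    if info.is_windows:
        return "short", "Windows/x86 sets 53-bit rounding precision by default"
    if info.is_gcc and info.has_sse2:
        return "short", "gcc with SSE2 flags added to BASECFLAGS"
    if info.is_gcc:
        return "short", "gcc inline assembler sets the x87 control word"
    if not info.shows_double_rounding:
        return "short", "no evidence of double rounding"
    return "long", "x87 with 64-bit precision and no known way to change it"


if __name__ == "__main__":
    # Example: a Unix/x86 build with a non-gcc compiler that shows double rounding.
    other_unix = BuildInfo(ieee754_double=True, is_32bit_x86=True,
                           shows_double_rounding=True)
    print(choose_repr(other_unix))
```

Only the last branch falls back to the long repr, which matches the "out of luck" case above.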

History
Date	User	Action	Args
2009-04-16 21:52:39	mark.dickinson	set	recipients: + mark.dickinson, gvanrossum, tim.peters, nascheme, rhettinger, amaury.forgeotdarc, eric.smith, christian.heimes, alexandre.vassalotti, noam, jaredgrubb, preston
2009-04-16 21:52:39	mark.dickinson	set	messageid: <1239918759.55.0.086597935097.issue1580@psf.upfronthosting.co.za>
2009-04-16 21:52:38	mark.dickinson	link	issue1580 messages
2009-04-16 21:52:36	mark.dickinson	create