Message86158
[Raymond]
> Is there a way to use SSE when available and x86 when it's not.
I guess it's possible in theory, but I don't know of any way to do this in
practice. I suppose one could trap the SIGILL generated by the attempted
use of an SSE2 instruction on a non-supported platform---is this how
things used to work for 386s without the 387? That would make a sequence
of floating-point instructions on non-SSE2 x86 horribly slow, though.
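(In practice, runtime dispatch usually goes the other way around: detect SSE2 support once at startup and pick a code path, rather than trapping SIGILL on every instruction. As a rough illustration only, here's a Linux-specific sketch in Python; the `/proc/cpuinfo` file and its "flags" field are Linux assumptions, and the function simply reports False anywhere else. Real dispatch would use CPUID from C.)

```python
def has_sse2() -> bool:
    """Report whether the CPU advertises SSE2, by parsing /proc/cpuinfo.

    Linux-only sketch: on other platforms the file won't exist and we
    conservatively return False.
    """
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                # Each CPU's feature list appears on a line like
                # "flags : fpu vme ... sse sse2 ..."
                if line.startswith("flags"):
                    return "sse2" in line.split()
    except OSError:
        pass
    return False

print(has_sse2())
```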
Antoine: as Raymond said, the advantage of SSE2 for numeric work is
accuracy, predictability, and consistency across platforms. The SSE2
instructions finally put an end to all the problems arising from the
mismatch between the precision of the x87 floating-point registers (64
bits) and the precision of a C double (53 bits). Those difficulties
include (1) unpredictable rounding of intermediate values from 64-bit
precision to 53-bit precision, due to spilling of temporaries from FPU
registers to memory, and (2) double-rounding. The arithmetic of Python
itself is largely immune to the former, but not the latter. (And of
course the register spilling still causes headaches for various bits of
CPython).
Those difficulties can be *mostly* dealt with by setting the x87 rounding
precision to double (instead of extended), though this doesn't fix the
exponent range, so one still ends up with double-rounding on underflow.
The catch is that one can't mess with the x87 state globally, as various
library functions (especially libm functions) might depend on it being in whatever the OS considers to be the default state.
There's a very nice paper by David Monniaux that covers all this:
definitely recommended reading after you've finished Goldberg's "What
Every Computer Scientist...". It can currently be found at:
http://hal.archives-ouvertes.fr/hal-00128124/en/
An example: in Python (any version), try this:
>>> 1e16 + 2.9999
10000000000000002.0
On OS X, Windows and FreeBSD you'll get the answer above.
(OS X gcc uses SSE2 by default; Windows and FreeBSD both
make the default x87 rounding-precision 53 bits).
On 32-bit Linux/x86 or Solaris/x86 you'll likely get the answer
10000000000000004.0
instead, because Linux doesn't (usually?) change the Intel default
rounding precision of 64 bits. Using SSE2 instead of the x87 would have
fixed this.
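The discrepancy can be reproduced exactly in pure Python, without any x87 hardware, by doing the arithmetic with `fractions.Fraction`: round the exact sum once to 53 bits (what SSE2 does) versus first to 64 bits and then to 53 (what an x87 at extended precision does when the result is spilled to a double). The rounding helper below is my own sketch of round-to-nearest-even at a given precision, not CPython code.

```python
from fractions import Fraction

def round_nearest_even(x: Fraction, prec: int) -> Fraction:
    """Round x to `prec` significant bits, ties to even."""
    if x == 0:
        return x
    sign = 1 if x > 0 else -1
    x = abs(x)
    # Find e with 2**e <= x < 2**(e+1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Scale so the significand becomes a prec-bit integer.
    ulp = Fraction(2) ** (e - prec + 1)
    q = x / ulp
    whole, rem = divmod(q.numerator, q.denominator)
    # Round half to even on the leftover fraction rem/denominator.
    if 2 * rem > q.denominator or (2 * rem == q.denominator and whole % 2):
        whole += 1
    return sign * whole * ulp

# Exact values of the two doubles in the example (1e16 is exactly
# representable; 2.9999 is the nearest double to 2.9999).
s = Fraction(1e16) + Fraction(2.9999)

once = round_nearest_even(s, 53)                            # SSE2: one rounding
twice = round_nearest_even(round_nearest_even(s, 64), 53)   # x87: 64 bits, then 53

print(once)   # 10000000000000002
print(twice)  # 10000000000000004
```

The second result lands on ...004 because the 64-bit rounding pushes the sum up to exactly 10000000000000003.0, which is then a tie between ...002 and ...004, and ties-to-even picks ...004.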
</standard x87 rant>

2009-04-19 08:44:26 | mark.dickinson