classification
Title: Sin(x) is Wrong
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 3.1
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: derekroconnor, eric.smith, isandler, mark.dickinson, tim.peters
Priority: normal Keywords:

Created on 2010-04-04 01:04 by derekroconnor, last changed 2010-04-04 19:45 by derekroconnor. This issue is now closed.

Messages (8)
msg102312 - (view) Author: Derek O'Connor (derekroconnor) Date: 2010-04-04 01:04
Dell Precision 690, Intel 2xQuad-Core E5345  @ 2.33GHz
16GB RAM, Windows7 64-bit Professional.

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) 
[MSC v.1500 32 bit (Intel)] on win32

>>> from math import sin
>>> sin(1e22)
0.41214336710708466   (WRONG)
>>>

Python 3.1.2 (r312:79149, Mar 20 2010, 22:55:39)
[MSC v.1500 64 bit (AMD64)] on win32

>>> from math import sin
>>> sin(1e22)
-0.8522008497671888  (CORRECT)

The correct result, rounded to 20 digits is

 sin(10^22) = -8.5220 08497 67188 80177 e-001

Please note that 10^22 is exactly representable as an IEEE double precision floating point number, whose binary representation is

10^22 = 1000011110000110011110000011001001101110101011001001 x 2^22

Hence fl(10^22) = 10^22, whereas fl(10^23) ~= 10^23.

This incorrect result suggests that the range-reduction step, where the argument x is reduced to lie in a small interval around 0, such as [-pi/2,+pi/2], has not been done properly. This means that the other trigonometric functions will be incorrect.

Yours sincerely,

Derek O'Connor
msg102319 - (view) Author: Ilya Sandler (isandler) Date: 2010-04-04 04:36
I believe python is fully at mercy of underlying system math library.

And as a matter of fact, this C program

#include <math.h>
#include <stdio.h>
main() { printf("%.6f\n", sin(1e22)); }

when compiled as 64-bit app (on linux) produces "-0.852201", but when it's compiled as a 32-bit app on the same machine (gcc -m32), it produces "0.462613".

So I don't think this is a bug in python.
msg102324 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-04 08:17
What Ilya Sandler said!

Computing sin or cos with large arguments requires high precision for the intermediate calculations (e.g., for sin(1e22) you'd need around 40 digits of precision for the reduction step), so most math libraries don't bother.  

This is also the reason that Intel's x87 FSIN instruction (which may or may not be being used here) requires that the argument be in the range -2**63 to 2**63.

Short of Python using its own math library, or adding a dependence on a correctly rounded math library like MPFR or crlibm, there's no real way to fix this.

Out of curiosity, what application do you have that requires evaluating sin for such large arguments?
msg102329 - (view) Author: Derek O'Connor (derekroconnor) Date: 2010-04-04 09:44
Reply to Mark Dickinson

Python 3.1.2 -- 32 bit gives sin(2^60) = -0.7391806966492228
PariGp 2.3.4           gives sin(2^60) = -0.8306492176372546505752817956

So it seems Intel's x87 FSIN is not being used.

Application? I don't have one, but it is not too hard to imagine that buried deep in a large simulation, sin(x) is presented with a large argument, and whose wrong result is never noticed until ...

I quote from Ng's paper :
http://www.derekroconnor.net/Software/Ng--ArgReduction.pdf
--------------------------
"It is often argued that being concerned  about  large  arguments
is  unnecessary, because sophisticated users simply know better 
than to compute with large  angles.  It  is  our contention that this position is suboptimal, because:

    1. It places an unnecessary burden on the user.

    2. The consequences of producing incorrect (inaccurate)answers
       may be catastrophic;  many people assume that computers can
       do arithmetic very well.  While  numerical  analysts know better,
       not all programmers are numerical analysts, nor should they be.

    3. It is a vendors responsibility  to  provide  answers that
       are as correct as possible."
-------------------

Derek O'Connor
msg102330 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-04 09:59
> Python 3.1.2 -- 32 bit gives sin(2^60) = -0.7391806966492228

Interesting.  I get 

>>> sin(2.**60)
-0.83064921763725463

On both OS X 10.6.3 (64-bit build) and OS X 10.5.something (32-bit build).  But this is all down to the platform math library, of course.

I agree it's not ideal, but it's not really in Python's power to fix this, without adding a huge extra maintenance burden (from adding the necessary high-precision arithmetic to Python itself) or another external dependency.

You could always try to come up with a patch that adds a configure option for linking Python's math library into MPFR instead of the system math library.  (crlibm would also work, and would be faster;  it doesn't seem to be so well maintained these days though;  Pari/GP isn't such a great option since its floating-point model isn't so compatible with IEEE 754).
msg102343 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2010-04-04 13:48
At heart, this is really the same kind of thing that eventually prodded Python into adopting David Gay's burdensome ;-) code for correctly rounded double<->string I/O conversions.  We could similarly take on porting and maintaining KC Ng's fdlibm (an open source libm implementation Ng developed while at Sun - it doesn't guarantee correct rounding, but does guarantee max error strictly less than 1 ULP in all cases).

The tradeoffs are roughly similar.  For example, fdlibm is typically much slower than the platform C's libm, and most people using math libraries in anger are much keener about speed than avoiding noise results for inputs in ranges they never intend to use.  Perhaps surprisingly, fdlibm isn't "so slow" primarily because it's aiming at max 1 ULP error, but because it was coded to be as portable as possible.  Platform-specific libm implementations play every trick in the book (& invented many of the tricks in the book to begin with!) to squeeze out every cycle from the specific hardware they're intended to run on.  That makes "fast versus accurate versus portable" very much a "pick at most two" proposition for libm.
msg102344 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-04-04 14:11
> We could similarly take on porting and maintaining KC Ng's fdlibm

Gulp.

If we did that, I'd definitely want to push for dropping Python support for non-IEEE 754 systems.

I'm not sure I've fully recovered from the dtoa.c addition yet.  :)
msg102348 - (view) Author: Derek O'Connor (derekroconnor) Date: 2010-04-04 19:45
@ Tim Peters

" ... are much keener about speed than avoiding noise results for inputs in ranges they *never intend to use* ".

It is the *unintended use* that worries me. Sadly, "Speed is King". Perhaps that aphorism should be "Speed is Ki(lli)ng".

@ Mark Dickinson

I don't write C but Gay's 4300 line dtoa.c is the most frightening piece of code I've ever seen. 

I think accountants have the right idea: money is in cents, with positives in the left column and negatives in the right. All they need is integer addition and comparison. Multiplication? Only if you have taxation.

Derek O'Connor
History
Date User Action Args
2010-04-04 19:45:31derekroconnorsetmessages: + msg102348
2010-04-04 14:11:43mark.dickinsonsetmessages: + msg102344
2010-04-04 13:48:52tim.peterssetnosy: + tim.peters
messages: + msg102343
2010-04-04 09:59:11mark.dickinsonsetmessages: + msg102330
2010-04-04 09:44:50derekroconnorsetmessages: + msg102329
2010-04-04 08:17:39mark.dickinsonsetstatus: open -> closed
resolution: wont fix
messages: + msg102324
2010-04-04 05:41:55eric.smithsetnosy: + eric.smith
type: performance -> behavior
2010-04-04 04:36:19isandlersetnosy: + isandler
messages: + msg102319
2010-04-04 01:05:57ezio.melottisetnosy: + mark.dickinson

stage: test needed
2010-04-04 01:04:32derekroconnorcreate