classification
Title: math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD
Type: behavior Stage: patch review
Components: Tests Versions: Python 3.8, Python 3.7, Python 3.6, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: davin, jeff.allen, mark.dickinson, rhettinger, serhiy.storchaka, skrah, tim.peters
Priority: normal Keywords: patch

Created on 2017-09-29 08:50 by serhiy.storchaka, last changed 2018-07-09 08:49 by serhiy.storchaka.

Files
File name Uploaded Description Edit
tanny.py tim.peters, 2017-10-02 18:39
tanny-openbsd.txt serhiy.storchaka, 2017-10-02 18:57 tanny.py output on OpenBSD 6.1
tan_pi2.c serhiy.storchaka, 2017-10-29 20:58
Pull Requests
URL Status Linked Edit
PR 4169 open serhiy.storchaka, 2017-10-29 20:12
Messages (30)
msg303311 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-29 08:50
test_math and test_cmath fail on OpenBSD 6.1.

======================================================================
FAIL: test_testfile (test.test_math.MathTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython3.7/Lib/test/test_math.py", line 1349, in test_testfile
    '\n  '.join(failures))
AssertionError: Failures in test_testfile:
  tan0064: tan(1.5707963267948961): expected 1978937966095219.0, got 1978945885716843.0 (error = 7.92e+09 (31678486496 ulps); permitted error = 0 or 5 ulps)

----------------------------------------------------------------------

======================================================================
FAIL: test_specific_values (test.test_cmath.CMathTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython3.7/Lib/test/test_cmath.py", line 418, in test_specific_values
    msg=error_message)
  File "/home/serhiy/py/cpython3.7/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual
    '{!r} and {!r} are not sufficiently close'.format(a, b))
AssertionError: tan0064: tan(complex(1.5707963267948961, 0.0))
Expected: complex(1978937966095219.0, 0.0)
Received: complex(1978945885716843.0, 0.0)
Received value insufficiently close to expected value.

----------------------------------------------------------------------

Similar issue is issue27953.
msg303472 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-10-01 17:53
I would have thought this would be a straight pass-through to the underlying hardware FPU tan() instruction so that it would give the same answer for the same hardware regardless of O/S.

Perhaps we're seeing a software-only implementation (using CORDIC or somesuch).  If so, inputs close to a singularity are the place where the implementation is likely to fall apart (the relative error in this example is 1.000004001955473).

* https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html
* https://stackoverflow.com/questions/16880376/sin-cos-tan-not-accurate
* http://code.activestate.com/recipes/576792-polar-to-rectangular-conversions-using-cordic/
msg303477 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-10-01 18:31
I'm fairly sure most modern OSs on x86-64 use software implementations of sin, cos and tan. At least, I hope so: the x87 hardware versions are notoriously inaccurate:

https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/
msg303478 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-10-01 18:32
So actually, the cause of the inaccuracy here could be that OpenBSD _is_ using the hardware instructions ...
msg303479 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-01 19:02
Of course the relationship is extremely delicate near pi/2.  On my Windows Python 3:

>>> import math
>>> (1.5707963267948961).hex()
'0x1.921fb54442d16p+0'
>>> math.tan(float.fromhex('0x1.921fb54442d16p+0')) # what the test expects
1978937966095219.0
>>> math.tan(float.fromhex('0x1.921fb54442d15p+0')) # input 1 ulp less
1374823386397210.2
>>> math.tan(float.fromhex('0x1.921fb54442d17p+0')) # input 1 ulp more
3530114321217157.5

Interestingly, wxMaxima on the same box reproduces the OpenBSD result:

(%i1) tan(1.5707963267948961);
(%o1) 1.978945885716843*10^15

But I don't know how Maxima (or OpenBSD) implement tan().
msg303484 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-10-01 19:17
OpenBSD reduces to the range [-pi/4,pi/4]:

   https://github.com/openbsd/src/blob/master/lib/libm/src/s_tan.c


According to the man page on i386:

   http://man.openbsd.org/FreeBSD-11.0/math.3

"Argument reduction is not performed accurately for huge arguments, resulting in large errors for such arguments to cos(), sin(), and tan()."


pi/2 is not exactly "huge", but it would be interesting if this
occurred on i386.
msg303497 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-01 21:47
Ah!  So it looks like OpenBSD took its math functions from Sun's fdlibm.  I believe wxMaxima does too.  That would certainly explain why they give the same answer ;-)

So who's right?  Using "bigfloats" in wxMaxima and feeding its bigfloat tan() the exact decimal value corresponding to the binary double denoted by 1.5707963267948961, it reproduces the result the test is expecting.  Ditto by writing my own tan function, using `decimal` with 50 digits, using the straightforward series expansions for sin and cos, and then dividing to get tan (slow, but without any attempt at "cleverness" it's easy to trust it).

So if fdlibm's author were still active, I'm sure he'd consider this to be a bug - it _intended_ to have < 1 ulp error in almost all cases.

It's not argument reduction that's hurting it.  fdlibm stores pi/2 to waaaay more bits than are needed to get the right answer via -1/tan(x - pi/2).  

Regardless, there's nothing Python can do about it short of changing the test.  As the Python results I pasted before show, the OpenBSD answer _is_ correct for some (non-representable in double precision) input value within 1 ulp of the actual input value.  So it's a very demanding test to get right.
msg303556 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-02 18:39
If someone opens a bug report with OpenBSD, or just for us to get more info, it could be useful to have a larger universe of troublesome tan inputs to stare at.  So the attached tanny.py supplies them, testing all inputs within 100 ulps of math.pi/2 (or change N=100 to whatever you like).

There are no failures on my 64-bit Win10 3.6.1.

The "correct" answers are computed by using mpmath.tan() set to 200 mantissa bits, then rounding back to 53 bits.  These all match the results from my dirt-dumb decimal tan() rounded back, but it's better to trust a widely-used package.

Of course mpmath needs to be installed ("pip install mpmath" works fine):

    http://mpmath.org/

Nothing else is needed (while mpmath will exploit gmpy if it's installed, by default it sticks to pure Python code).
msg303559 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-02 19:06
Thanks for tanny-openbsd.txt, Serhiy!  OpenBSD didn't get anywhere close to the best answer on any of those 201 inputs.  I was hoping we could, e.g., test something a little more removed from pi/2 - but even its best cases in this range are hundreds of millions of ulps off :-(
msg303574 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-03 00:23
When Sun was developing fdlibm, I was (among other things) working on a proprietary libm for Kendall Square Research.  I corresponded with fdlibm's primary author (KC Ng) often at the time.  There's no way he would have left errors this egregious slip by, or neglect careful testing near singularities.

But writing portable code at this level is very delicate, and fdlibm failed in many ways when I first compiled it for the KSR machine.  That was the first 64-bit box fdlibm was tried on, & it exposed many unwarranted assumptions in the code.

Browsing the OpenBSD port, they bumped into more, seemingly driven by gcc violating assumptions about how tricky casting would work - there are many places in the code that, e.g., want to view a single double as a pair of int32s (or vice versa), at the bit level.  An aggressive compiler can easily be provoked into reordering operations in fatally damaging ways, if it doesn't realize that, under the maze of cast and union tricks, successive operations are actually operating on the _same_ bits.

So, if possible, it would be great to try compiling OpenBSD's libm with every conceivable optimization disabled, to see if that changes results.

Why I suspect that may be promising:  as Stefan pointed out, for arguments near pi/2 fdlibm's tan() subtracts pi/2 and evaluates the result as tan(x) = -1/tan(x - pi/2).  But these arguments are _so_ close to pi/2 that |x - pi/2| is "tiny".  But for tiny arguments z, tan(z) == z to machine precision (tan(z) = sin(z) / cos(z) ~= z/1 = z for tiny |z|).  So it's effectively using tan(x) = -1/(x - pi/2) in this range.

In the original failing, test, behold:

>>> (1.0 / 1978937966095219.0).hex() # correct
'0x1.234c4c6628b81p-51'

>>> (1.0 / 1978945885716843.0).hex() # OpenBSD
'0x1.234c000000000p-51'

It's very hard to believe all those trailing zero bits in the OpenBSD reciprocal are a coincidence.  It's extremely hard to believe that fdlibm _intended_ them to be all zeroes.  But it's pretty easy to believe an aggressive optimizer managed to lose the intent of the code.
msg304488 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-17 02:01
On 16 Oct 2017, exactly the same test failures were reported on python-dev:

https://mail.python.org/pipermail/python-dev/2017-October/149880.html

From the test output posted there:

"""
== CPython 3.6.3 (default, Oct 16 2017, 14:42:21) [GCC 4.7.2]
== Linux-3.2.0-4-686-pae-i686-with-debian-7.11 little-endian
"""

And I just noticed that in issue27953 exactly the same bad tan result() was reported (on "OS X Tiger").

It's hard to believe that three different systems (four, if I include wxMaxima) deliver exactly the same bad result by accident.
msg305188 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-29 19:30
There is the same result on NetBSD 7.1.
msg305189 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-10-29 20:18
BTW, has anyone tried running a tiny C program on these platforms to see what tan(1.5707963267948961) delivers?  The kind of code fdlibm uses is sensitive not only to compiler (mis)optimization, but also to stuff like how the FPU's "precision control" is set, and to whether fused multiply-add is being used, and ...
msg305190 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-10-29 20:22
Also, does this occur in a VM on on the bare metal or both?

What leaves me puzzled is that I cannot reproduce the Linux x86 report
with almost the exact same compiler. But I'm on x64 and use -m32.
msg305191 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-29 20:58
I have got both (!) results in the same binary on NetBSD (gcc 4.8.5).

tan(1.57079632679489611) = 1978937966095219.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
tan(1.57079632679489611) = 1978945885716843.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50

Seems the first result is calculated at compile time while the second result is calculated at run time.

On OpenBSD (gcc 4.2.1):

tan(1.57079632679489611) = 1978945885716843.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50
tan(1.57079632679489611) = 1978945885716843.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50

On Linux (gcc 7.2.0) and FreeBSD (clang 4.0.0):

tan(1.57079632679489611) = 1978937966095219.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
tan(1.57079632679489611) = 1978937966095219.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
msg305328 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-10-31 18:53
> Seems the first result is calculated at compile time

Makes sense. Last time I looked, gcc uses MPFR for the compile-time calls, so I'd expect those to be correctly rounded.
msg305330 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-10-31 19:14
So the big mystery is still:

   https://mail.python.org/pipermail/python-dev/2017-October/149880.html


Could be a Linux router with some alternative libc ...
msg305333 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-10-31 19:32
From 4 considered results the tests are failed on gcc 4.2.1, 4.7.2, 4.8.5, but are passes on gcc 7.2.0. I suppose this is gcc or libc bug fixed in recent versions.
msg305335 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-10-31 19:42
On Tue, Oct 31, 2017 at 07:32:00PM +0000, Serhiy Storchaka wrote:
> >>From 4 considered results the tests are failed on gcc 4.2.1, 4.7.2, 4.8.5, but are passes on gcc 7.2.0. I suppose this is gcc or libc bug fixed in recent versions.

4.7.3 passes here with -m32.
msg305390 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-11-01 18:17
The big mystery for me is not "Why this is occurring in the first place?" but "What should we do about it?"

I'm really reluctant to (even conditionally) skip the test, because it's doing exactly what it's designed to do, namely detecting and reporting that Python is giving poor results in this corner case on this platform. As developers, we can blame the poor results on the platform's libm, but that doesn't help the user.

If we can find combinations of compiler flags that fix the issue for this platform, then I guess that's a solution. But what if we can't? Do we simply accept that the test suite will fail on this platform (and perhaps add a visible comment somewhere that this is a known issue, with a link to this bpo issue)?

We can certainly hack together a tan implementation that behaves better in these corner cases; it's not totally trivial, but not all that hard, either (I suspect that simply using `sin / cos` would already be an improvement.) But if we do this every time we encounter a platform-specific libm corner case, we'll end up with a maze of twisty windy workarounds in mathmodule.c ...

Or we could drop the fine-grained math library tests altogether: if the purpose of the math module is to wrap the platform libm and return the results it's returning, then we're already doing that perfectly well, so the tests are at best useless, and at worst problematic in cases like this. But I'm not so happy with that solution either.

Thoughts?
msg305391 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-11-01 18:47
On Wed, Nov 01, 2017 at 06:17:44PM +0000, Mark Dickinson wrote:
> I'm really reluctant to (even conditionally) skip the test, because it's doing exactly what it's designed to do, namely detecting and reporting that Python is giving poor results in this corner case on this platform. As developers, we can blame the poor results on the platform's libm, but that doesn't help the user.

I would want the tests to fail, especially if it can occur on Linux (though I
still suspect an i686 router in the reported case).

Perhaps hide them behind -uall, so that *we* see the result but just
executing ./python -m test passes.
msg305396 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-11-01 20:42
Since fdlibm uses tan(x) ~= -1/(x-pi/2) in this range, and the reciprocals of the bad results have a whole of bunch of trailing zero bits, my guess is that argument reduction (the "x-pi/2" part) is screwing up (losing bits of pi/2 beyond the long string of initial bits that cancel out).  In which case it's likely sin(x) will return a poor result too, for the same reason (while cos(x) will return 1.0).  IOW, I'd be surprised if sin(x)/cos(x) were materially better.  If so, it's not just tan() that's flawed.  That can be checked easily enough (e.g., change the Python program I posted to use sin() instead of tan()).

Regardless, assuming we don't want to write our own libm, it's highly desirable that platform experts be aware of the flaw(s) on platforms where it occurs.  Otherwise they'll never make the noise needed to get it fixed.

OTOH, it's not an error _in_ Python if we don't supply libm, so the Python test suite really shouldn't fail on these.  Spray warnings to stderr?  Create a new "platform (lack of) quality" class of soft failure?  "Pass or fail, period" misses the mark here :-(
msg305398 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-11-01 21:36
Oops!  I mixed up `sin` and `cos` in that comment.  If it's argument reduction that's broken, then for x near pi/2 cos(x) will be evaluated as -sin(x - pi/2), which is approximately -(x - pi/2), and so error in argument reduction (the "x - pi/2" part) will show up directly in the cos() result.  So to confirm or refute that, you could replace `tan` by `cos` in the Python program I posted.
msg305436 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-11-02 16:04
> I'd be surprised if sin(x)/cos(x) were materially better.

Yep. I made the same assumption as you, but then "realised" that to get to the tan tests, we must already have passed all the cos tests, so cos must be okay. 

I thought I'd written cos tests near pi/2 as well as tan ones, but unfortunately it turns out that's not true. My bad. Those tests should probably be added to cmath_testcases.txt.
msg306478 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-11-18 10:13
Tim has mentioned the high quality of fdlibm, and indeed I cannot
reproduce the issue:

    wget -r http://www.netlib.org/fdlibm/

Then build libm with or without optimizations, with or without -m32,
gcc or clang.


Then compile a C program that calls tan(1.5707963267948961).


This is a bit treacherous, because gcc apparently has a builtin
tan(), so one needs to make sure that it actually uses fdlibm.

gcc-4.7 -fno-builtin -O3 -m32 -Wall -W -o xxx xxx.c libm.a


All is fine here (Ubuntu Linux).
msg306480 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2017-11-18 13:36
I found an unused i386 box with OpenBSD and Linux (so no VM):

A C program with tan(1.5707963267948961) is wrong on both systems.

fdlibm (directly from netlib.org) is correct on both systems.


Both OS versions are relatively old, so I cannot file an upstream issue.

I was planning to dig through the fdlibm code if there was an issue,
but it produces the correct results.
msg306525 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-11-20 03:56
Best I can tell, the fdlibm 5.3 on netlib was released in 2002, and essentially stopped existing as a maintained project then.  Everyone else copied the source code, and made their own changes independently ever since :-(  At least the folks behind the Julia language have made some effort to resurrect it as its own project:

http://openlibm.org/

Mark noted that GCC does use a different math library for tan() calls it can evaluate at compile-time.  That appears to have started in gcc 4.3:

https://gcc.gnu.org/gcc-4.3/changes.html#mpfropts

In any case ... are there are any test failures here on a _current_ OS/platform?  If it's only on out-of-date platforms, I'd be content to just suppress the failures on those.  Unless the number of affected systems is so large that their identifiers won't fit in a file ;-)
msg306531 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-11-20 07:58
Do you consider the recent stable releases of  OpenBSD 6.1 (April 11, 2017) and NetBSD 7.1 (March 11, 2017) out-of-date platforms? The develop version of NetBSD 8 has the same failure.
msg306558 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2017-11-20 15:41
I have no opinion about any version of xxxBSD, because I've never used one ;-)  If current versions of those do have this failure, has anyone opened a bug report on _their_ tracker(s)?  I've seen no reason yet to imagine these failures are a fault in Python.
msg321302 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-07-09 08:49
After fixing platform.libc_ver() in issue26544 we can use it for conditional skipping this assertion depending on the libc version.
History
Date User Action Args
2018-07-09 08:49:42serhiy.storchakasetmessages: + msg321302
versions: + Python 3.8
2017-11-20 15:41:51tim.peterssetmessages: + msg306558
2017-11-20 07:58:37serhiy.storchakasetmessages: + msg306531
2017-11-20 03:56:19tim.peterssetmessages: + msg306525
2017-11-18 13:36:32skrahsetmessages: + msg306480
2017-11-18 10:13:30skrahsetmessages: + msg306478
2017-11-02 16:04:24mark.dickinsonsetmessages: + msg305436
2017-11-01 21:36:08tim.peterssetmessages: + msg305398
2017-11-01 20:42:47tim.peterssetmessages: + msg305396
2017-11-01 18:47:11skrahsetmessages: + msg305391
2017-11-01 18:17:44mark.dickinsonsetmessages: + msg305390
2017-10-31 19:43:00skrahsetmessages: + msg305335
2017-10-31 19:32:00serhiy.storchakasetmessages: + msg305333
2017-10-31 19:14:33skrahsetmessages: + msg305330
2017-10-31 18:53:43mark.dickinsonsetmessages: + msg305328
2017-10-29 20:58:34serhiy.storchakasetfiles: + tan_pi2.c

messages: + msg305191
2017-10-29 20:22:33skrahsetmessages: + msg305190
2017-10-29 20:18:19tim.peterssetmessages: + msg305189
2017-10-29 20:13:57serhiy.storchakasettitle: math.tan has poor accuracy near pi/2 on OpenBSD -> math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD
components: + Tests, - Extension Modules
versions: + Python 2.7, Python 3.6, Python 3.7
2017-10-29 20:12:55serhiy.storchakasetkeywords: + patch
stage: patch review
pull_requests: + pull_request4138
2017-10-29 19:30:40serhiy.storchakasetmessages: + msg305188
2017-10-17 21:25:45jeff.allensetnosy: + jeff.allen
2017-10-17 02:01:33tim.peterssetmessages: + msg304488
2017-10-03 00:23:30tim.peterssetmessages: + msg303574
2017-10-02 19:06:53tim.peterssetmessages: + msg303559
2017-10-02 18:57:53serhiy.storchakasetfiles: + tanny-openbsd.txt
2017-10-02 18:39:23tim.peterssetfiles: + tanny.py

messages: + msg303556
2017-10-01 21:47:59tim.peterssetmessages: + msg303497
2017-10-01 19:17:30skrahsetnosy: + skrah
messages: + msg303484
2017-10-01 19:02:35tim.peterssetmessages: + msg303479
2017-10-01 18:32:38mark.dickinsonsetmessages: + msg303478
2017-10-01 18:31:01mark.dickinsonsetmessages: + msg303477
2017-10-01 17:53:44rhettingersetnosy: + rhettinger, tim.peters
messages: + msg303472
2017-09-29 09:45:36serhiy.storchakasetnosy: + davin
2017-09-29 08:50:27serhiy.storchakacreate