math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

serhiy-storchaka · 2017-09-29T08:50:28Z

BPO	31630
Nosy	@tim-one, @rhettinger, @mdickinson, @skrah, @serhiy-storchaka, @jeff5, @applio
PRs	bpo-31630: Skip tests for tan() near pi/2 on OpenBSD and NetBSD. #4169
Files	tanny.py tanny-openbsd.txt: tanny.py output on OpenBSD 6.1 tan_pi2.c

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2017-09-29.08:50:27.525>
labels = ['3.7', '3.8', 'type-bug', 'tests']
title = 'math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD'
updated_at = <Date 2018-07-09.08:49:42.357>
user = 'https://github.com/serhiy-storchaka'

bugs.python.org fields:

activity = <Date 2018-07-09.08:49:42.357>
actor = 'serhiy.storchaka'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Tests']
creation = <Date 2017-09-29.08:50:27.525>
creator = 'serhiy.storchaka'
dependencies = []
files = ['47183', '47184', '47245']
hgrepos = []
issue_num = 31630
keywords = ['patch']
message_count = 30.0
messages = ['303311', '303472', '303477', '303478', '303479', '303484', '303497', '303556', '303559', '303574', '304488', '305188', '305189', '305190', '305191', '305328', '305330', '305333', '305335', '305390', '305391', '305396', '305398', '305436', '306478', '306480', '306525', '306531', '306558', '321302']
nosy_count = 7.0
nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'skrah', 'serhiy.storchaka', 'jeff.allen', 'davin']
pr_nums = ['4169']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue31630'
versions = ['Python 2.7', 'Python 3.6', 'Python 3.7', 'Python 3.8']

serhiy-storchaka · 2017-09-29T08:50:27Z

test_math and test_cmath fail on OpenBSD 6.1.

======================================================================
FAIL: test_testfile (test.test_math.MathTests)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/home/serhiy/py/cpython3.7/Lib/test/test_math.py", line 1349, in test_testfile
    '\n  '.join(failures))
AssertionError: Failures in test_testfile:
  tan0064: tan(1.5707963267948961): expected 1978937966095219.0, got 1978945885716843.0 (error = 7.92e+09 (31678486496 ulps); permitted error = 0 or 5 ulps)

----------------------------------------------------------------------

======================================================================
FAIL: test_specific_values (test.test_cmath.CMathTests)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/home/serhiy/py/cpython3.7/Lib/test/test_cmath.py", line 418, in test_specific_values
    msg=error_message)
  File "/home/serhiy/py/cpython3.7/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual
    '{!r} and {!r} are not sufficiently close'.format(a, b))
AssertionError: tan0064: tan(complex(1.5707963267948961, 0.0))
Expected: complex(1978937966095219.0, 0.0)
Received: complex(1978945885716843.0, 0.0)
Received value insufficiently close to expected value.

Similar issue is bpo-27953.

rhettinger · 2017-10-01T17:53:44Z

I would have thought this would be a straight pass-through to the underlying hardware FPU tan() instruction so that it would give the same answer for the same hardware regardless of O/S.

Perhaps we're seeing a software-only implementation (using CORDIC or somesuch). If so, inputs close to a singularity are the place where the implementation is likely to fall apart (the relative error in this example is 1.000004001955473).

mdickinson · 2017-10-01T18:31:01Z

I'm fairly sure most modern OSs on x86-64 use software implementations of sin, cos and tan. At least, I hope so: the x87 hardware versions are notoriously inaccurate:

https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/

mdickinson · 2017-10-01T18:32:39Z

So actually, the cause of the inaccuracy here could be that OpenBSD _is_ using the hardware instructions ...

tim-one · 2017-10-01T19:02:35Z

Of course the relationship is extremely delicate near pi/2. On my Windows Python 3:

>>> import math
>>> (1.5707963267948961).hex()
'0x1.921fb54442d16p+0'
>>> math.tan(float.fromhex('0x1.921fb54442d16p+0')) # what the test expects
1978937966095219.0
>>> math.tan(float.fromhex('0x1.921fb54442d15p+0')) # input 1 ulp less
1374823386397210.2
>>> math.tan(float.fromhex('0x1.921fb54442d17p+0')) # input 1 ulp more
3530114321217157.5

Interestingly, wxMaxima on the same box reproduces the OpenBSD result:

(%i1) tan(1.5707963267948961);
(%o1) 1.978945885716843*10^15

But I don't know how Maxima (or OpenBSD) implement tan().

skrah · 2017-10-01T19:17:31Z

OpenBSD reduces to the range [-pi/4,pi/4]:

https://github.com/openbsd/src/blob/master/lib/libm/src/s_tan.c

According to the man page on i386:

http://man.openbsd.org/FreeBSD-11.0/math.3

"Argument reduction is not performed accurately for huge arguments, resulting in large errors for such arguments to cos(), sin(), and tan()."

pi/2 is not exactly "huge", but it would be interesting if this
occurred on i386.

tim-one · 2017-10-01T21:47:59Z

Ah! So it looks like OpenBSD took its math functions from Sun's fdlibm. I believe wxMaxima does too. That would certainly explain why they give the same answer ;-)

So who's right? Using "bigfloats" in wxMaxima and feeding its bigfloat tan() the exact decimal value corresponding to the binary double denoted by 1.5707963267948961, it reproduces the result the test is expecting. Ditto by writing my own tan function, using decimal with 50 digits, using the straightforward series expansions for sin and cos, and then dividing to get tan (slow, but without any attempt at "cleverness" it's easy to trust it).

So if fdlibm's author were still active, I'm sure he'd consider this to be a bug - it _intended_ to have < 1 ulp error in almost all cases.

It's not argument reduction that's hurting it. fdlibm stores pi/2 to waaaay more bits than are needed to get the right answer via -1/tan(x - pi/2).

Regardless, there's nothing Python can do about it short of changing the test. As the Python results I pasted before show, the OpenBSD answer _is_ correct for some (non-representable in double precision) input value within 1 ulp of the actual input value. So it's a very demanding test to get right.

tim-one · 2017-10-02T18:39:22Z

If someone opens a bug report with OpenBSD, or just for us to get more info, it could be useful to have a larger universe of troublesome tan inputs to stare at. So the attached tanny.py supplies them, testing all inputs within 100 ulps of math.pi/2 (or change N=100 to whatever you like).

There are no failures on my 64-bit Win10 3.6.1.

The "correct" answers are computed by using mpmath.tan() set to 200 mantissa bits, then rounding back to 53 bits. These all match the results from my dirt-dumb decimal tan() rounded back, but it's better to trust a widely-used package.

Of course mpmath needs to be installed ("pip install mpmath" works fine):

http://mpmath.org/

Nothing else is needed (while mpmath will exploit gmpy if it's installed, by default it sticks to pure Python code).

tim-one · 2017-10-02T19:06:54Z

Thanks for tanny-openbsd.txt, Serhiy! OpenBSD didn't get anywhere close to the best answer on any of those 201 inputs. I was hoping we could, e.g., test something a little more removed from pi/2 - but even its best cases in this range are hundreds of millions of ulps off :-(

tim-one · 2017-10-03T00:23:28Z

When Sun was developing fdlibm, I was (among other things) working on a proprietary libm for Kendall Square Research. I corresponded with fdlibm's primary author (KC Ng) often at the time. There's no way he would have left errors this egregious slip by, or neglect careful testing near singularities.

But writing portable code at this level is very delicate, and fdlibm failed in many ways when I first compiled it for the KSR machine. That was the first 64-bit box fdlibm was tried on, & it exposed many unwarranted assumptions in the code.

Browsing the OpenBSD port, they bumped into more, seemingly driven by gcc violating assumptions about how tricky casting would work - there are many places in the code that, e.g., want to view a single double as a pair of int32s (or vice versa), at the bit level. An aggressive compiler can easily be provoked into reordering operations in fatally damaging ways, if it doesn't realize that, under the maze of cast and union tricks, successive operations are actually operating on the _same_ bits.

So, if possible, it would be great to try compiling OpenBSD's libm with every conceivable optimization disabled, to see if that changes results.

Why I suspect that may be promising: as Stefan pointed out, for arguments near pi/2 fdlibm's tan() subtracts pi/2 and evaluates the result as tan(x) = -1/tan(x - pi/2). But these arguments are _so_ close to pi/2 that |x - pi/2| is "tiny". But for tiny arguments z, tan(z) == z to machine precision (tan(z) = sin(z) / cos(z) ~= z/1 = z for tiny |z|). So it's effectively using tan(x) = -1/(x - pi/2) in this range.

In the original failing, test, behold:

>>> (1.0 / 1978937966095219.0).hex() # correct
'0x1.234c4c6628b81p-51'

>>> (1.0 / 1978945885716843.0).hex() # OpenBSD
'0x1.234c000000000p-51'

It's very hard to believe all those trailing zero bits in the OpenBSD reciprocal are a coincidence. It's extremely hard to believe that fdlibm _intended_ them to be all zeroes. But it's pretty easy to believe an aggressive optimizer managed to lose the intent of the code.

tim-one · 2017-10-17T02:01:32Z

On 16 Oct 2017, exactly the same test failures were reported on python-dev:

https://mail.python.org/pipermail/python-dev/2017-October/149880.html

From the test output posted there:

"""
== CPython 3.6.3 (default, Oct 16 2017, 14:42:21) [GCC 4.7.2]
== Linux-3.2.0-4-686-pae-i686-with-debian-7.11 little-endian
"""

And I just noticed that in bpo-27953 exactly the same bad tan result() was reported (on "OS X Tiger").

It's hard to believe that three different systems (four, if I include wxMaxima) deliver exactly the same bad result by accident.

serhiy-storchaka · 2017-10-29T19:30:41Z

There is the same result on NetBSD 7.1.

tim-one · 2017-10-29T20:18:20Z

BTW, has anyone tried running a tiny C program on these platforms to see what tan(1.5707963267948961) delivers? The kind of code fdlibm uses is sensitive not only to compiler (mis)optimization, but also to stuff like how the FPU's "precision control" is set, and to whether fused multiply-add is being used, and ...

skrah · 2017-10-29T20:22:33Z

Also, does this occur in a VM on on the bare metal or both?

What leaves me puzzled is that I cannot reproduce the Linux x86 report
with almost the exact same compiler. But I'm on x64 and use -m32.

serhiy-storchaka · 2017-10-29T20:58:34Z

I have got both (!) results in the same binary on NetBSD (gcc 4.8.5).

tan(1.57079632679489611) = 1978937966095219.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
tan(1.57079632679489611) = 1978945885716843.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50

Seems the first result is calculated at compile time while the second result is calculated at run time.

On OpenBSD (gcc 4.2.1):

tan(1.57079632679489611) = 1978945885716843.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50
tan(1.57079632679489611) = 1978945885716843.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50

On Linux (gcc 7.2.0) and FreeBSD (clang 4.0.0):

tan(1.57079632679489611) = 1978937966095219.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
tan(1.57079632679489611) = 1978937966095219.000000
tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50

mdickinson · 2017-10-31T18:53:44Z

Seems the first result is calculated at compile time

Makes sense. Last time I looked, gcc uses MPFR for the compile-time calls, so I'd expect those to be correctly rounded.

skrah · 2017-10-31T19:14:34Z

So the big mystery is still:

https://mail.python.org/pipermail/python-dev/2017-October/149880.html

Could be a Linux router with some alternative libc ...

serhiy-storchaka · 2017-10-31T19:32:01Z

From 4 considered results the tests are failed on gcc 4.2.1, 4.7.2, 4.8.5, but are passes on gcc 7.2.0. I suppose this is gcc or libc bug fixed in recent versions.

skrah · 2017-10-31T19:43:00Z

On Tue, Oct 31, 2017 at 07:32:00PM +0000, Serhiy Storchaka wrote:

>>From 4 considered results the tests are failed on gcc 4.2.1, 4.7.2, 4.8.5, but are passes on gcc 7.2.0. I suppose this is gcc or libc bug fixed in recent versions.

4.7.3 passes here with -m32.

mdickinson · 2017-11-01T18:17:44Z

The big mystery for me is not "Why this is occurring in the first place?" but "What should we do about it?"

I'm really reluctant to (even conditionally) skip the test, because it's doing exactly what it's designed to do, namely detecting and reporting that Python is giving poor results in this corner case on this platform. As developers, we can blame the poor results on the platform's libm, but that doesn't help the user.

If we can find combinations of compiler flags that fix the issue for this platform, then I guess that's a solution. But what if we can't? Do we simply accept that the test suite will fail on this platform (and perhaps add a visible comment somewhere that this is a known issue, with a link to this bpo issue)?

We can certainly hack together a tan implementation that behaves better in these corner cases; it's not totally trivial, but not all that hard, either (I suspect that simply using sin / cos would already be an improvement.) But if we do this every time we encounter a platform-specific libm corner case, we'll end up with a maze of twisty windy workarounds in mathmodule.c ...

Or we could drop the fine-grained math library tests altogether: if the purpose of the math module is to wrap the platform libm and return the results it's returning, then we're already doing that perfectly well, so the tests are at best useless, and at worst problematic in cases like this. But I'm not so happy with that solution either.

Thoughts?

skrah · 2017-11-01T18:47:12Z

On Wed, Nov 01, 2017 at 06:17:44PM +0000, Mark Dickinson wrote:

I'm really reluctant to (even conditionally) skip the test, because it's doing exactly what it's designed to do, namely detecting and reporting that Python is giving poor results in this corner case on this platform. As developers, we can blame the poor results on the platform's libm, but that doesn't help the user.

I would want the tests to fail, especially if it can occur on Linux (though I
still suspect an i686 router in the reported case).

Perhaps hide them behind -uall, so that *we* see the result but just
executing ./python -m test passes.

tim-one · 2017-11-01T20:42:47Z

Since fdlibm uses tan(x) ~= -1/(x-pi/2) in this range, and the reciprocals of the bad results have a whole of bunch of trailing zero bits, my guess is that argument reduction (the "x-pi/2" part) is screwing up (losing bits of pi/2 beyond the long string of initial bits that cancel out). In which case it's likely sin(x) will return a poor result too, for the same reason (while cos(x) will return 1.0). IOW, I'd be surprised if sin(x)/cos(x) were materially better. If so, it's not just tan() that's flawed. That can be checked easily enough (e.g., change the Python program I posted to use sin() instead of tan()).

Regardless, assuming we don't want to write our own libm, it's highly desirable that platform experts be aware of the flaw(s) on platforms where it occurs. Otherwise they'll never make the noise needed to get it fixed.

OTOH, it's not an error _in_ Python if we don't supply libm, so the Python test suite really shouldn't fail on these. Spray warnings to stderr? Create a new "platform (lack of) quality" class of soft failure? "Pass or fail, period" misses the mark here :-(

tim-one · 2017-11-01T21:36:09Z

Oops! I mixed up sin and cos in that comment. If it's argument reduction that's broken, then for x near pi/2 cos(x) will be evaluated as -sin(x - pi/2), which is approximately -(x - pi/2), and so error in argument reduction (the "x - pi/2" part) will show up directly in the cos() result. So to confirm or refute that, you could replace tan by cos in the Python program I posted.

mdickinson · 2017-11-02T16:04:25Z

I'd be surprised if sin(x)/cos(x) were materially better.

Yep. I made the same assumption as you, but then "realised" that to get to the tan tests, we must already have passed all the cos tests, so cos must be okay.

I thought I'd written cos tests near pi/2 as well as tan ones, but unfortunately it turns out that's not true. My bad. Those tests should probably be added to cmath_testcases.txt.

skrah · 2017-11-18T10:13:30Z

Tim has mentioned the high quality of fdlibm, and indeed I cannot
reproduce the issue:

wget -r http://www.netlib.org/fdlibm/

Then build libm with or without optimizations, with or without -m32,
gcc or clang.

Then compile a C program that calls tan(1.5707963267948961).

This is a bit treacherous, because gcc apparently has a builtin
tan(), so one needs to make sure that it actually uses fdlibm.

gcc-4.7 -fno-builtin -O3 -m32 -Wall -W -o xxx xxx.c libm.a

All is fine here (Ubuntu Linux).

skrah · 2017-11-18T13:36:32Z

I found an unused i386 box with OpenBSD and Linux (so no VM):

A C program with tan(1.5707963267948961) is wrong on both systems.

fdlibm (directly from netlib.org) is correct on both systems.

Both OS versions are relatively old, so I cannot file an upstream issue.

I was planning to dig through the fdlibm code if there was an issue,
but it produces the correct results.

tim-one · 2017-11-20T03:56:19Z

Best I can tell, the fdlibm 5.3 on netlib was released in 2002, and essentially stopped existing as a maintained project then. Everyone else copied the source code, and made their own changes independently ever since :-( At least the folks behind the Julia language have made some effort to resurrect it as its own project:

http://openlibm.org/

Mark noted that GCC does use a different math library for tan() calls it can evaluate at compile-time. That appears to have started in gcc 4.3:

https://gcc.gnu.org/gcc-4.3/changes.html#mpfropts

In any case ... are there are any test failures here on a _current_ OS/platform? If it's only on out-of-date platforms, I'd be content to just suppress the failures on those. Unless the number of affected systems is so large that their identifiers won't fit in a file ;-)

serhiy-storchaka · 2017-11-20T07:58:37Z

Do you consider the recent stable releases of OpenBSD 6.1 (April 11, 2017) and NetBSD 7.1 (March 11, 2017) out-of-date platforms? The develop version of NetBSD 8 has the same failure.

tim-one · 2017-11-20T15:41:51Z

I have no opinion about any version of xxxBSD, because I've never used one ;-) If current versions of those do have this failure, has anyone opened a bug report on _their_ tracker(s)? I've seen no reason yet to imagine these failures are a fault in Python.

serhiy-storchaka · 2018-07-09T08:49:42Z

After fixing platform.libc_ver() in bpo-26544 we can use it for conditional skipping this assertion depending on the libc version.

encukou · 2024-04-10T11:01:58Z

Hello from 2024!
Per PEP-11, OpenBSD and NetBSD are not supported platforms.
Generally we try to be friendly to people porting to these architectures and supporting the port, but a test failing because of a libm bug is something to be patched or ignored downstream -- or ideally fixed in the libm. (Maybe it already is, didn't check.)

I'll close the bug and PR.
Like other issues for unsupported platforms, it'll be listed on the GitHub project for anyone who wants to support the platform unofficially or bring support back.
I encourage such people to add themselves to the Experts list so we know whom to ping.

serhiy-storchaka added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Sep 29, 2017

serhiy-storchaka added 3.7 (EOL) end of life tests Tests in the Lib/test dir and removed extension-modules C modules in the Modules dir labels Oct 29, 2017

serhiy-storchaka changed the title ~~math.tan has poor accuracy near pi/2 on OpenBSD~~ math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD Oct 29, 2017

serhiy-storchaka added the 3.8 only security fixes label Jul 9, 2018

ezio-melotti transferred this issue from another repository Apr 10, 2022

encukou closed this as not planned Won't fix, can't repro, duplicate, stale Apr 10, 2024

encukou added the OS-unsupported label Apr 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

serhiy-storchaka commented Sep 29, 2017

serhiy-storchaka commented Sep 29, 2017

rhettinger commented Oct 1, 2017

mdickinson commented Oct 1, 2017

mdickinson commented Oct 1, 2017

tim-one commented Oct 1, 2017

skrah mannequin commented Oct 1, 2017

tim-one commented Oct 1, 2017

tim-one commented Oct 2, 2017

tim-one commented Oct 2, 2017

tim-one commented Oct 3, 2017

tim-one commented Oct 17, 2017

serhiy-storchaka commented Oct 29, 2017

tim-one commented Oct 29, 2017

skrah mannequin commented Oct 29, 2017

serhiy-storchaka commented Oct 29, 2017

mdickinson commented Oct 31, 2017

skrah mannequin commented Oct 31, 2017

serhiy-storchaka commented Oct 31, 2017

skrah mannequin commented Oct 31, 2017

mdickinson commented Nov 1, 2017

skrah mannequin commented Nov 1, 2017

tim-one commented Nov 1, 2017

tim-one commented Nov 1, 2017

mdickinson commented Nov 2, 2017

skrah mannequin commented Nov 18, 2017

skrah mannequin commented Nov 18, 2017

tim-one commented Nov 20, 2017

serhiy-storchaka commented Nov 20, 2017

tim-one commented Nov 20, 2017

serhiy-storchaka commented Jul 9, 2018

encukou commented Apr 10, 2024

math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

Comments

serhiy-storchaka commented Sep 29, 2017

serhiy-storchaka commented Sep 29, 2017

rhettinger commented Oct 1, 2017

mdickinson commented Oct 1, 2017

mdickinson commented Oct 1, 2017

tim-one commented Oct 1, 2017

skrah mannequin commented Oct 1, 2017

tim-one commented Oct 1, 2017

tim-one commented Oct 2, 2017

tim-one commented Oct 2, 2017

tim-one commented Oct 3, 2017

tim-one commented Oct 17, 2017

serhiy-storchaka commented Oct 29, 2017

tim-one commented Oct 29, 2017

skrah mannequin commented Oct 29, 2017

serhiy-storchaka commented Oct 29, 2017

mdickinson commented Oct 31, 2017

skrah mannequin commented Oct 31, 2017

serhiy-storchaka commented Oct 31, 2017

skrah mannequin commented Oct 31, 2017

mdickinson commented Nov 1, 2017

skrah mannequin commented Nov 1, 2017

tim-one commented Nov 1, 2017

tim-one commented Nov 1, 2017

mdickinson commented Nov 2, 2017

skrah mannequin commented Nov 18, 2017

skrah mannequin commented Nov 18, 2017

tim-one commented Nov 20, 2017

serhiy-storchaka commented Nov 20, 2017

tim-one commented Nov 20, 2017

serhiy-storchaka commented Jul 9, 2018

encukou commented Apr 10, 2024