Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

Closed
serhiy-storchaka opened this issue Sep 29, 2017 · 31 comments
Closed

math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD #75811

serhiy-storchaka opened this issue Sep 29, 2017 · 31 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes OS-unsupported tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 31630
Nosy @tim-one, @rhettinger, @mdickinson, @skrah, @serhiy-storchaka, @jeff5, @applio
PRs
  • bpo-31630: Skip tests for tan() near pi/2 on OpenBSD and NetBSD. #4169
  • Files
  • tanny.py
  • tanny-openbsd.txt: tanny.py output on OpenBSD 6.1
  • tan_pi2.c
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-09-29.08:50:27.525>
    labels = ['3.7', '3.8', 'type-bug', 'tests']
    title = 'math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD'
    updated_at = <Date 2018-07-09.08:49:42.357>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2018-07-09.08:49:42.357>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2017-09-29.08:50:27.525>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['47183', '47184', '47245']
    hgrepos = []
    issue_num = 31630
    keywords = ['patch']
    message_count = 30.0
    messages = ['303311', '303472', '303477', '303478', '303479', '303484', '303497', '303556', '303559', '303574', '304488', '305188', '305189', '305190', '305191', '305328', '305330', '305333', '305335', '305390', '305391', '305396', '305398', '305436', '306478', '306480', '306525', '306531', '306558', '321302']
    nosy_count = 7.0
    nosy_names = ['tim.peters', 'rhettinger', 'mark.dickinson', 'skrah', 'serhiy.storchaka', 'jeff.allen', 'davin']
    pr_nums = ['4169']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue31630'
    versions = ['Python 2.7', 'Python 3.6', 'Python 3.7', 'Python 3.8']

    @serhiy-storchaka
    Copy link
    Member Author

    test_math and test_cmath fail on OpenBSD 6.1.

    ======================================================================
    FAIL: test_testfile (test.test_math.MathTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/serhiy/py/cpython3.7/Lib/test/test_math.py", line 1349, in test_testfile
        '\n  '.join(failures))
    AssertionError: Failures in test_testfile:
      tan0064: tan(1.5707963267948961): expected 1978937966095219.0, got 1978945885716843.0 (error = 7.92e+09 (31678486496 ulps); permitted error = 0 or 5 ulps)
    
    ----------------------------------------------------------------------
    
    

    ======================================================================
    FAIL: test_specific_values (test.test_cmath.CMathTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/serhiy/py/cpython3.7/Lib/test/test_cmath.py", line 418, in test_specific_values
        msg=error_message)
      File "/home/serhiy/py/cpython3.7/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual
        '{!r} and {!r} are not sufficiently close'.format(a, b))
    AssertionError: tan0064: tan(complex(1.5707963267948961, 0.0))
    Expected: complex(1978937966095219.0, 0.0)
    Received: complex(1978945885716843.0, 0.0)
    Received value insufficiently close to expected value.

    Similar issue is bpo-27953.

    @serhiy-storchaka serhiy-storchaka added extension-modules C modules in the Modules dir type-bug An unexpected behavior, bug, or error labels Sep 29, 2017
    @rhettinger
    Copy link
    Contributor

    I would have thought this would be a straight pass-through to the underlying hardware FPU tan() instruction so that it would give the same answer for the same hardware regardless of O/S.

    Perhaps we're seeing a software-only implementation (using CORDIC or somesuch). If so, inputs close to a singularity are the place where the implementation is likely to fall apart (the relative error in this example is 1.000004001955473).

    @mdickinson
    Copy link
    Member

    I'm fairly sure most modern OSs on x86-64 use software implementations of sin, cos and tan. At least, I hope so: the x87 hardware versions are notoriously inaccurate:

    https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/

    @mdickinson
    Copy link
    Member

    So actually, the cause of the inaccuracy here could be that OpenBSD _is_ using the hardware instructions ...

    @tim-one
    Copy link
    Member

    tim-one commented Oct 1, 2017

    Of course the relationship is extremely delicate near pi/2. On my Windows Python 3:

    >>> import math
    >>> (1.5707963267948961).hex()
    '0x1.921fb54442d16p+0'
    >>> math.tan(float.fromhex('0x1.921fb54442d16p+0')) # what the test expects
    1978937966095219.0
    >>> math.tan(float.fromhex('0x1.921fb54442d15p+0')) # input 1 ulp less
    1374823386397210.2
    >>> math.tan(float.fromhex('0x1.921fb54442d17p+0')) # input 1 ulp more
    3530114321217157.5

    Interestingly, wxMaxima on the same box reproduces the OpenBSD result:

    (%i1) tan(1.5707963267948961);
    (%o1) 1.978945885716843*10^15

    But I don't know how Maxima (or OpenBSD) implement tan().

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 1, 2017

    OpenBSD reduces to the range [-pi/4,pi/4]:

    https://github.com/openbsd/src/blob/master/lib/libm/src/s_tan.c

    According to the man page on i386:

    http://man.openbsd.org/FreeBSD-11.0/math.3

    "Argument reduction is not performed accurately for huge arguments, resulting in large errors for such arguments to cos(), sin(), and tan()."

    pi/2 is not exactly "huge", but it would be interesting if this
    occurred on i386.

    @tim-one
    Copy link
    Member

    tim-one commented Oct 1, 2017

    Ah! So it looks like OpenBSD took its math functions from Sun's fdlibm. I believe wxMaxima does too. That would certainly explain why they give the same answer ;-)

    So who's right? Using "bigfloats" in wxMaxima and feeding its bigfloat tan() the exact decimal value corresponding to the binary double denoted by 1.5707963267948961, it reproduces the result the test is expecting. Ditto by writing my own tan function, using decimal with 50 digits, using the straightforward series expansions for sin and cos, and then dividing to get tan (slow, but without any attempt at "cleverness" it's easy to trust it).

    So if fdlibm's author were still active, I'm sure he'd consider this to be a bug - it _intended_ to have < 1 ulp error in almost all cases.

    It's not argument reduction that's hurting it. fdlibm stores pi/2 to waaaay more bits than are needed to get the right answer via -1/tan(x - pi/2).

    Regardless, there's nothing Python can do about it short of changing the test. As the Python results I pasted before show, the OpenBSD answer _is_ correct for some (non-representable in double precision) input value within 1 ulp of the actual input value. So it's a very demanding test to get right.

    @tim-one
    Copy link
    Member

    tim-one commented Oct 2, 2017

    If someone opens a bug report with OpenBSD, or just for us to get more info, it could be useful to have a larger universe of troublesome tan inputs to stare at. So the attached tanny.py supplies them, testing all inputs within 100 ulps of math.pi/2 (or change N=100 to whatever you like).

    There are no failures on my 64-bit Win10 3.6.1.

    The "correct" answers are computed by using mpmath.tan() set to 200 mantissa bits, then rounding back to 53 bits. These all match the results from my dirt-dumb decimal tan() rounded back, but it's better to trust a widely-used package.

    Of course mpmath needs to be installed ("pip install mpmath" works fine):

    http://mpmath.org/
    

    Nothing else is needed (while mpmath will exploit gmpy if it's installed, by default it sticks to pure Python code).

    @tim-one
    Copy link
    Member

    tim-one commented Oct 2, 2017

    Thanks for tanny-openbsd.txt, Serhiy! OpenBSD didn't get anywhere close to the best answer on any of those 201 inputs. I was hoping we could, e.g., test something a little more removed from pi/2 - but even its best cases in this range are hundreds of millions of ulps off :-(

    @tim-one
    Copy link
    Member

    tim-one commented Oct 3, 2017

    When Sun was developing fdlibm, I was (among other things) working on a proprietary libm for Kendall Square Research. I corresponded with fdlibm's primary author (KC Ng) often at the time. There's no way he would have left errors this egregious slip by, or neglect careful testing near singularities.

    But writing portable code at this level is very delicate, and fdlibm failed in many ways when I first compiled it for the KSR machine. That was the first 64-bit box fdlibm was tried on, & it exposed many unwarranted assumptions in the code.

    Browsing the OpenBSD port, they bumped into more, seemingly driven by gcc violating assumptions about how tricky casting would work - there are many places in the code that, e.g., want to view a single double as a pair of int32s (or vice versa), at the bit level. An aggressive compiler can easily be provoked into reordering operations in fatally damaging ways, if it doesn't realize that, under the maze of cast and union tricks, successive operations are actually operating on the _same_ bits.

    So, if possible, it would be great to try compiling OpenBSD's libm with every conceivable optimization disabled, to see if that changes results.

    Why I suspect that may be promising: as Stefan pointed out, for arguments near pi/2 fdlibm's tan() subtracts pi/2 and evaluates the result as tan(x) = -1/tan(x - pi/2). But these arguments are _so_ close to pi/2 that |x - pi/2| is "tiny". But for tiny arguments z, tan(z) == z to machine precision (tan(z) = sin(z) / cos(z) ~= z/1 = z for tiny |z|). So it's effectively using tan(x) = -1/(x - pi/2) in this range.

    In the original failing, test, behold:

    >>> (1.0 / 1978937966095219.0).hex() # correct
    '0x1.234c4c6628b81p-51'
    
    >>> (1.0 / 1978945885716843.0).hex() # OpenBSD
    '0x1.234c000000000p-51'

    It's very hard to believe all those trailing zero bits in the OpenBSD reciprocal are a coincidence. It's extremely hard to believe that fdlibm _intended_ them to be all zeroes. But it's pretty easy to believe an aggressive optimizer managed to lose the intent of the code.

    @tim-one
    Copy link
    Member

    tim-one commented Oct 17, 2017

    On 16 Oct 2017, exactly the same test failures were reported on python-dev:

    https://mail.python.org/pipermail/python-dev/2017-October/149880.html

    From the test output posted there:

    """
    == CPython 3.6.3 (default, Oct 16 2017, 14:42:21) [GCC 4.7.2]
    == Linux-3.2.0-4-686-pae-i686-with-debian-7.11 little-endian
    """

    And I just noticed that in bpo-27953 exactly the same bad tan result() was reported (on "OS X Tiger").

    It's hard to believe that three different systems (four, if I include wxMaxima) deliver exactly the same bad result by accident.

    @serhiy-storchaka
    Copy link
    Member Author

    There is the same result on NetBSD 7.1.

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life tests Tests in the Lib/test dir and removed extension-modules C modules in the Modules dir labels Oct 29, 2017
    @serhiy-storchaka serhiy-storchaka changed the title math.tan has poor accuracy near pi/2 on OpenBSD math.tan has poor accuracy near pi/2 on OpenBSD and NetBSD Oct 29, 2017
    @tim-one
    Copy link
    Member

    tim-one commented Oct 29, 2017

    BTW, has anyone tried running a tiny C program on these platforms to see what tan(1.5707963267948961) delivers? The kind of code fdlibm uses is sensitive not only to compiler (mis)optimization, but also to stuff like how the FPU's "precision control" is set, and to whether fused multiply-add is being used, and ...

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 29, 2017

    Also, does this occur in a VM on on the bare metal or both?

    What leaves me puzzled is that I cannot reproduce the Linux x86 report
    with almost the exact same compiler. But I'm on x64 and use -m32.

    @serhiy-storchaka
    Copy link
    Member Author

    I have got both (!) results in the same binary on NetBSD (gcc 4.8.5).

    tan(1.57079632679489611) = 1978937966095219.000000
    tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
    tan(1.57079632679489611) = 1978945885716843.000000
    tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50

    Seems the first result is calculated at compile time while the second result is calculated at run time.

    On OpenBSD (gcc 4.2.1):

    tan(1.57079632679489611) = 1978945885716843.000000
    tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50
    tan(1.57079632679489611) = 1978945885716843.000000
    tan(0x1.921fb54442d16p+0) = 0x1.c1f5cfa3105acp+50

    On Linux (gcc 7.2.0) and FreeBSD (clang 4.0.0):

    tan(1.57079632679489611) = 1978937966095219.000000
    tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50
    tan(1.57079632679489611) = 1978937966095219.000000
    tan(0x1.921fb54442d16p+0) = 0x1.c1f559a01adccp+50

    @mdickinson
    Copy link
    Member

    Seems the first result is calculated at compile time

    Makes sense. Last time I looked, gcc uses MPFR for the compile-time calls, so I'd expect those to be correctly rounded.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 31, 2017

    So the big mystery is still:

    https://mail.python.org/pipermail/python-dev/2017-October/149880.html

    Could be a Linux router with some alternative libc ...

    @serhiy-storchaka
    Copy link
    Member Author

    From 4 considered results the tests are failed on gcc 4.2.1, 4.7.2, 4.8.5, but are passes on gcc 7.2.0. I suppose this is gcc or libc bug fixed in recent versions.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Oct 31, 2017

    On Tue, Oct 31, 2017 at 07:32:00PM +0000, Serhiy Storchaka wrote:

    >>From 4 considered results the tests are failed on gcc 4.2.1, 4.7.2, 4.8.5, but are passes on gcc 7.2.0. I suppose this is gcc or libc bug fixed in recent versions.

    4.7.3 passes here with -m32.

    @mdickinson
    Copy link
    Member

    The big mystery for me is not "Why this is occurring in the first place?" but "What should we do about it?"

    I'm really reluctant to (even conditionally) skip the test, because it's doing exactly what it's designed to do, namely detecting and reporting that Python is giving poor results in this corner case on this platform. As developers, we can blame the poor results on the platform's libm, but that doesn't help the user.

    If we can find combinations of compiler flags that fix the issue for this platform, then I guess that's a solution. But what if we can't? Do we simply accept that the test suite will fail on this platform (and perhaps add a visible comment somewhere that this is a known issue, with a link to this bpo issue)?

    We can certainly hack together a tan implementation that behaves better in these corner cases; it's not totally trivial, but not all that hard, either (I suspect that simply using sin / cos would already be an improvement.) But if we do this every time we encounter a platform-specific libm corner case, we'll end up with a maze of twisty windy workarounds in mathmodule.c ...

    Or we could drop the fine-grained math library tests altogether: if the purpose of the math module is to wrap the platform libm and return the results it's returning, then we're already doing that perfectly well, so the tests are at best useless, and at worst problematic in cases like this. But I'm not so happy with that solution either.

    Thoughts?

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Nov 1, 2017

    On Wed, Nov 01, 2017 at 06:17:44PM +0000, Mark Dickinson wrote:

    I'm really reluctant to (even conditionally) skip the test, because it's doing exactly what it's designed to do, namely detecting and reporting that Python is giving poor results in this corner case on this platform. As developers, we can blame the poor results on the platform's libm, but that doesn't help the user.

    I would want the tests to fail, especially if it can occur on Linux (though I
    still suspect an i686 router in the reported case).

    Perhaps hide them behind -uall, so that *we* see the result but just
    executing ./python -m test passes.

    @tim-one
    Copy link
    Member

    tim-one commented Nov 1, 2017

    Since fdlibm uses tan(x) ~= -1/(x-pi/2) in this range, and the reciprocals of the bad results have a whole of bunch of trailing zero bits, my guess is that argument reduction (the "x-pi/2" part) is screwing up (losing bits of pi/2 beyond the long string of initial bits that cancel out). In which case it's likely sin(x) will return a poor result too, for the same reason (while cos(x) will return 1.0). IOW, I'd be surprised if sin(x)/cos(x) were materially better. If so, it's not just tan() that's flawed. That can be checked easily enough (e.g., change the Python program I posted to use sin() instead of tan()).

    Regardless, assuming we don't want to write our own libm, it's highly desirable that platform experts be aware of the flaw(s) on platforms where it occurs. Otherwise they'll never make the noise needed to get it fixed.

    OTOH, it's not an error _in_ Python if we don't supply libm, so the Python test suite really shouldn't fail on these. Spray warnings to stderr? Create a new "platform (lack of) quality" class of soft failure? "Pass or fail, period" misses the mark here :-(

    @tim-one
    Copy link
    Member

    tim-one commented Nov 1, 2017

    Oops! I mixed up sin and cos in that comment. If it's argument reduction that's broken, then for x near pi/2 cos(x) will be evaluated as -sin(x - pi/2), which is approximately -(x - pi/2), and so error in argument reduction (the "x - pi/2" part) will show up directly in the cos() result. So to confirm or refute that, you could replace tan by cos in the Python program I posted.

    @mdickinson
    Copy link
    Member

    I'd be surprised if sin(x)/cos(x) were materially better.

    Yep. I made the same assumption as you, but then "realised" that to get to the tan tests, we must already have passed all the cos tests, so cos must be okay.

    I thought I'd written cos tests near pi/2 as well as tan ones, but unfortunately it turns out that's not true. My bad. Those tests should probably be added to cmath_testcases.txt.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Nov 18, 2017

    Tim has mentioned the high quality of fdlibm, and indeed I cannot
    reproduce the issue:

    wget -r http://www.netlib.org/fdlibm/
    

    Then build libm with or without optimizations, with or without -m32,
    gcc or clang.

    Then compile a C program that calls tan(1.5707963267948961).

    This is a bit treacherous, because gcc apparently has a builtin
    tan(), so one needs to make sure that it actually uses fdlibm.

    gcc-4.7 -fno-builtin -O3 -m32 -Wall -W -o xxx xxx.c libm.a

    All is fine here (Ubuntu Linux).

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Nov 18, 2017

    I found an unused i386 box with OpenBSD and Linux (so no VM):

    A C program with tan(1.5707963267948961) is wrong on both systems.

    fdlibm (directly from netlib.org) is correct on both systems.

    Both OS versions are relatively old, so I cannot file an upstream issue.

    I was planning to dig through the fdlibm code if there was an issue,
    but it produces the correct results.

    @tim-one
    Copy link
    Member

    tim-one commented Nov 20, 2017

    Best I can tell, the fdlibm 5.3 on netlib was released in 2002, and essentially stopped existing as a maintained project then. Everyone else copied the source code, and made their own changes independently ever since :-( At least the folks behind the Julia language have made some effort to resurrect it as its own project:

    http://openlibm.org/

    Mark noted that GCC does use a different math library for tan() calls it can evaluate at compile-time. That appears to have started in gcc 4.3:

    https://gcc.gnu.org/gcc-4.3/changes.html#mpfropts

    In any case ... are there are any test failures here on a _current_ OS/platform? If it's only on out-of-date platforms, I'd be content to just suppress the failures on those. Unless the number of affected systems is so large that their identifiers won't fit in a file ;-)

    @serhiy-storchaka
    Copy link
    Member Author

    Do you consider the recent stable releases of OpenBSD 6.1 (April 11, 2017) and NetBSD 7.1 (March 11, 2017) out-of-date platforms? The develop version of NetBSD 8 has the same failure.

    @tim-one
    Copy link
    Member

    tim-one commented Nov 20, 2017

    I have no opinion about any version of xxxBSD, because I've never used one ;-) If current versions of those do have this failure, has anyone opened a bug report on _their_ tracker(s)? I've seen no reason yet to imagine these failures are a fault in Python.

    @serhiy-storchaka
    Copy link
    Member Author

    After fixing platform.libc_ver() in bpo-26544 we can use it for conditional skipping this assertion depending on the libc version.

    @serhiy-storchaka serhiy-storchaka added the 3.8 only security fixes label Jul 9, 2018
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @encukou
    Copy link
    Member

    encukou commented Apr 10, 2024

    Hello from 2024!
    Per PEP-11, OpenBSD and NetBSD are not supported platforms.
    Generally we try to be friendly to people porting to these architectures and supporting the port, but a test failing because of a libm bug is something to be patched or ignored downstream -- or ideally fixed in the libm. (Maybe it already is, didn't check.)

    I'll close the bug and PR.
    Like other issues for unsupported platforms, it'll be listed on the GitHub project for anyone who wants to support the platform unofficially or bring support back.
    I encourage such people to add themselves to the Experts list so we know whom to ping.

    @encukou encukou closed this as not planned Won't fix, can't repro, duplicate, stale Apr 10, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes OS-unsupported tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    Status: No status
    Development

    No branches or pull requests

    5 participants