Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing #74290

Closed
vstinner opened this issue Apr 19, 2017 · 26 comments
Closed

clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing #74290

vstinner opened this issue Apr 19, 2017 · 26 comments
Labels
3.7 (EOL) end of life build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs) tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Copy link
Member

BPO 30104
Nosy @mdickinson, @vstinner, @ericvsmith, @koobs, @grimreaper, @DimitryAndric
PRs
  • bpo-30104: Use -fno-strict-aliasing on clang #1221
  • bpo-30104: configure now detects when cc is clang #1233
  • bpo-30104: Compile dtoa.c without strict aliasing #1340
  • [3.6] bpo-30104: Use -fno-strict-aliasing on clang #1376
  • [3.5] bpo-30104: Use -fno-strict-aliasing on clang (#1376) #1377
  • Files
  • dtoa2.c
  • dtoa5.c
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-05-02.14:11:03.798>
    created_at = <Date 2017-04-19.20:39:22.235>
    labels = ['interpreter-core', 'type-bug', 'tests', 'build', '3.7']
    title = 'clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing'
    updated_at = <Date 2018-05-16.06:43:14.700>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-05-16.06:43:14.700>
    actor = 'eitan.adler'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-05-02.14:11:03.798>
    closer = 'vstinner'
    components = ['Build', 'Interpreter Core', 'Tests']
    creation = <Date 2017-04-19.20:39:22.235>
    creator = 'vstinner'
    dependencies = []
    files = ['46817', '46818']
    hgrepos = []
    issue_num = 30104
    keywords = []
    message_count = 26.0
    messages = ['291901', '291983', '291984', '291985', '291987', '291992', '291994', '292000', '292001', '292013', '292014', '292017', '292019', '292021', '292022', '292024', '292028', '292208', '292209', '292218', '292227', '292526', '292531', '292728', '292730', '292760']
    nosy_count = 7.0
    nosy_names = ['mark.dickinson', 'vstinner', 'eric.smith', 'koobs', 'eitan.adler', 'emaste', 'dim']
    pr_nums = ['1221', '1233', '1340', '1376', '1377']
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue30104'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    Since the build 154, many tests fail on the AMD64 FreeBSD CURRENT Debug 3.x buildbot slave because of float rounding errors. Failing tests:

    • test_cmath
    • test_float
    • test_json
    • test_marshal
    • test_math
    • test_statistics
    • test_strtod

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/154/steps/test/logs/stdio

    Problem: none of build 154 changes are related to floats

    It looks more like a libc change of the FreeBSD CURRENT slave.

    Example of errors:

    ======================================================================
    FAIL: test_specific_values (test.test_cmath.CMathTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_cmath.py", line 420, in test_specific_values
        msg=error_message)
      File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual
        '{!r} and {!r} are not sufficiently close'.format(a, b))
    AssertionError: acos0036: acos(complex(-1.0009999999999992, 0.0))
    Expected: complex(3.141592653589793, -0.04471763360830684)
    Received: complex(3.141592653589793, -0.04471763360829195)
    Received value insufficiently close to expected value.

    ======================================================================
    FAIL: test_floats (test.test_json.test_float.TestPyFloat)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_json/test_float.py", line 8, in test_floats
        self.assertEqual(float(self.dumps(num)), num)
    AssertionError: 1.9275814160560202e-50 != 1.9275814160560204e-50

    FAIL: test_bigcomp (test.test_strtod.StrtodTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_strtod.py", line 213, in test_bigcomp
        self.check_strtod(s)
      File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_strtod.py", line 104, in check_strtod
        "expected {}, got {}".format(s, expected, got))
    AssertionError: '0x1.8265ea9f864bcp+579' != '0x1.8265ea9f864bdp+579'
    - 0x1.8265ea9f864bcp+579
    ?                 ^
    + 0x1.8265ea9f864bdp+579
    ?                 ^
     : Incorrectly rounded str->float conversion for 29865e170: expected 0x1.8265ea9f864bcp+579, got 0x1.8265ea9f864bdp+579

    @vstinner vstinner added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) tests Tests in the Lib/test dir labels Apr 19, 2017
    @vstinner
    Copy link
    Member Author

    test_strtod fails on FreeBSD using clang 4.0, but pass using clang 3.8.

    Links:

    GCC options:

    • -ffp-contract=off / -ffp-contract=fast
    • -fp-model (not supported by clang)
    • -mfpmath=sse: no effect on Clang, it's already the default and -mfpmath=387 is not supported on x86_64 ISA
    • -ffast-math
    • -fexcess-precision=standard: not supported by clang

    @vstinner
    Copy link
    Member Author

    I created a test case: attached dtoa2.c.

    Ok with GCC 6.3.1 (on Fedora 25) and clang 3.8 (on FreeBSD 11) with -O3:

    haypo@selma$ gcc -O3 dtoa2.c -o x && ./x
    text: 29865e170 -> float: 2.9865e+174
    0x1.8265ea9f864bcp+579
    {bc 64 f8 a9 5e 26 28 64}

    [haypo@freebsd ~/prog/python/master]$ clang -O3 dtoa2.c -o x && ./x
    text: 29865e170 -> float: 2.9865e+174
    0x1.8265ea9f864bcp+579
    {bc 64 f8 a9 5e 26 28 64}

    Still ok with clang 3.8 with -O1:

    [haypo@freebsd ~/prog/python/master]$ clang40 -O1 dtoa2.c -o x && ./x
    text: 29865e170 -> float: 2.9865e+174
    0x1.8265ea9f864bcp+579
    {bc 64 f8 a9 5e 26 28 64}

    Error with clang 4.0 using -O2 (on FreeBSD 11):

    [haypo@freebsd ~/prog/python/master]$ clang40 -O2 dtoa2.c -o x && ./x
    text: 29865e170 -> float: 2.9865e+174
    0x1.8265ea9f864bdp+579 <~~~~~ HERE, bd instead of bc
    {bd 64 f8 a9 5e 26 28 64} <~~~~~ HERE, bd instead of bc

    @vstinner
    Copy link
    Member Author

    Still ok with clang 3.8 with -O1:

    Sorry, you should read clang 4.0 here.

    @vstinner
    Copy link
    Member Author

    Much simpler test:

    [haypo@freebsd ~/prog/python/master]$ clang40 -O1 dtoa5.c -o x && ./x
    k=1
    aadj=3.364
    test ok
    [haypo@freebsd ~/prog/python/master]$ clang40 -O2 dtoa5.c -o x && ./x
    k=1
    aadj=1.682
    test FAILED: BUG!

    @mdickinson
    Copy link
    Member

    Why is -ffast-math being used? That seems like it's asking for trouble.

    @mdickinson
    Copy link
    Member

    Hmm. Just looked at dto5.c; nice work reducing it to something so simple.

    This is looking a bit like either a strict-aliasing related problem, or a compiler bug, or both. My understanding of the "union trick" that dtoa.c uses is that it's well-defined under C99+TC3 (though that wasn't clear under the original C99), so it shouldn't be legitimate for an optimisation to cause different results to be produced.

    So this is smelling like a compiler bug to me.

    @vstinner
    Copy link
    Member Author

    Why is -ffast-math being used? That seems like it's asking for trouble.

    Sorry, my comment was unclear: I only tried to list all compiler options which have an impact on floating point numbers.

    So this is smelling like a compiler bug to me.

    Right, I will report it to LLVM (clang).

    @DimitryAndric
    Copy link
    Mannequin

    DimitryAndric mannequin commented Apr 20, 2017

    This is most likely the same issue we found in https://bugs.freebsd.org/216770, which was reported upstream to LLVM here: https://bugs.llvm.org//show_bug.cgi?id=31928, and resulted in a very long and not really productive discussion about whether type punning in this way is officially allowed, or a GNU extension. I will gladly leave it to language lawyers. :)

    In our case, easy fix was to use -fno-strict-aliasing, as we did here:
    https://svnweb.freebsd.org/changeset/base/313706

    @mdickinson
    Copy link
    Member

    Dimitry: thank you! Indeed, that looks like the very same issue. It's a shame to have to use -fno-strict-aliasing here, since that might prevent legitimate optimisations elsewhere in the code. But if that's what we have to do to maintain correctness with clang, then I guess we have little choice.

    @mdickinson
    Copy link
    Member

    I seem the same issues on OS X (10.9.5) when compiling with clang 4.0 from MacPorts. So this doesn't seem to be FreeBSD-specific.

    Changing title and raising priority: failure for repr (and marshal, JSON, etc.) to roundtrip seems like a critical breakage to me.

    mdickinson$ CC=/opt/local/bin/clang-mp-4.0 ./configure && make && ./python.exe -m test test_strtod
    [... log from successful build omitted ...]
    Run tests sequentially
    0:00:00 [1/1] test_strtod
    test test_strtod failed -- multiple errors occurred; run in verbose mode for details
    test_strtod failed

    1 test failed:
    test_strtod

    Total duration: 65 ms
    Tests result: FAILURE

    @mdickinson mdickinson changed the title Float rounding errors on AMD64 FreeBSD CURRENT Debug 3.x buildbot clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing Apr 21, 2017
    @mdickinson mdickinson added the build The build process and cross-build label Apr 21, 2017
    @vstinner
    Copy link
    Member Author

    New changeset 28205b2 by Victor Stinner in branch 'master':
    bpo-30104: Use -fno-strict-aliasing on clang (bpo-1221)
    28205b2

    @vstinner
    Copy link
    Member Author

    Mark: "I seem the same issues on OS X (10.9.5) when compiling with clang 4.0 from MacPorts. So this doesn't seem to be FreeBSD-specific."

    I confirm, I reproduced the bug on Linux using Clang 4.0.

    Mark: "But if that's what we have to do to maintain correctness with clang, then I guess we have little choice."

    I pushed my change adding -fno-strict-aliasing compiler option to clang. I created the issue bpo-30124 to fix the aliasing issue in dtoa.c.

    I didn't check yet if we have to backport the -fno-strict-aliasing change, maybe we can wait until the aliasing issue is fixed and backport this future fix?

    @mdickinson
    Copy link
    Member

    FWIW, I can confirm that adding -fno-strict-aliasing fixes all failing float-related tests on OS X.

    @vstinner
    Copy link
    Member Author

    Oh... I only tested my configure patch on Linux using ./configure CC=clang, but the change has no effect on FreeBSD where $CC is /usr/bin/cc and so my code doesn't detect clang correctly.

    I proposed #1233 which runs "cc --version" to check if it contains clang. I tested this change on FreeBSD: it works ;-) (it adds -fno-strict-aliasing)

    @vstinner
    Copy link
    Member Author

    New changeset 35f3d24 by Victor Stinner in branch 'master':
    bpo-30104: configure now detects when cc is clang (bpo-1233)
    35f3d24

    @vstinner
    Copy link
    Member Author

    Ok, the FreeBSD CURRENT buildbot is repaired: "AMD64 FreeBSD CURRENT
    Debug 3.x is complete: Success".

    @koobs
    Copy link

    koobs commented Apr 24, 2017

    Thank you for investigating Victor.

    This appears to require backporting to 3.6 and 3.5 which are also failing:

    FAIL: test_repr (test.test_float.ReprTestCase)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/buildbot/python/3.5.koobs-freebsd-current.nondebug/build/Lib/test/test_float.py", line 674, in test_repr
        self.assertEqual(v, eval(repr(v)))
    AssertionError: 5.311000000000001e+245 != 5.311000000000002e+245

    Ran 42 tests in 0.242s

    FAILED (failures=1, skipped=1)
    7 tests failed again:
    test_cmath test_float test_json test_marshal test_math
    test_statistics test_strtod

    @koobs koobs added the type-bug An unexpected behavior, bug, or error label Apr 24, 2017
    @vstinner
    Copy link
    Member Author

    This appears to require backporting to 3.6 and 3.5 which are also failing:

    I'm aware of that and I plan to fix these versions as well (2.7 also, no?), but I would like to first decide if we "fix" dtoa.c aliasing, or if we restrict -fno-strict-aliasing flag on dtoa.c: see discussion on the issue bpo-30124. It seems like restricting the flag to dtoa.c is the favorite option.

    @koobs
    Copy link

    koobs commented Apr 24, 2017

    2.7 branch on koobs-freebsd-current worker doesn't appear to be failing currently, but there may be differences in (or a lack of equivalent) coverage compared to 3.*.

    I also note that -fno-strict-aliasing is being included in 2.7's compile arguments:

    cc -pthread -c -fno-strict-aliasing -OPT:Olimit=0 -g -O2 -g -O0 -Wall -Wstrict-prototypes

    @vstinner
    Copy link
    Member Author

    I also note that -fno-strict-aliasing is being included in 2.7's compile arguments

    Ah right, PyObject structures of Python 2 doesn't respect strict aliasing...

    @vstinner
    Copy link
    Member Author

    I would like to fix FreeBSD CURRENT buildbots of Python 2.7, 3.5 and 3.6, so here my attempt to restrict the -fno-strict-aliasing option to the dtoa.c file: #1340

    I chose to add the flag for any C compiler. If you think that a C compiler may not support that option, we can start by only adding the option on clang, and maybe add it on other C compilers if someone complains.

    @vstinner
    Copy link
    Member Author

    New changeset 826f83f by Victor Stinner in branch 'master':
    bpo-30104: Only use -fno-strict-aliasing on dtoa.c (bpo-1340)
    826f83f

    @vstinner
    Copy link
    Member Author

    vstinner commented May 2, 2017

    New changeset 809101f by Victor Stinner in branch '3.6':
    bpo-30104: Use -fno-strict-aliasing on clang (bpo-1376)
    809101f

    @vstinner
    Copy link
    Member Author

    vstinner commented May 2, 2017

    New changeset 03a7ab7 by Victor Stinner in branch '3.5':
    bpo-30104: Use -fno-strict-aliasing on clang (bpo-1376) (bpo-1377)
    03a7ab7

    @vstinner
    Copy link
    Member Author

    vstinner commented May 2, 2017

    dtoa.c is now compiled with -fno-string-aliasing (for any clang version, not only clang 4.0) on Python 3.5, 3.6 and 3.7.

    It was decided to not touch dtoa.c to not diverge from upstream.

    Thanks Dimitry Andric and Mark Dickinson for your reviews and support in this cryptic issue!

    Note 1: We consider that Clang 4.0 is wrong, whereas GCC respects the C99 standard for aliasing on unions. But I don't think that it matters much who is wrong or not :-)

    Note 2: Python 2.7 already used -fno-strict-aliasing on all .c files because of its design of PyObject structures, fixed in Python 3.

    @vstinner vstinner closed this as completed May 2, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs) tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants