classification
Title: clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing
Type: behavior Stage: resolved
Components: Build, Interpreter Core, Tests Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: dim, emaste, eric.smith, haypo, koobs, mark.dickinson
Priority: critical Keywords:

Created on 2017-04-19 20:39 by haypo, last changed 2017-05-02 14:11 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
dtoa2.c haypo, 2017-04-20 16:22
dtoa5.c haypo, 2017-04-20 17:52
Pull Requests
URL Status Linked Edit
PR 1221 merged haypo, 2017-04-20 22:17
PR 1233 merged haypo, 2017-04-21 10:12
PR 1340 merged haypo, 2017-04-28 11:00
PR 1376 merged haypo, 2017-05-02 08:27
PR 1377 merged haypo, 2017-05-02 09:02
Messages (26)
msg291901 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-19 20:39
Since the build 154, many tests fail on the AMD64 FreeBSD CURRENT Debug 3.x buildbot slave because of float rounding errors. Failing tests:

* test_cmath
* test_float
* test_json
* test_marshal
* test_math
* test_statistics
* test_strtod

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/154/steps/test/logs/stdio

Problem: none of build 154 changes are related to floats

* commit f9f87f0934ca570293ba7194bed3448a7f9bf39c
* commit 947629916a5ecb1f6f6792e9b9234e084c5bf274

It looks more like a libc change of the FreeBSD CURRENT slave.

Example of errors:

======================================================================
FAIL: test_specific_values (test.test_cmath.CMathTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_cmath.py", line 420, in test_specific_values
    msg=error_message)
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual
    '{!r} and {!r} are not sufficiently close'.format(a, b))
AssertionError: acos0036: acos(complex(-1.0009999999999992, 0.0))
Expected: complex(3.141592653589793, -0.04471763360830684)
Received: complex(3.141592653589793, -0.04471763360829195)
Received value insufficiently close to expected value.

======================================================================
FAIL: test_floats (test.test_json.test_float.TestPyFloat)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_json/test_float.py", line 8, in test_floats
    self.assertEqual(float(self.dumps(num)), num)
AssertionError: 1.9275814160560202e-50 != 1.9275814160560204e-50

FAIL: test_bigcomp (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_strtod.py", line 213, in test_bigcomp
    self.check_strtod(s)
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_strtod.py", line 104, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: '0x1.8265ea9f864bcp+579' != '0x1.8265ea9f864bdp+579'
- 0x1.8265ea9f864bcp+579
?                 ^
+ 0x1.8265ea9f864bdp+579
?                 ^
 : Incorrectly rounded str->float conversion for 29865e170: expected 0x1.8265ea9f864bcp+579, got 0x1.8265ea9f864bdp+579
msg291983 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 15:41
test_strtod fails on FreeBSD using clang 4.0, but pass using clang 3.8.

Links:

* https://gcc.gnu.org/wiki/FloatingPointMath
* https://gcc.gnu.org/wiki/Math_Optimization_Flags
* ICC fp-model: https://software.intel.com/en-us/node/682946

GCC options:

* -ffp-contract=off / -ffp-contract=fast
* -fp-model (not supported by clang)
* -mfpmath=sse: no effect on Clang, it's already the default and -mfpmath=387 is not supported on x86_64 ISA
* -ffast-math
* -fexcess-precision=standard: not supported by clang
msg291984 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 16:22
I created a test case: attached dtoa2.c.

Ok with GCC 6.3.1 (on Fedora 25) and clang 3.8 (on FreeBSD 11) with -O3:

haypo@selma$ gcc -O3 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bcp+579
{bc 64 f8 a9 5e 26 28 64}

[haypo@freebsd ~/prog/python/master]$ clang -O3 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bcp+579
{bc 64 f8 a9 5e 26 28 64}


Still ok with clang 3.8 with -O1:

[haypo@freebsd ~/prog/python/master]$ clang40 -O1 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bcp+579
{bc 64 f8 a9 5e 26 28 64}


Error with clang 4.0 using -O2 (on FreeBSD 11):

[haypo@freebsd ~/prog/python/master]$ clang40 -O2 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bdp+579    <~~~~~ HERE, bd instead of bc
{bd 64 f8 a9 5e 26 28 64} <~~~~~ HERE, bd instead of bc
msg291985 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 16:24
> Still ok with clang 3.8 with -O1:

Sorry, you should read clang 4.0 here.
msg291987 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 17:52
Much simpler test:

[haypo@freebsd ~/prog/python/master]$ clang40 -O1 dtoa5.c -o x && ./x
k=1
aadj=3.364
test ok
[haypo@freebsd ~/prog/python/master]$ clang40 -O2 dtoa5.c -o x && ./x
k=1
aadj=1.682
test FAILED: BUG!
msg291992 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-20 18:39
Why is -ffast-math being used? That seems like it's asking for trouble.
msg291994 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-20 18:52
Hmm. Just looked at dto5.c; nice work reducing it to something so simple.

This is looking a bit like either a strict-aliasing related problem, or a compiler bug, or both. My understanding of the "union trick" that dtoa.c uses is that it's well-defined under C99+TC3 (though that wasn't clear under the original C99), so it shouldn't be legitimate for an optimisation to cause different results to be produced.

So this is smelling like a compiler bug to me.
msg292000 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 21:32
> Why is -ffast-math being used? That seems like it's asking for trouble.

Sorry, my comment was unclear: I only tried to list all compiler options which have an impact on floating point numbers.

> So this is smelling like a compiler bug to me.

Right, I will report it to LLVM (clang).
msg292001 - (view) Author: Dimitry Andric (dim) * Date: 2017-04-20 21:46
This is most likely the same issue we found in https://bugs.freebsd.org/216770, which was reported upstream to LLVM here: https://bugs.llvm.org//show_bug.cgi?id=31928, and resulted in a very long and not really productive discussion about whether type punning in this way is officially allowed, or a GNU extension.  I will gladly leave it to language lawyers. :)

In our case, easy fix was to use -fno-strict-aliasing, as we did here:
https://svnweb.freebsd.org/changeset/base/313706
msg292013 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 07:26
Dimitry: thank you! Indeed, that looks like the very same issue. It's a shame to have to use -fno-strict-aliasing here, since that might prevent legitimate optimisations elsewhere in the code. But if that's what we have to do to maintain correctness with clang, then I guess we have little choice.
msg292014 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 07:46
I seem the same issues on OS X (10.9.5) when compiling with clang 4.0 from MacPorts. So this doesn't seem to be FreeBSD-specific.

Changing title and raising priority: failure for repr (and marshal, JSON, etc.) to roundtrip seems like a critical breakage to me.

mdickinson$ CC=/opt/local/bin/clang-mp-4.0 ./configure && make && ./python.exe -m test test_strtod
[... log from successful build omitted ...]
Run tests sequentially
0:00:00 [1/1] test_strtod
test test_strtod failed -- multiple errors occurred; run in verbose mode for details
test_strtod failed

1 test failed:
    test_strtod

Total duration: 65 ms
Tests result: FAILURE
msg292017 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 09:24
New changeset 28205b203a4742c40080b4a2b4b2dcd800716edc by Victor Stinner in branch 'master':
bpo-30104: Use -fno-strict-aliasing on clang (#1221)
https://github.com/python/cpython/commit/28205b203a4742c40080b4a2b4b2dcd800716edc
msg292019 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 09:31
Mark: "I seem the same issues on OS X (10.9.5) when compiling with clang 4.0 from MacPorts. So this doesn't seem to be FreeBSD-specific."

I confirm, I reproduced the bug on Linux using Clang 4.0.

Mark: "But if that's what we have to do to maintain correctness with clang, then I guess we have little choice."

I pushed my change adding -fno-strict-aliasing compiler option to clang. I created the issue #30124 to fix the aliasing issue in dtoa.c.

I didn't check yet if we have to backport the -fno-strict-aliasing change, maybe we can wait until the aliasing issue is fixed and backport this future fix?
msg292021 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 10:04
FWIW, I can confirm that adding `-fno-strict-aliasing` fixes all failing float-related tests on OS X.
msg292022 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 10:13
Oh... I only tested my configure patch on Linux using ./configure CC=clang, but the change has no effect on FreeBSD where $CC is /usr/bin/cc and so my code doesn't detect clang correctly.

I proposed https://github.com/python/cpython/pull/1233 which runs "cc --version" to check if it contains clang. I tested this change on FreeBSD: it works ;-) (it adds -fno-strict-aliasing)
msg292024 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 10:35
New changeset 35f3d240ee5f0958034bd500949b08764e36f4dc by Victor Stinner in branch 'master':
bpo-30104: configure now detects when cc is clang (#1233)
https://github.com/python/cpython/commit/35f3d240ee5f0958034bd500949b08764e36f4dc
msg292028 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 11:59
Ok, the FreeBSD CURRENT buildbot is repaired: "AMD64 FreeBSD CURRENT
Debug 3.x is complete: Success".
msg292208 - (view) Author: Kubilay Kocak (koobs) Date: 2017-04-24 08:45
Thank you for investigating Victor.

This appears to require backporting to 3.6 and 3.5 which are also failing:

FAIL: test_repr (test.test_float.ReprTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.5.koobs-freebsd-current.nondebug/build/Lib/test/test_float.py", line 674, in test_repr
    self.assertEqual(v, eval(repr(v)))
AssertionError: 5.311000000000001e+245 != 5.311000000000002e+245

----------------------------------------------------------------------
Ran 42 tests in 0.242s

FAILED (failures=1, skipped=1)
7 tests failed again:
    test_cmath test_float test_json test_marshal test_math
    test_statistics test_strtod
msg292209 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-24 08:48
> This appears to require backporting to 3.6 and 3.5 which are also failing:

I'm aware of that and I plan to fix these versions as well (2.7 also, no?), but I would like to first decide if we "fix" dtoa.c aliasing, or if we restrict -fno-strict-aliasing flag on dtoa.c: see discussion on the issue #30124. It seems like restricting the flag to dtoa.c is the favorite option.
msg292218 - (view) Author: Kubilay Kocak (koobs) Date: 2017-04-24 10:56
2.7 branch on koobs-freebsd-current worker doesn't appear to be failing currently, but there may be differences in (or a lack of equivalent) coverage compared to 3.*.

I also note that -fno-strict-aliasing is being included in 2.7's compile arguments:

cc -pthread -c -fno-strict-aliasing -OPT:Olimit=0 -g -O2 -g -O0 -Wall -Wstrict-prototypes
msg292227 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-24 12:14
> I also note that -fno-strict-aliasing is being included in 2.7's compile arguments

Ah right, PyObject structures of Python 2 doesn't respect strict aliasing...
msg292526 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-28 11:05
I would like to fix FreeBSD CURRENT buildbots of Python 2.7, 3.5 and 3.6, so here my attempt to restrict the -fno-strict-aliasing option to the dtoa.c file: https://github.com/python/cpython/pull/1340

I chose to add the flag for any C compiler. If you think that a C compiler may not support that option, we can start by only adding the option on clang, and maybe add it on other C compilers if someone complains.
msg292531 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-28 13:07
New changeset 826f83f1d562a7b878499bc3af2267cfdfe5f2f9 by Victor Stinner in branch 'master':
bpo-30104: Only use -fno-strict-aliasing on dtoa.c (#1340)
https://github.com/python/cpython/commit/826f83f1d562a7b878499bc3af2267cfdfe5f2f9
msg292728 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-05-02 08:47
New changeset 809101f14f27ddb394cd77c477470761ecf99f41 by Victor Stinner in branch '3.6':
bpo-30104: Use -fno-strict-aliasing on clang (#1376)
https://github.com/python/cpython/commit/809101f14f27ddb394cd77c477470761ecf99f41
msg292730 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-05-02 09:16
New changeset 03a7ab77d2f75323e1f3e8b6a1c164e701d58bfb by Victor Stinner in branch '3.5':
bpo-30104: Use -fno-strict-aliasing on clang (#1376) (#1377)
https://github.com/python/cpython/commit/03a7ab77d2f75323e1f3e8b6a1c164e701d58bfb
msg292760 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-05-02 14:11
dtoa.c is now compiled with -fno-string-aliasing (for any clang version, not only clang 4.0) on Python 3.5, 3.6 and 3.7.

It was decided to not touch dtoa.c to not diverge from upstream.

Thanks Dimitry Andric and Mark Dickinson for your reviews and support in this cryptic issue!

Note 1: We consider that Clang 4.0 is wrong, whereas GCC respects the C99 standard for aliasing on unions. But I don't think that it matters much who is wrong or not :-)

Note 2: Python 2.7 already used -fno-strict-aliasing on all .c files because of its design of PyObject structures, fixed in Python 3.
History
Date User Action Args
2017-05-02 14:11:03hayposetstatus: open -> closed
resolution: fixed
messages: + msg292760

stage: backport needed -> resolved
2017-05-02 09:16:24hayposetmessages: + msg292730
2017-05-02 09:02:02hayposetpull_requests: + pull_request1485
2017-05-02 08:47:40hayposetmessages: + msg292728
2017-05-02 08:27:46hayposetpull_requests: + pull_request1484
2017-04-28 13:07:12hayposetmessages: + msg292531
2017-04-28 11:05:10hayposetmessages: + msg292526
2017-04-28 11:00:07hayposetpull_requests: + pull_request1451
2017-04-24 12:14:07hayposetmessages: + msg292227
2017-04-24 10:56:51koobssetmessages: + msg292218
2017-04-24 08:48:14hayposetmessages: + msg292209
2017-04-24 08:46:49koobssettype: behavior
stage: backport needed
2017-04-24 08:45:47koobssetmessages: + msg292208
versions: + Python 3.5, Python 3.6
2017-04-21 11:59:01hayposetmessages: + msg292028
2017-04-21 10:35:26hayposetmessages: + msg292024
2017-04-21 10:13:29hayposetmessages: + msg292022
2017-04-21 10:12:11hayposetpull_requests: + pull_request1351
2017-04-21 10:07:03eric.smithsetnosy: + eric.smith
2017-04-21 10:04:39mark.dickinsonsetmessages: + msg292021
2017-04-21 09:31:41hayposetmessages: + msg292019
2017-04-21 09:24:36hayposetmessages: + msg292017
2017-04-21 08:49:41mark.dickinsonsetcomponents: + Build
2017-04-21 07:46:10mark.dickinsonsetpriority: normal -> critical

messages: + msg292014
title: Float rounding errors on AMD64 FreeBSD CURRENT Debug 3.x buildbot -> clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing
2017-04-21 07:26:39mark.dickinsonsetmessages: + msg292013
2017-04-20 23:33:42emastesetnosy: + emaste
2017-04-20 22:17:40hayposetpull_requests: + pull_request1344
2017-04-20 21:46:32dimsetnosy: + dim
messages: + msg292001
2017-04-20 21:32:08hayposetmessages: + msg292000
2017-04-20 18:52:16mark.dickinsonsetmessages: + msg291994
2017-04-20 18:39:52mark.dickinsonsetmessages: + msg291992
2017-04-20 17:52:13hayposetfiles: + dtoa5.c

messages: + msg291987
2017-04-20 16:24:47hayposetmessages: + msg291985
2017-04-20 16:22:49hayposetfiles: + dtoa2.c

messages: + msg291984
2017-04-20 15:41:33hayposetmessages: + msg291983
2017-04-19 20:39:39hayposetnosy: + mark.dickinson
components: + Interpreter Core, Tests
2017-04-19 20:39:22haypocreate