classification
Title: clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing
Type: Stage:
Components: Build, Interpreter Core, Tests Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: dim, emaste, eric.smith, haypo, koobs, mark.dickinson
Priority: critical Keywords:

Created on 2017-04-19 20:39 by haypo, last changed 2017-04-21 11:59 by haypo.

Files
File name Uploaded Description Edit
dtoa2.c haypo, 2017-04-20 16:22
dtoa5.c haypo, 2017-04-20 17:52
Pull Requests
URL Status Linked Edit
PR 1221 merged haypo, 2017-04-20 22:17
PR 1233 merged haypo, 2017-04-21 10:12
Messages (17)
msg291901 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-19 20:39
Since the build 154, many tests fail on the AMD64 FreeBSD CURRENT Debug 3.x buildbot slave because of float rounding errors. Failing tests:

* test_cmath
* test_float
* test_json
* test_marshal
* test_math
* test_statistics
* test_strtod

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/154/steps/test/logs/stdio

Problem: none of build 154 changes are related to floats

* commit f9f87f0934ca570293ba7194bed3448a7f9bf39c
* commit 947629916a5ecb1f6f6792e9b9234e084c5bf274

It looks more like a libc change of the FreeBSD CURRENT slave.

Example of errors:

======================================================================
FAIL: test_specific_values (test.test_cmath.CMathTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_cmath.py", line 420, in test_specific_values
    msg=error_message)
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_cmath.py", line 149, in rAssertAlmostEqual
    '{!r} and {!r} are not sufficiently close'.format(a, b))
AssertionError: acos0036: acos(complex(-1.0009999999999992, 0.0))
Expected: complex(3.141592653589793, -0.04471763360830684)
Received: complex(3.141592653589793, -0.04471763360829195)
Received value insufficiently close to expected value.

======================================================================
FAIL: test_floats (test.test_json.test_float.TestPyFloat)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_json/test_float.py", line 8, in test_floats
    self.assertEqual(float(self.dumps(num)), num)
AssertionError: 1.9275814160560202e-50 != 1.9275814160560204e-50

FAIL: test_bigcomp (test.test_strtod.StrtodTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_strtod.py", line 213, in test_bigcomp
    self.check_strtod(s)
  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current.nondebug/build/Lib/test/test_strtod.py", line 104, in check_strtod
    "expected {}, got {}".format(s, expected, got))
AssertionError: '0x1.8265ea9f864bcp+579' != '0x1.8265ea9f864bdp+579'
- 0x1.8265ea9f864bcp+579
?                 ^
+ 0x1.8265ea9f864bdp+579
?                 ^
 : Incorrectly rounded str->float conversion for 29865e170: expected 0x1.8265ea9f864bcp+579, got 0x1.8265ea9f864bdp+579
msg291983 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 15:41
test_strtod fails on FreeBSD using clang 4.0, but pass using clang 3.8.

Links:

* https://gcc.gnu.org/wiki/FloatingPointMath
* https://gcc.gnu.org/wiki/Math_Optimization_Flags
* ICC fp-model: https://software.intel.com/en-us/node/682946

GCC options:

* -ffp-contract=off / -ffp-contract=fast
* -fp-model (not supported by clang)
* -mfpmath=sse: no effect on Clang, it's already the default and -mfpmath=387 is not supported on x86_64 ISA
* -ffast-math
* -fexcess-precision=standard: not supported by clang
msg291984 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 16:22
I created a test case: attached dtoa2.c.

Ok with GCC 6.3.1 (on Fedora 25) and clang 3.8 (on FreeBSD 11) with -O3:

haypo@selma$ gcc -O3 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bcp+579
{bc 64 f8 a9 5e 26 28 64}

[haypo@freebsd ~/prog/python/master]$ clang -O3 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bcp+579
{bc 64 f8 a9 5e 26 28 64}


Still ok with clang 3.8 with -O1:

[haypo@freebsd ~/prog/python/master]$ clang40 -O1 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bcp+579
{bc 64 f8 a9 5e 26 28 64}


Error with clang 4.0 using -O2 (on FreeBSD 11):

[haypo@freebsd ~/prog/python/master]$ clang40 -O2 dtoa2.c -o x && ./x
text: 29865e170 -> float: 2.9865e+174
0x1.8265ea9f864bdp+579    <~~~~~ HERE, bd instead of bc
{bd 64 f8 a9 5e 26 28 64} <~~~~~ HERE, bd instead of bc
msg291985 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 16:24
> Still ok with clang 3.8 with -O1:

Sorry, you should read clang 4.0 here.
msg291987 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 17:52
Much simpler test:

[haypo@freebsd ~/prog/python/master]$ clang40 -O1 dtoa5.c -o x && ./x
k=1
aadj=3.364
test ok
[haypo@freebsd ~/prog/python/master]$ clang40 -O2 dtoa5.c -o x && ./x
k=1
aadj=1.682
test FAILED: BUG!
msg291992 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-20 18:39
Why is -ffast-math being used? That seems like it's asking for trouble.
msg291994 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-20 18:52
Hmm. Just looked at dto5.c; nice work reducing it to something so simple.

This is looking a bit like either a strict-aliasing related problem, or a compiler bug, or both. My understanding of the "union trick" that dtoa.c uses is that it's well-defined under C99+TC3 (though that wasn't clear under the original C99), so it shouldn't be legitimate for an optimisation to cause different results to be produced.

So this is smelling like a compiler bug to me.
msg292000 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-20 21:32
> Why is -ffast-math being used? That seems like it's asking for trouble.

Sorry, my comment was unclear: I only tried to list all compiler options which have an impact on floating point numbers.

> So this is smelling like a compiler bug to me.

Right, I will report it to LLVM (clang).
msg292001 - (view) Author: Dimitry Andric (dim) * Date: 2017-04-20 21:46
This is most likely the same issue we found in https://bugs.freebsd.org/216770, which was reported upstream to LLVM here: https://bugs.llvm.org//show_bug.cgi?id=31928, and resulted in a very long and not really productive discussion about whether type punning in this way is officially allowed, or a GNU extension.  I will gladly leave it to language lawyers. :)

In our case, easy fix was to use -fno-strict-aliasing, as we did here:
https://svnweb.freebsd.org/changeset/base/313706
msg292013 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 07:26
Dimitry: thank you! Indeed, that looks like the very same issue. It's a shame to have to use -fno-strict-aliasing here, since that might prevent legitimate optimisations elsewhere in the code. But if that's what we have to do to maintain correctness with clang, then I guess we have little choice.
msg292014 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 07:46
I seem the same issues on OS X (10.9.5) when compiling with clang 4.0 from MacPorts. So this doesn't seem to be FreeBSD-specific.

Changing title and raising priority: failure for repr (and marshal, JSON, etc.) to roundtrip seems like a critical breakage to me.

mdickinson$ CC=/opt/local/bin/clang-mp-4.0 ./configure && make && ./python.exe -m test test_strtod
[... log from successful build omitted ...]
Run tests sequentially
0:00:00 [1/1] test_strtod
test test_strtod failed -- multiple errors occurred; run in verbose mode for details
test_strtod failed

1 test failed:
    test_strtod

Total duration: 65 ms
Tests result: FAILURE
msg292017 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 09:24
New changeset 28205b203a4742c40080b4a2b4b2dcd800716edc by Victor Stinner in branch 'master':
bpo-30104: Use -fno-strict-aliasing on clang (#1221)
https://github.com/python/cpython/commit/28205b203a4742c40080b4a2b4b2dcd800716edc
msg292019 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 09:31
Mark: "I seem the same issues on OS X (10.9.5) when compiling with clang 4.0 from MacPorts. So this doesn't seem to be FreeBSD-specific."

I confirm, I reproduced the bug on Linux using Clang 4.0.

Mark: "But if that's what we have to do to maintain correctness with clang, then I guess we have little choice."

I pushed my change adding -fno-strict-aliasing compiler option to clang. I created the issue #30124 to fix the aliasing issue in dtoa.c.

I didn't check yet if we have to backport the -fno-strict-aliasing change, maybe we can wait until the aliasing issue is fixed and backport this future fix?
msg292021 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2017-04-21 10:04
FWIW, I can confirm that adding `-fno-strict-aliasing` fixes all failing float-related tests on OS X.
msg292022 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 10:13
Oh... I only tested my configure patch on Linux using ./configure CC=clang, but the change has no effect on FreeBSD where $CC is /usr/bin/cc and so my code doesn't detect clang correctly.

I proposed https://github.com/python/cpython/pull/1233 which runs "cc --version" to check if it contains clang. I tested this change on FreeBSD: it works ;-) (it adds -fno-strict-aliasing)
msg292024 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 10:35
New changeset 35f3d240ee5f0958034bd500949b08764e36f4dc by Victor Stinner in branch 'master':
bpo-30104: configure now detects when cc is clang (#1233)
https://github.com/python/cpython/commit/35f3d240ee5f0958034bd500949b08764e36f4dc
msg292028 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-04-21 11:59
Ok, the FreeBSD CURRENT buildbot is repaired: "AMD64 FreeBSD CURRENT
Debug 3.x is complete: Success".
History
Date User Action Args
2017-04-21 11:59:01hayposetmessages: + msg292028
2017-04-21 10:35:26hayposetmessages: + msg292024
2017-04-21 10:13:29hayposetmessages: + msg292022
2017-04-21 10:12:11hayposetpull_requests: + pull_request1351
2017-04-21 10:07:03eric.smithsetnosy: + eric.smith
2017-04-21 10:04:39mark.dickinsonsetmessages: + msg292021
2017-04-21 09:31:41hayposetmessages: + msg292019
2017-04-21 09:24:36hayposetmessages: + msg292017
2017-04-21 08:49:41mark.dickinsonsetcomponents: + Build
2017-04-21 07:46:10mark.dickinsonsetpriority: normal -> critical

messages: + msg292014
title: Float rounding errors on AMD64 FreeBSD CURRENT Debug 3.x buildbot -> clang 4.0 miscompiles dtoa.c, giving broken float parsing and printing
2017-04-21 07:26:39mark.dickinsonsetmessages: + msg292013
2017-04-20 23:33:42emastesetnosy: + emaste
2017-04-20 22:17:40hayposetpull_requests: + pull_request1344
2017-04-20 21:46:32dimsetnosy: + dim
messages: + msg292001
2017-04-20 21:32:08hayposetmessages: + msg292000
2017-04-20 18:52:16mark.dickinsonsetmessages: + msg291994
2017-04-20 18:39:52mark.dickinsonsetmessages: + msg291992
2017-04-20 17:52:13hayposetfiles: + dtoa5.c

messages: + msg291987
2017-04-20 16:24:47hayposetmessages: + msg291985
2017-04-20 16:22:49hayposetfiles: + dtoa2.c

messages: + msg291984
2017-04-20 15:41:33hayposetmessages: + msg291983
2017-04-19 20:39:39hayposetnosy: + mark.dickinson
components: + Interpreter Core, Tests
2017-04-19 20:39:22haypocreate