Issue 35752: test_buffer fails on ppc64le: memoryview pack_single() is miscompiled

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/79933

classification

Title:	test_buffer fails on ppc64le: memoryview pack_single() is miscompiled
Type:		Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.8, Python 3.7

process

Status:	closed	Resolution:	third party
Dependencies:		Superseder:
Assigned To:	skrah	Nosy List:	David.Edelsohn, cstratak, fweimer, mark.dickinson, serhiy.storchaka, skrah, vstinner
Priority:	normal	Keywords:	patch

Created on 2019-01-16 16:51 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
ppc64_float32_bug.py	vstinner, 2019-01-16 16:51
pack_single.patch	vstinner, 2019-01-16 16:51

Pull Requests
URL	Status	Linked	Edit
PR 11593	closed	vstinner, 2019-01-17 14:27
PR 11593	closed	vstinner, 2019-01-17 14:27
PR 11593	closed	vstinner, 2019-01-17 14:27

Messages (21)
msg333774 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-16 16:51
The bug was first reported on Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1540995 ====================================================================== FAIL: test_memoryview_struct_module (test.test_buffer.TestBufferProtocol) ---------------------------------------------------------------------- Traceback (most recent call last): File "/builddir/build/BUILD/Python-3.6.4/Lib/test/test_buffer.py", line 2540, in test_memoryview_struct_module self.assertEqual(m[1], nd[1]) AssertionError: -21.099998474121094 != -21.100000381469727 The problem is the conversion from C double (64-bit float) and C float (32-bit float). There are 2 implementations: * Objects/memoryobject.c: pack_single() and unpack_single() * Modules/_struct.c: nu_float() and np_float() Attached ppc64_float32_bug.py is the simplified test case to trigger the bug. The result depends on the compiler optimization level: * gcc -O0: -21.100000381469727 == -21.100000381469727, OK * gcc -O1: -21.099998474121094 != -21.100000381469727, ERROR * (I guess that higher optimization level also trigger the bug) The problem is that the pack_single() function is "miscompiled" (or "too optimized"). Adding "volatile" to PACK_SINGLE() prevents the unsafe compiler optimization and fix the issue for me: try attached pack_single_volatile.patch. === -O1 assembler code with the bug === PACK_SINGLE(ptr, d, float); r30 = ptr (gdb) p $vs63.v2_double $17 = {0, -21.100000000000001} => 0x00000000100a1178 <pack_single+1988>: stxsspx vs63,0,r30 (gdb) p /x (ptr)@4 $10 = {0xcc, 0xcc, 0xa8, 0xc1} The first byte is 0xcc: WRONG. === -O1 assembler code without the bug (volatile) === r30 = ptr (gdb) p $f31 $1 = -21.100000000000001 => 0x00000000100a11e4 <pack_single+2096>: frsp f31,f31 (gdb) p $f31 $2 = -21.100000381469727 0x00000000100a11e8 <pack_single+2100>: stfs f31,152(r1) 0x00000000100a11ec <pack_single+2104>: lwz r9,152(r1) (gdb) p /x $r9 $8 = 0xc1a8cccd 0x00000000100a11f0 <pack_single+2108>: stw r9,0(r30) (gdb) p /x (ptr)@4 $9 = {0xcd, 0xcc, 0xa8, 0xc1} 0x00000000100a11f4 <pack_single+2112>: li r3,0 0x00000000100a11f8 <pack_single+2116>: lfd f31,216(r1) 0x00000000100a11fc <pack_single+2120>: ld r30,200(r1) The first byte is 0xcd: GOOD.
msg333777 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-16 17:12
This is a performance sensitive function, so I prefer not to add volatile. MSVC also had a bug with that function, but only in PGO mode. Microsoft has fixed the issue long ago. Do newer gcc versions have this issue? I'm fine with wrapping the entire macro in an #ifdef for __ppc64__. Who is using this platform? IBM Power9?
msg333778 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-16 17:18
> This is a performance sensitive function, so I prefer not to add volatile. I don't suggest to use volatile. I'm just pointing to the function causing the bug and I said that volatile worked for me :-) I made my tests on: * Red Hat Enterprise Linux release 8.0 Beta (Ootpa) * gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3) * Linux kernel 4.18.0-56.el8.ppc64le * uname -p: ppc64le
msg333779 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-16 17:36
Does it also work with -fno-inline, just for ppc64? There are other places in the sources where an inlined memcpy() could be miscompiled.
msg333782 - (view)	Author: Charalampos Stratakis (cstratak) *	Date: 2019-01-16 18:33
Possibly relevant: https://fedoraproject.org/wiki/Changes/PPC64LE_Float128_Transition#Detailed_Description But the work is not complete.
msg333847 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-17 10:40
If gcc-8.0.1-0.14.fc28.ppc64le miscompiles memcpy(), perhaps the upstream priority in https://bugzilla.redhat.com/show_bug.cgi?id=1540995 should be "release blocker". CC David Edelsohn, whose PPC64 buildbot (presumably big endian) works.
msg333862 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-17 12:39
Extract of memoryview pack_signal() function: #define PACK_SINGLE(ptr, src, type) \ do { \ type x; \ x = (type)src; \ memcpy(ptr, (char )&x, sizeof x); \ } while (0) / floats / case 'f': case 'd': d = PyFloat_AsDouble(item); if (d == -1.0 && PyErr_Occurred()) goto err_occurred; if (fmt[0] == 'f') { PACK_SINGLE(ptr, d, float); // ##### BUG IS HERE ### } else { PACK_SINGLE(ptr, d, double); } break; Pseudo-code: double d = PyFloat_AsDouble(item); float x; x = (float)d; memcpy(ptr, (char )&x, sizeo x);
msg333866 - (view)	Author: Florian Weimer (fweimer)	Date: 2019-01-17 13:27
I believe this is a GCC bug, and filed <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88892>.
msg333876 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-17 14:36
> If gcc-8.0.1-0.14.fc28.ppc64le miscompiles memcpy(), perhaps the upstream priority in https://bugzilla.redhat.com/show_bug.cgi?id=1540995 should be "release blocker". With the bug, memoryview doesn't round properly float: rounds to zero rather than rounding to nearest. It's not a critical crash or critical security vulnerability :-) Anyway, I wrote a fix: PR 11593. My fix is to workaround the GCC compiler bug using volatile but only on ppc64. My change only impacts the 'f' type on ppc64 when compiled by GCC. Other architectures, compilers and types are unaffected and so have no impact on performances.
msg333877 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-17 14:44
The bug occurs when GCC emits a single instruction (stxsspx) for cast + memcpy. I understand that the struct module is not affected because there is a condition branch (if) between the cast ("float x = ...") and the memcpy(). static int np_float(char p, PyObject v, const formatdef f) { float x = (float)PyFloat_AsDouble(v); if (x == -1 && PyErr_Occurred()) { PyErr_SetString(StructError, "required argument is not a float"); return -1; } memcpy(p, (char )&x, sizeof x); return 0; }
msg333878 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-17 14:59
Okay, so it's not a severe bug. That leaves us with the question what to do about it. As I said above, other call sites could be affected, too: _struct.c: static int np_float(char p, PyObject v, const formatdef f) { float x = (float)PyFloat_AsDouble(v); if (x == -1 && PyErr_Occurred()) { PyErr_SetString(StructError, "required argument is not a float"); return -1; } memcpy(p, (char )&x, sizeof x); return 0; } Even if this one isn't (I can't test) I think I'd prefer a more global fix for the affected gcc version. Does any switch like -fno-inline or something else work? If so, we could detect the issue in configure.ac.
msg333879 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-17 15:00
Our mails apparently crossed. :-)
msg333880 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-17 15:07
Or to put it differently, should we put a specific fix for a single gcc version into memoryview.c? For all we know, a fix will be pushed to Fedora 28 in the next 4 months.
msg333881 - (view)	Author: Florian Weimer (fweimer)	Date: 2019-01-17 15:15
We don't know yet if the GCC bug is specific to POWER. That depends on what causes it. Other targets my have double-to-float conversion instructions which hard-code the wrong rounding mode.
msg333882 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-17 15:20
A colleague working on clang asked me to test clang: no, clang doesn't have the bug. test_buffer pass as expected with clang -O3. Machine code of the cast + memcpy: (gdb) p $f31 $1 = -21.100000000000001 (gdb) disassemble $pc,$pc+40 => 0x0000000010078fbc <pack_single+380>: frsp f0,f31 0x0000000010078fc0 <pack_single+384>: li r3,0 0x0000000010078fc4 <pack_single+388>: stfsx f0,0,r29 (gdb) stepi 0x0000000010078fc0 1824 PACK_SINGLE(ptr, d, float); (gdb) p $f0 $3 = -21.100000381469727 (gdb) stepi 0x0000000010078fc4 1824 PACK_SINGLE(ptr, d, float); (gdb) stepi 0x0000000010078fc8 1824 PACK_SINGLE(ptr, d, float); (gdb) p /x (*ptr)@4 $8 = {0xcd, 0xcc, 0xa8, 0xc1} The first byte is 0xcd: GOOD. Florian explained in the GCC bug report that "frsp" is needed and clang emits it. "This is incorrect because stfs rounds to zero. An frsp instruction is missing before the stfs (and would be emitted without the memcpy)." https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88892#c0
msg333884 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-17 15:26
"Or to put it differently, should we put a specific fix for a single gcc version into memoryview.c? For all we know, a fix will be pushed to Fedora 28 in the next 4 months." Right now, test_buffer is skipped in the python3.spec of Fedora (recipe to build the package which compiles Python and then runs the Python test suite) on pp64le because of this bug. With my packager hat: I'm done, the bug has been identified and is already worked around in the package. With my Python hat: I'm not sure if I'm comfortable to have a known rounding issue on memoryview because of a compiler bug. I'm not sure that the GCC bug will be fixed quickly, nor if all operating systems will quickly update GCC or backport the fix. Well, I wrote a fix, but I don't feel able to decide if the workaround is worth it or not. For me, the main risk is to forget to remove the workaround once a new GCC version will be released (in N months). I wouldn't bet that such GCC bug will be fixed and released in less than 4 months.
msg333886 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-01-17 15:44
> For me, the main risk is to forget to remove the workaround once a new GCC version will be released (in N months). This is why I suggested using the reproducer in configure.ac, setting something like HAVE_GCC_MEMCPY_ROUNDING_BUG and then either a) wrap the entire PACK_SINGLE() macro in an #ifdef or b) preferably add a global flag like -fno-inline or any other flag that prevents the issue to CFLAGS. b) under the condition that any such flag exists. I may do a) in the weekend, which also addresses Florian Weimer's observation that it is not known whether the issue is limited to POWER.
msg333887 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2019-01-17 16:31
Would it help if define PACK_SINGLE as below? #define PACK_SINGLE(ptr, src, type) \ do { \ union { \ type value; \ unsigned char raw[sizeof(type)]; \ } x; \ x.value = (type)src; \ memcpy(ptr, x.raw, sizeof(type)); \ } while (0)
msg333888 - (view)	Author: Florian Weimer (fweimer)	Date: 2019-01-17 17:08
No, GCC will optimize away the union.
msg334582 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-01-30 17:41
The GCC bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88892 has been fixed in the development branch ("trunk"): https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=268083 I requested a fix for GCC 8.2.
msg338383 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-03-19 16:41
The root issue was a bug in GCC which has been fixed. Fedora Rawhide got the new fixed GCC and so I close the issue.

History
Date	User	Action	Args
2022-04-11 14:59:10	admin	set	github: 79933
2019-03-19 16:41:30	vstinner	set	status: open -> closed resolution: third party messages: + msg338383 stage: patch review -> resolved
2019-01-30 17:41:11	vstinner	set	messages: + msg334582
2019-01-18 20:54:12	skrah	set	assignee: skrah
2019-01-17 17:08:37	fweimer	set	messages: + msg333888
2019-01-17 16:31:37	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg333887
2019-01-17 15:44:57	skrah	set	messages: + msg333886
2019-01-17 15:26:23	vstinner	set	messages: + msg333884
2019-01-17 15:20:47	vstinner	set	messages: + msg333882
2019-01-17 15:15:49	fweimer	set	messages: + msg333881
2019-01-17 15:07:55	skrah	set	messages: + msg333880
2019-01-17 15:00:28	skrah	set	messages: + msg333879
2019-01-17 14:59:12	skrah	set	messages: + msg333878
2019-01-17 14:44:14	vstinner	set	messages: + msg333877
2019-01-17 14:36:57	vstinner	set	messages: + msg333876
2019-01-17 14:28:14	vstinner	set	stage: patch review pull_requests: + pull_request11293
2019-01-17 14:28:04	vstinner	set	stage: (no value) pull_requests: + pull_request11292
2019-01-17 14:27:54	vstinner	set	stage: (no value) pull_requests: + pull_request11291
2019-01-17 13:27:44	fweimer	set	messages: + msg333866
2019-01-17 12:43:38	fweimer	set	nosy: + fweimer
2019-01-17 12:39:01	vstinner	set	messages: + msg333862
2019-01-17 10:40:52	skrah	set	nosy: + David.Edelsohn messages: + msg333847
2019-01-16 18:33:58	cstratak	set	messages: + msg333782
2019-01-16 17:36:33	skrah	set	messages: + msg333779
2019-01-16 17:18:12	vstinner	set	messages: + msg333778
2019-01-16 17:12:19	skrah	set	messages: + msg333777
2019-01-16 17:06:30	cstratak	set	nosy: + cstratak
2019-01-16 16:51:34	vstinner	set	files: + pack_single.patch keywords: + patch
2019-01-16 16:51:18	vstinner	create