classification
Title: Python fails to compile in the Fedora Stable LTO buildbots
Type: compile error Stage: resolved
Components: Build Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: cstratak, hroncok, pablogsal, petr.viktorin, vstinner
Priority: normal Keywords:

Created on 2020-10-26 22:34 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (8)
msg379696 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-26 22:34
I have been trying to diagnose this failure:

https://buildbot.python.org/all/#/builders/271/builds/710/steps/3/logs/stdio

it happens on these buildbots:

x86_64 fedora stable
ppc64le fedora stable (so 32 now)

It seems that CPython cannot be compiled with --with-lto regardless of the version:

https://buildbot.python.org/all/#/builders/336/builds/2145
https://buildbot.python.org/all/#/builders/426/builds/641
https://buildbot.python.org/all/#/builders/294/builds/986

This seems to indicate that something has changed in these buildbots somehow. Maybe the gcc installation is broken?

In my investigation, it seems that Python/compile.o is miscompiled. For example:

FEDORA BUILDBOT with LTO:

[buildbot@python-builder2-rawhide cpython]$ nm Python/compile.o  | grep _Py_Mangle
In function ‘assemble_lnotab’,
    inlined from ‘assemble_emit’ at Python/compile.c:5696:25,
    inlined from ‘assemble’ at Python/compile.c:6038:18:
Python/compile.c:5650:19: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
 5650 |         *lnotab++ = k;
      |                   ^
         U _Py_Mangle


MY ARCH LINUX SYSTEM:

❯ nm Python/compile.o  | grep _Py_Mangle
00000000 T _Py_Mangle

It seems that _Py_Mangle is not defined in the object file. Is this a gcc bug? I have not been able to diagnose exactly where this problem comes from. The gcc version on the buildbot seems to be "10.2.1", but I can correctly build CPython with LTO on my Arch Linux machine with gcc 10.2.0.

Given that these are stable buildbots, could you investigate what is going on or report this to the gcc folks at Red Hat/Fedora?

----

More interesting data:

Compiling with -O0 does not have a problem, but doing it with -O3 does.

With -O0:

[buildbot@python-builder2-rawhide cpython]$ nm Python/compile.o  | grep _Py_Mangle
00000000 T _Py_Mangle

With -O3:

[buildbot@python-builder2-rawhide cpython]$ nm Python/compile.o  | grep _Py_Mangle
         U _Py_Mangle
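
The check above can also be scripted. This is a minimal sketch, assuming only that `nm` prints symbol lines of the form `[ADDRESS] TYPE NAME`; the helper name `symbol_status` is hypothetical and is not part of CPython or binutils.

```python
def symbol_status(nm_output, symbol):
    """Return the nm type letter for `symbol`, or None if it is not listed.

    Lines that do not look like nm symbol lines (for example compiler
    warnings interleaved with the output) are skipped.
    """
    for line in nm_output.splitlines():
        fields = line.split()
        # nm prints "ADDRESS TYPE NAME" for defined symbols and
        # "TYPE NAME" for undefined ones.
        if (len(fields) in (2, 3)
                and fields[-1] == symbol
                and len(fields[-2]) == 1):
            return fields[-2]
    return None


# Sample lines copied from the nm output in this message.
fedora_o3 = "         U _Py_Mangle\n"
arch_linux = "00000000 T _Py_Mangle\n"

print(symbol_status(fedora_o3, "_Py_Mangle"))   # referenced but not defined
print(symbol_status(arch_linux, "_Py_Mangle"))  # defined in the text section
```

A result of `U` means the object file only references the symbol, while `T` means it defines it in the text section.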
msg379701 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 00:02
> I have been trying to diagnose this failure:
> https://buildbot.python.org/all/#/builders/271/builds/710/steps/3/logs/stdio

This is the "AMD64 Fedora Stable LTO 3.x" worker. The latest successful build was build 684, finished 6 days ago. test.pythoninfo of build 684:

CC.version: gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1)
platform.libc_ver: glibc 2.31
platform.platform: Linux-5.8.14-200.fc32.x86_64-x86_64-with-glibc2.31
sys.version: 3.10.0a1+ (heads/master:c0f22fb8b3, Oct 21 2020, 00:02:36)  [GCC 10.2.1 20200723 (Red Hat 10.2.1-1)]
sysconfig[CCSHARED]: -fPIC
sysconfig[CC]: gcc -pthread
sysconfig[CFLAGS]: -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall
sysconfig[OPT]: -DNDEBUG -g -fwrapv -O3 -Wall
sysconfig[PY_CFLAGS]: -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall
sysconfig[PY_CFLAGS_NODIST]: -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal
sysconfig[PY_CORE_LDFLAGS]: -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g
sysconfig[PY_LDFLAGS_NODIST]: -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g
sysconfig[PY_STDMODULE_CFLAGS]: -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden -I./Include/internal -I. -I./Include


It started to fail on October 21 (build 685). On 2020-10-21, the GCC package was upgraded from version 10.2.1-1 to 10.2.1-5 according to /var/log/dnf.log:

2020-10-21T06:58:01-0400 SUBDEBUG drpm: spawned 2243855: /usr/bin/applydeltarpm -a x86_64 /var/cache/dnf/updates-b3cb4614b6495970/packages/gcc-10.2.1-1.fc32_10.2.1-5.fc32.x86_64.drpm /var/cache/dnf/updates-b3cb4614b6495970/packages/gcc-10.2.1-5.fc32.x86_64.rpm

Package changelog between 10.2.1-1 and 10.2.1-5:
---
* Mon Oct 05 2020 Jakub Jelinek <jakub@redhat.com> 10.2.1-5
- update from releases/gcc-10 branch
  - PRs bootstrap/97163, bootstrap/97183, c++/96994, c++/97145, c++/97195,
	fortran/93423, fortran/95614, fortran/96041, gcov-profile/64636,
	gcov-profile/96913, gcov-profile/97069, gcov-profile/97193,
	libstdc++/94160, libstdc++/94681, libstdc++/96803, libstdc++/97101,
	libstdc++/97167, middle-end/95464, middle-end/97054, middle-end/97073,
	preprocessor/96935, target/71233, target/96683, target/96795,
	target/96827, target/97166, target/97184, target/97231, target/97247,
	tree-optimization/96979, tree-optimization/97053

* Wed Sep 16 2020 Jakub Jelinek <jakub@redhat.com> 10.2.1-4
- update from releases/gcc-10 branch
  - PRs bootstrap/96203, c++/95164, c++/96862, c++/96901, d/96157, d/96924,
	debug/93865, debug/94235, debug/96729, fortran/94690, fortran/95109,
	fortran/95398, fortran/95882, fortran/96859, libstdc++/71960,
	libstdc++/92978, libstdc++/96766, libstdc++/96851, lto/94311,
	middle-end/87256, middle-end/96369, target/85830, target/94538,
	target/96357, target/96551, target/96574, target/96744, target/96808,
	target/97028, tree-optimization/88240, tree-optimization/96349,
	tree-optimization/96370, tree-optimization/96514,
	tree-optimization/96522, tree-optimization/96579,
	tree-optimization/96597, tree-optimization/96820,
	tree-optimization/96854, tree-optimization/97043
- fix up ARM target attribute/pragma handling (#1875814, PR target/96939)
- don't ICE on sp clobbers with -mincoming-stack-boundary=2 on ia32
  (#1862029, PR target/97032)

* Wed Aug 26 2020 Jakub Jelinek <jakub@redhat.com> 10.2.1-3
- update from releases/gcc-10 branch
  - PRs c++/95428, c++/96082, c++/96106, c++/96164, c++/96199, c++/96497,
	c/96545, c/96549, c/96571, d/96250, d/96254, d/96301, debug/96354,
	fortran/93553, fortran/96312, fortran/96486, ipa/95320, ipa/96291,
	ipa/96482, libstdc++/89760, libstdc++/95749, libstdc++/96303,
	libstdc++/96484, libstdc++/96718, lto/95362, lto/95548,
	middle-end/96426, middle-end/96459, target/93897, target/95450,
	target/96191, target/96243, target/96446, target/96493, target/96506,
	target/96525, target/96530, target/96536, target/96562, target/96682,
	tree-optimization/96483, tree-optimization/96535,
	tree-optimization/96722, tree-optimization/96730,
	tree-optimization/96758
- mangle some further symbols needed for debug info during early dwarf
  (#1862029, PR debug/96690)
- during %check perform tests whether annobin is usable with the newly built
  compiler or whether it might need to be rebuilt
- disable graphite for ELN

* Tue Aug 04 2020 Jakub Jelinek <jakub@redhat.com> 10.2.1-2
- update from releases/gcc-10 branch
  - PRs c++/95591, c++/95599, c++/95823, c++/95824, c++/95895, c/96377,
	d/96140, fortran/89574, fortran/93567, fortran/93592, fortran/95585,
	fortran/95612, fortran/95980, fortran/96018, fortran/96086,
	fortran/96220, fortran/96319, lto/45375, middle-end/96335,
	target/95435, target/96190, target/96236, target/96260, target/96402,
	tree-optimization/96058
- emit debug info for C/C++ external function declarations used in the TU
  (PR debug/96383)
- discard SHN_UNDEF global symbols from LTO debuginfo (PR lto/96385)
- strip also -flto=auto from optflags
---

Copied from: https://koji.fedoraproject.org/koji/buildinfo?buildID=1626341


Currently, the buildbot worker uses gcc-10.2.1-6.fc32.x86_64.
msg379703 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 00:16
I reproduced the bug on Fedora 32 with gcc-10.2.1-6.fc32.x86_64 (new), but I failed to reproduce it with gcc-10.2.1-1.fc32.x86_64 (old, before I upgraded GCC). So it's a regression in the gcc-10.2.1-6.fc32.x86_64 package.

The package contains multiple downstream patches:

   https://src.fedoraproject.org/rpms/gcc/tree/f32

Commands used to reproduce the issue:

   export MAKEFLAGS=-j10
   ./configure --with-lto 
   make

Extract of my Makefile:

OPT=		-DNDEBUG -g -fwrapv -O3 -Wall
BASECFLAGS=	 -Wno-unused-result -Wsign-compare
CONFIGURE_CFLAGS_NODIST= -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g -std=c99 -Wextra -Wno-unused-result -Wno-unused-parameter -Wno-missing-field-initializers -Werror=implicit-function-declaration -fvisibility=hidden
CONFIGURE_LDFLAGS_NODIST= -flto -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -g
msg379704 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 00:28
With LTO, compile.o requires an undefined _Py_Mangle symbol:

$ gcc -pthread -c -DNDEBUG -fwrapv -O3 -std=c99 -fvisibility=hidden -flto -I./Include/internal  -I. -I./Include -DPy_BUILD_CORE -o Python/compile.o Python/compile.c; nm Python/compile.o | grep _Py_Mangle

         U _Py_Mangle


Without LTO, compile.o defines _Py_Mangle symbol:

$ gcc -pthread -c -DNDEBUG -fwrapv -O3 -std=c99 -fvisibility=hidden -I./Include/internal  -I. -I./Include -DPy_BUILD_CORE -o Python/compile.o Python/compile.c; nm Python/compile.o | grep _Py_Mangle

0000000000003c20 T _Py_Mangle
msg379705 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-27 00:29
> With LTO, compile.o requires an undefined _Py_Mangle symbol:


Yeah, now try to compile with LTO and CFLAGS='-O0'
msg379707 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 01:00
> Yeah, now try to compile with LTO and CFLAGS='-O0'

Using LTO:

* "-O1 -finline-functions -finline-small-functions -fpartial-inlining" reproduces the issue.
* "-O1" does not reproduce the issue.
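
To narrow down flag sets like the ones above, the compile commands can be generated rather than typed by hand. This is only a sketch: the base flags are copied from the LTO reproduction command earlier in this thread, the helper name `build_command` is hypothetical, and the script only prints the commands so that each one can be run and followed by `nm Python/compile.o | grep _Py_Mangle`.

```python
import shlex

# Base flags copied from the LTO reproduction command in this thread,
# with the optimization level left out so it can be varied.
BASE = ("gcc -pthread -c -DNDEBUG -fwrapv -std=c99 -fvisibility=hidden "
        "-flto -I./Include/internal -I. -I./Include -DPy_BUILD_CORE")

# Optimization flag sets mentioned in this thread.
CANDIDATES = [
    "-O0",
    "-O1",
    "-O1 -finline-functions -finline-small-functions -fpartial-inlining",
    "-O3",
]


def build_command(opt_flags, source="Python/compile.c",
                  output="Python/compile.o"):
    """Return the gcc argument list for one optimization flag set."""
    return shlex.split(BASE) + shlex.split(opt_flags) + ["-o", output, source]


for flags in CANDIDATES:
    # Print each command; run it from the CPython source tree, then
    # inspect the object file with nm.
    print(shlex.join(build_command(flags)))
```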
msg379708 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 01:09
I reported the issue to Fedora: https://bugzilla.redhat.com/show_bug.cgi?id=1891657
msg384478 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-06 09:35
Buildbots are back to green. It seems like the bug is gone with "gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)".
History
Date                 User       Action  Args
2022-04-11 14:59:37  admin      set     github: 86330
2021-01-06 09:35:43  vstinner   set     status: open -> closed; resolution: fixed; messages: + msg384478; stage: needs patch -> resolved
2020-10-27 01:09:01  vstinner   set     messages: + msg379708
2020-10-27 01:00:46  vstinner   set     messages: + msg379707
2020-10-27 00:29:47  pablogsal  set     messages: + msg379705
2020-10-27 00:28:09  vstinner   set     messages: + msg379704
2020-10-27 00:16:04  vstinner   set     messages: + msg379703
2020-10-27 00:02:37  vstinner   set     messages: + msg379701
2020-10-26 22:34:47  pablogsal  create