classification
Title: AIX: makexp_aix, parallel build (failures) and ld WARNINGS
Type: behavior Stage: patch review
Components: Build Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, Michael.Felt, kadler
Priority: normal Keywords: patch

Created on 2020-04-28 16:20 by Michael.Felt, last changed 2020-07-05 15:28 by Michael.Felt.

Files
File name Uploaded Description Edit
python3-makexp_aix.patch kadler, 2020-06-15 17:04
Pull Requests
URL Status Linked Edit
PR 19759 open Michael.Felt, 2020-04-28 16:29
Messages (6)
msg367541 - (view) Author: Michael Felt (Michael.Felt) * Date: 2020-04-28 16:19
Currently, on AIX, whenever the -j option is passed to make there are many WARNINGS from the loader (ld) re: duplicate symbols.

While it is not possible to eliminate these warnings completely - as some are not related to the Python build, but external (3rd party) packaging - MOST of these warnings can be eliminated by ensuring that the export file creation completes before additional steps try to use it.

By adding a small test to see if the export file is in the process of being made - and waiting for that to finish - the messages "go away".
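
A minimal sketch of the "wait until the export file is finished" idea, assuming a lock directory created with mkdir (which is atomic). This is not the code from the PR. The helper name with_export_lock, the ".lock" suffix and the LOCK_TRIES variable are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Hypothetical sketch: serialize writers of an export file with a lock directory.
# mkdir either creates the directory or fails, so only one process gets the lock.

with_export_lock() {
    # $1 = export file; remaining arguments = command to run while holding the lock
    exp_file=$1
    shift
    lock_dir="$exp_file.lock"
    max_tries=${LOCK_TRIES:-60}   # assumption: give up after roughly 60 seconds
    tries=0
    until mkdir "$lock_dir" 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "timed out waiting for $lock_dir" >&2
            return 1
        fi
        sleep 1
    done
    "$@"                          # run the writer while the lock is held
    status=$?
    rmdir "$lock_dir"             # release the lock
    return "$status"
}
```

A caller would wrap the export file creation, for example `with_export_lock Modules/python.exp ./Modules/makexp_aix Modules/python.exp . libpython3.9d.a`. A stale lock directory left behind by a killed build would have to be removed by hand, which is one reason this approach is "not perfect".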

The PR that is being proposed only affects AIX (a script named makexp_aix). The script has not been modified in 22 years - so I guess the -j option is something that showed up after 1998 :)

I know it is not perfect - but it removes a tremendous amount of noise - most of the time.

Michael

p.s. requesting backport to 3.8 so all buildbots benefit.
msg371258 - (view) Author: Michael Felt (Michael.Felt) * Date: 2020-06-11 08:18
Specifically, makexp_aix - from 1998-1999 - did not consider parallelization.

make -j2 is sufficient to create the following issue - that frequently leads to a failed compile/build.

./Modules/makexp_aix Modules/python.exp . libpython3.9d.a;  gcc -pthread     -Wl,-bE:Modules/python.exp -lld -o python Programs/python.o libpython3.9d.a -lintl -ldl  -lm   -lm 
./Modules/makexp_aix Modules/python.exp . libpython3.9d.a;  gcc -pthread     -Wl,-bE:Modules/python.exp -lld -o Programs/_testembed Programs/_testembed.o libpython3.9d.a -lintl -ldl  -lm   -lm 
ld: 0711-418 ERROR: Import or export file Modules/python.exp at line 2:
	A symbol name may only be followed by an export attribute
	or an address. The line is being ignored.
ld: 0711-415 WARNING: Symbol PyAST_Check is already exported.
ld: 0711-415 WARNING: Symbol PyAST_Compile is already exported.
ld: 0711-415 WARNING: Symbol PyAST_CompileEx is already exported.
ld: 0711-415 WARNING: Symbol PyAST_CompileObject is already exported.
...
Over 4000 lines of warnings later:
ld: 0711-415 WARNING: Symbol _Py_write is already exported.
ld: 0711-415 WARNING: Symbol _Py_write_noraise is already exported.
collect2: error: ld returned 8 exit status
Makefile:598: recipe for target 'python' failed
make: *** [python] Error 1
program finished with exit code 2

Explanation: makexp_aix is running in parallel - and writing to python.exp in parallel.
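
To illustrate how two writers can leave duplicate symbols in the export file, here is a small replay, assuming the script first truncates the file to write its header and then appends the symbol list in a separate step. The two "writers" are run step by step in a fixed order so the result is reproducible. The helper names and the two symbols are stand-ins, not the real makexp_aix code.

```shell
#!/bin/sh
# Hypothetical replay of two makexp_aix-style writers whose steps interleave.

exp_file=$(mktemp)

# Step 1 of a writer: truncate the file and write the header line.
write_header() { echo "#!." > "$1"; }

# Step 2 of a writer: append the symbol list (stand-in for the nm output).
append_symbols() { printf '%s\n' PyAST_Check PyAST_Compile >> "$1"; }

write_header "$exp_file"     # writer A, step 1
write_header "$exp_file"     # writer B, step 1 (truncates what A wrote)
append_symbols "$exp_file"   # writer A, step 2
append_symbols "$exp_file"   # writer B, step 2 (same symbols a second time)

# Count the lines that occur more than once in the finished file.
dup_count=$(sort "$exp_file" | uniq -d | wc -l | tr -d ' ')
echo "duplicated lines: $dup_count"
```

With a real build the interleaving is not this tidy, so lines can also end up mixed or partially written, which fits the 0711-418 ERROR reported for line 2.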

The patch/PR "tames" this - and, hopefully, the multiple failures per day on the AIX bots will cease.

p.s. needed in 3.8, 3.9 and the new master (3.10)
msg371553 - (view) Author: Kevin (kadler) * Date: 2020-06-15 14:34
This seems to be a duplicate of https://bugs.python.org/issue19521

The PR for that one seems a little less hacky since it uses make rules to prevent duplication instead of lock files.
msg371573 - (view) Author: Michael Felt (Michael.Felt) * Date: 2020-06-15 16:50
Yes, it is less hacky - and something to pursue later - as a better solution. Even the idea of perhaps no longer needing makexp_aix and/or ld_so_aix and python.exp is a much better solution.

However, the goal of this PR is to have something now - that removes the
pain (e.g., false bot failures and bot report storage impact) asap.

Many thanks for looking - and commenting!
msg371576 - (view) Author: Kevin (kadler) * Date: 2020-06-15 17:04
FYI, here's a patch we've been using with our builds on PASE (an AIX compatibility layer on the IBM i OS). It runs all the echos and nm in a sub-shell so that all the output appears as a continuous stream instead of 3 separate open/write/close events.

There's still a race condition, but since it no longer appends, the last one in will win instead of the mixed result there is now. AFAICT, it gets created much earlier than it gets used so nothing _should_ be reading it while the writers are racing. At least it works for us on PASE with -j16 when building Python 3.6.
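
A minimal sketch of the sub-shell approach described above, not the attached patch itself. The function names, the comment line and the list_symbols stand-in (used in place of the real nm pipeline so the sketch runs anywhere) are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: produce the whole export file as one stream.
# All lines come from a single sub-shell, and the file is opened once
# with ">" (truncate), never with ">>" (append).

list_symbols() {
    # Stand-in for the real nm pipeline.
    printf '%s\n' PyAST_Compile PyAST_Check
}

write_export_file() {
    # $1 = export file, $2 = string placed after "#!" on the first line
    (
        echo "#!$2"
        echo "* generated by a hypothetical helper"
        list_symbols | sort -u
    ) > "$1"
}
```

Running `write_export_file Modules/python.exp .` twice in a row leaves one complete copy of the content. Two concurrent runs still race, but each of them writes the same complete stream from the start of the file rather than adding to whatever is already there.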
msg373033 - (view) Author: Michael Felt (Michael.Felt) * Date: 2020-07-05 15:28
Thanks Kevin.

I took your patch (added your name to blurb as well).

The only difference was to remove Qsystem (or something) from the pathnames.
History
Date User Action Args
2020-07-05 15:28:06 Michael.Felt set messages: + msg373033; versions: + Python 3.7
2020-06-15 17:04:20 kadler set files: + python3-makexp_aix.patch; messages: + msg371576
2020-06-15 16:50:41 Michael.Felt set messages: + msg371573
2020-06-15 14:34:19 kadler set nosy: + kadler; messages: + msg371553
2020-06-11 14:28:55 BTaskaya set nosy: + BTaskaya
2020-06-11 08:19:46 Michael.Felt set title: AIX: parallel build and ld WARNINGS -> AIX: makexp_aix, parallel build (failures) and ld WARNINGS
2020-06-11 08:18:55 Michael.Felt set messages: + msg371258; versions: + Python 3.10
2020-04-28 16:29:50 Michael.Felt set keywords: + patch; stage: patch review; pull_requests: + pull_request19081
2020-04-28 16:20:00 Michael.Felt create