classification
Title: aarch64 RHEL7 LTO + PGO 3.7: "make" hangs when running test_asyncio
Type:
Stage: resolved
Components: Tests
Versions: Python 3.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: vstinner
Priority: normal
Keywords: patch

Created on 2021-01-26 13:46 by vstinner, last changed 2021-01-27 10:20 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 24339 merged vstinner, 2021-01-26 13:51
Messages (5)
msg385712 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 13:46
test_asyncio hangs randomly on the aarch64 RHEL7 LTO + PGO 3.7 buildbot worker (Python 3.7). The symptom is a build that fails with:

   retry lost connection compile (retry)

Full error:

   remoteFailed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion.

Example of failed build:

   https://buildbot.python.org/all/#/builders/42/builds/4703

Configure command:

   ./configure --prefix '$(PWD)/target' --with-lto --enable-optimizations

Compile command:

   make -j10 all

End of the make output:
--------
make run_profile_task
make[1]: Entering directory `/home/buildbot/buildarea/3.7.cstratak-RHEL7-aarch64.lto-pgo/build'
./python -m test.regrtest --pgo || true
0:00:00 load avg: 1.07 Run tests sequentially
0:00:00 load avg: 1.07 [  1/416] test_grammar
(...)
0:01:40 load avg: 1.07 [ 24/416] test_asynchat
0:01:42 load avg: 1.07 [ 25/416] test_asyncio

remoteFailed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.error.ConnectionLost'>: Connection to the other side was lost in a non-clean fashion.
]
--------

I manually ran "./configure --prefix '$(PWD)/target' --with-lto --enable-optimizations && make -j10 all" twice, directly on the worker, and test_asyncio passed both times.
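
Because the hang is random, two clean runs do not prove much. A minimal sketch of a harness that repeats a command under a per-run timeout and counts the runs that hang; the helper name `count_hangs`, the run count and the example command in the comment are illustrative assumptions, not part of the buildbot setup:

```python
import subprocess
import sys


def count_hangs(cmd, runs, timeout):
    """Run cmd `runs` times; return how many runs exceeded `timeout` seconds."""
    hangs = 0
    for _ in range(runs):
        try:
            # subprocess.run() kills the child process and raises
            # TimeoutExpired when the timeout expires.
            subprocess.run(cmd, timeout=timeout,
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL)
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs


# Hypothetical usage on the worker (20 runs, 20 minutes each):
#   count_hangs(["./python", "-m", "test", "test_asyncio"], 20, 20 * 60)
```

A non-zero result would confirm that the hang can be reproduced outside the buildbot; a zero result after many runs would suggest the hang depends on the buildbot environment.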
msg385716 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 13:54
Python 3.7 no longer accepts bugfixes, only security fixes:
https://devguide.python.org/#status-of-python-branches

Maybe we should just remove this PGO worker (but keep other Python 3.7 workers).
msg385724 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-26 16:19
I created https://github.com/python/buildmaster-config/pull/228 to skip PGO builds on Python 3.7, since test_asyncio hangs on PGO builds.
msg385752 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-27 10:16
New changeset 6790005a9a30ae3eca69d1957fb072171643a366 by Victor Stinner in branch 'master':
bpo-43031: Set a timeout when running tests in PGO build (GH-24339)
https://github.com/python/cpython/commit/6790005a9a30ae3eca69d1957fb072171643a366
msg385753 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-01-27 10:20
I am closing the issue.

I added a timeout (20 min) in master when running tests for a PGO build.

> I created https://github.com/python/buildmaster-config/pull/228 to skip PGO builds on Python 3.7, since test_asyncio hangs on PGO builds.

Done. I removed all Python 3.7 PGO buildbot builders.

The test_asyncio issue on Python 3.7 looks like a race condition which is likely already fixed in Python 3.8. Even if it's a bug, we cannot fix asyncio bugs in Python 3.7 anymore.
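
To illustrate what a timeout on the PGO test run buys, here is a minimal sketch of a watchdog that turns a hang into a prompt failure with a traceback, instead of a lost buildbot connection. It uses `faulthandler.dump_traceback_later(..., exit=True)`, which appears to be the mechanism behind regrtest's `--timeout` option (an assumption, not confirmed in this issue). The 1 second value, the `CHILD` script and the `run_with_watchdog` helper are stand-ins, not the actual build configuration:

```python
import subprocess
import sys

# Child script: arm a watchdog, then simulate a test that hangs.
CHILD = """
import faulthandler, time
faulthandler.dump_traceback_later(1, exit=True)  # 1 s here; 20 min in the build
time.sleep(60)  # stand-in for the hanging test
"""


def run_with_watchdog():
    """Run the child script and return (exit code, stderr text)."""
    proc = subprocess.run([sys.executable, "-c", CHILD],
                          stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.returncode, proc.stderr
```

When the watchdog fires, the child dumps the traceback of every thread to stderr and exits with a non-zero code, so the build log shows where the test was stuck.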
History
Date User Action Args
2021-01-27 10:20:17 vstinner set status: open -> closed; resolution: fixed; messages: + msg385753; stage: patch review -> resolved
2021-01-27 10:16:30 vstinner set messages: + msg385752
2021-01-26 16:19:44 vstinner set messages: + msg385724
2021-01-26 13:54:04 vstinner set messages: + msg385716
2021-01-26 13:51:19 vstinner set keywords: + patch; stage: patch review; pull_requests: + pull_request23158
2021-01-26 13:46:19 vstinner create