
classification
Title: test.libregrtest: Race condition in runtest_mp leads to hangs (never exits)
Type: behavior Stage: resolved
Components: Tests Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: colesbury, corona10, miss-islington, vstinner
Priority: normal Keywords: patch

Created on 2021-12-30 17:12 by colesbury, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 30470 merged colesbury, 2022-01-07 22:09
PR 30523 merged miss-islington, 2022-01-11 03:03
PR 30524 merged miss-islington, 2022-01-11 03:03
Messages (5)
msg409374 - (view) Author: Sam Gross (colesbury) * (Python triager) Date: 2021-12-30 17:12
runtest_mp.py has a race condition between checking worker.is_alive() and processing the queue that can lead to indefinite hangs.

The hang happens when all the results from the self.output queue have been processed but at least one of the workers hasn't finished exiting.

https://github.com/python/cpython/blob/8d7644fa64213207b8dc6f555cb8a02bfabeced2/Lib/test/libregrtest/runtest_mp.py#L394-L418

The main thread tries to get a result from the output queue, but the queue is empty and remains empty. Although the queue.get() operation eventually times out (after 30 seconds), the main thread does not re-check if all the workers have exited (!), but instead retries the queue.get() in the "while True" loop.

https://github.com/python/cpython/blob/8d7644fa64213207b8dc6f555cb8a02bfabeced2/Lib/test/libregrtest/runtest_mp.py#L415-L418
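The pattern described above can be sketched in a few lines. This is a minimal illustration, not the code from runtest_mp.py or from the eventual patch: the names collect_results and slow_exit_worker are hypothetical, threads stand in for the real worker objects, and the timeouts are shortened so the example finishes quickly. The key point is that every time queue.get() times out, the loop re-checks whether any worker is still alive rather than blindly retrying.

```python
import queue
import threading
import time


def collect_results(output, workers, timeout=0.05):
    """Hypothetical helper: gather results until all workers have exited."""
    results = []
    while True:
        try:
            results.append(output.get(timeout=timeout))
            continue
        except queue.Empty:
            pass
        # The get() timed out. Re-check worker liveness here; retrying
        # get() unconditionally is what would hang on an empty queue.
        if not any(w.is_alive() for w in workers):
            # A worker may have queued a result just before exiting,
            # so drain whatever is left without blocking.
            while True:
                try:
                    results.append(output.get_nowait())
                except queue.Empty:
                    return results


def slow_exit_worker(output, item, exit_delay):
    """Hypothetical worker: reports its result, then is slow to exit."""
    output.put(item)
    time.sleep(exit_delay)


output = queue.Queue()
workers = [
    threading.Thread(target=slow_exit_worker, args=(output, i, 0.2))
    for i in range(3)
]
for w in workers:
    w.start()

results = collect_results(output, workers)
for w in workers:
    w.join()
print(sorted(results))
```

Here the queue is already empty while the workers are still sleeping, which is the window in which the original loop could spin forever; the liveness check on each timeout lets the collector return once the last worker exits.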

To reproduce, apply the patch below, which introduces a small delay so that the hang triggers more reliably.

curl "https://gist.githubusercontent.com/colesbury/fe3769f43dfb724c86ecbb182b1f6749/raw/e29a4eaeebb8d5252cdd66f3f8a70f7bc5fa14e7/runtest_mp.diff" | patch -p1
./python -m test test_regrtest -m test_module_from_test_autotest -v
msg409740 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-01-05 09:19
Do you want to work on a fix?
msg410271 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2022-01-11 03:03
New changeset e13cdca0f5224ec4e23bdd04bb3120506964bc8b by Sam Gross in branch 'main':
bpo-46205: exit if no workers are alive in runtest_mp (GH-30470)
https://github.com/python/cpython/commit/e13cdca0f5224ec4e23bdd04bb3120506964bc8b
msg410272 - (view) Author: miss-islington (miss-islington) Date: 2022-01-11 03:29
New changeset e0ec08dc49f8e6f94a735bc9946ef7a3fd898a44 by Miss Islington (bot) in branch '3.10':
bpo-46205: exit if no workers are alive in runtest_mp (GH-30470)
https://github.com/python/cpython/commit/e0ec08dc49f8e6f94a735bc9946ef7a3fd898a44
msg410273 - (view) Author: miss-islington (miss-islington) Date: 2022-01-11 03:32
New changeset 690ed889c537c008a2c5f3e6c4f06c5b0c0afbc6 by Miss Islington (bot) in branch '3.9':
bpo-46205: exit if no workers are alive in runtest_mp (GH-30470)
https://github.com/python/cpython/commit/690ed889c537c008a2c5f3e6c4f06c5b0c0afbc6
History
Date User Action Args
2022-04-11 14:59:54 admin set github: 90363
2022-01-11 03:32:45 corona10 set status: open -> closed
    stage: patch review -> resolved
2022-01-11 03:32:39 corona10 set versions: - Python 3.8
2022-01-11 03:32:24 miss-islington set messages: + msg410273
2022-01-11 03:29:39 miss-islington set messages: + msg410272
2022-01-11 03:03:46 corona10 set messages: + msg410271
2022-01-11 03:03:36 miss-islington set pull_requests: + pull_request28725
2022-01-11 03:03:32 miss-islington set nosy: + miss-islington
    pull_requests: + pull_request28724
2022-01-09 05:26:08 corona10 set nosy: + corona10
2022-01-07 22:09:49 colesbury set keywords: + patch
    stage: patch review
    pull_requests: + pull_request28673
2022-01-05 09:19:50 vstinner set components: + Tests
    title: Race condition in runtest_mp leads to hangs (never exits) -> test.libregrtest: Race condition in runtest_mp leads to hangs (never exits)
2022-01-05 09:19:38 vstinner set messages: + msg409740
2021-12-30 17:17:54 colesbury set nosy: + vstinner
2021-12-30 17:13:13 colesbury set type: behavior
2021-12-30 17:12:53 colesbury create