classification
Title: test_io: test_daemon_threads_shutdown_stdout_deadlock() fails on x86 Windows7 3.x
Type: Stage: resolved
Components: Tests Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: db3l, eric.snow, pablogsal, vstinner
Priority: normal Keywords:

Created on 2019-03-04 08:20 by vstinner, last changed 2019-03-06 01:36 by vstinner. This issue is now closed.

Messages (6)
msg337073 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:20
I don't recall this failure previously, it looks like a regression.

https://buildbot.python.org/all/#/builders/58/builds/2001

======================================================================
FAIL: test_daemon_threads_shutdown_stdout_deadlock (test.test_io.CMiscIOTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_io.py", line 4138, in test_daemon_threads_shutdown_stdout_deadlock
    self.check_daemon_threads_shutdown_deadlock('stdout')
  File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_io.py", line 4129, in check_daemon_threads_shutdown_deadlock
    self.assertIn("Fatal Python error: could not acquire lock "
AssertionError: "Fatal Python error: could not acquire lock for <_io.BufferedWriter name='<stdout>'> at interpreter shutdown, possibly due to daemon threads" not found in ''

--

Copy of David Bolen's (the buildbot worker owner) message on bpo-33608, msg336897:

"""
I suspect changes for this issue may be creating test_io failures on my windows builders, most notably my x86 Windows 7 builder where test_io has been failing both attempts on almost all builds.  It fails with a lock failure during interpreter shutdown, and commit ef4ac967 appears the most plausible commit out of the set introduced at the first failing build on Feb 24.

See https://buildbot.python.org/all/#/builders/58/builds/1977 for the first failure.  test_io has failed both attempts on all but 3 of the subsequent 16 tests of the 3.x branch.

It might be worth noting that this builder is slow, so if there are timeouts involved or a race condition of any sort it might be triggering it more readily than other builders.  I do, however, see the same failures - albeit less frequently and not usually on both tries - on the Win8 and Win10 builders.

For what it's worth one other oddity is that while having test_multiprocessing* failures are relatively common on the Win7 builder during the first round of tests due to exceeding the timeout, they usually pass on the retry, but over this same time frame have begun failing - not due to timeout - even on the second attempt, which is unusual.  It might be coincidental but similar failures are also showing up sporadically on the Win8/Win10 builders where such failures are not common at all.
"""
msg337074 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:21
Oh, David also wrote:

"""
If I can help with testing or builder access or anything just let me know.  It appears that I can pretty reliably trigger the error through just the individual tests (test_daemon_threads_shutdown_std{out,err}_deadlock) in isolation, which is definitely less tedious than a full buildbot run to test a change.

BTW, while not directly related since it was only just merged, and I won't pretend to have any real understanding of the changes here, I do have a question about PR 12024 ... it appears HEAD_UNLOCK is used twice in _PyInterpreterState_LookUpID.  Shouldn't one of those be HEAD_LOCK?
"""

https://bugs.python.org/issue33608#msg336975
msg337109 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 13:11
It seems like the test started to fail in this build (at Feb 25):
https://buildbot.python.org/all/#/builders/58/builds/1977
msg337112 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 13:13
It seems like this issue is a regression caused by bpo-33608.
msg337219 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-05 16:14
This is resolved with gh-12159, no?
msg337263 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-06 01:36
> This is resolved with gh-12159, no?

I was waiting to see if buildbot workers feel better. It's the case, so I close the issue.
History
Date User Action Args
2019-03-06 01:36:39vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg337263

stage: resolved
2019-03-05 16:14:44eric.snowsetmessages: + msg337219
2019-03-04 13:13:10vstinnersetmessages: + msg337112
2019-03-04 13:11:01vstinnersetmessages: + msg337109
2019-03-04 08:21:36vstinnersetmessages: + msg337074
2019-03-04 08:20:44vstinnercreate