classification
Title: subprocess deadlock when read() is interrupted
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: jszakmeister, sbt
Priority: normal
Keywords: patch

Created on 2013-03-06 12:07 by jszakmeister, last changed 2013-03-06 14:35 by sbt. This issue is now closed.

Files
File name Uploaded Description
fix-subprocess-deadlock.patch jszakmeister, 2013-03-06 12:07
Messages (5)
msg183584 - Author: John Szakmeister (jszakmeister) * Date: 2013-03-06 12:07
I discovered this issue while trying to track down why our upcoming Nose 1.3.0 release was deadlocking under Ubuntu 12.04 with Python 3.3.  It turns out that the read() was being interrupted, leaving data in the subprocess's output buffers, which ultimately means the subprocess stays blocked.  Since the thread was exiting and the read was never retried, we were left in deadlock.

The attached patch fixes the issue.  It wraps the read operation in _readerthread() with _eintr_retry_call().
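
For readers unfamiliar with the retry idiom the patch relies on, here is a minimal sketch of the pattern.  It is not the code from the attached patch or from subprocess.py itself; the name eintr_retry_call (without the leading underscore) and the flaky_read() stand-in are assumptions made purely for illustration.

```python
def eintr_retry_call(func, *args):
    """Call func(*args), retrying for as long as a signal interrupts it.

    Hypothetical stand-in for the private helper named in the message above.
    """
    while True:
        try:
            return func(*args)
        except InterruptedError:
            # The call failed with EINTR: no data was consumed, so it is
            # safe to simply issue the same call again.
            continue


# Illustration only: a fake read that is "interrupted" twice before succeeding.
attempts = []

def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise InterruptedError()
    return b"data"

result = eintr_retry_call(flaky_read)
```

The wrapper only helps where the underlying call can actually fail with EINTR, which is why it matters which platform-specific code path the read lives in.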
msg183585 - Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-03-06 12:30
The change in your patch is in a Windows-only section -- a few lines before the chunk, you can see _winapi.GetExitCodeProcess().

Since read() on Windows never fails with EINTR there is no need for _eintr_retry_call().

If you are using Linux then there must be some other reason for your deadlock.
msg183586 - Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-03-06 12:31
BTW, threads are only used on Windows.  On Unix select() or poll() is used.
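
To illustrate the Unix approach mentioned above, here is a rough sketch of draining several pipes with select() so that no single pipe fills up and blocks the writer.  This is not the subprocess module's actual implementation; drain_pipes() and its chunk_size parameter are hypothetical names, and the example uses plain os.pipe() pairs instead of a real child process.  It assumes a Unix-like platform, since select() does not accept pipes on Windows.

```python
import os
import select


def drain_pipes(fds, chunk_size=4096):
    """Read every pipe in fds until EOF and return {fd: bytes}.

    Hypothetical helper: closes each descriptor once it reaches EOF.
    """
    buffers = {fd: bytearray() for fd in fds}
    open_fds = set(fds)
    while open_fds:
        try:
            readable, _, _ = select.select(list(open_fds), [], [])
        except InterruptedError:
            # Interrupted by a signal before anything became ready: retry.
            continue
        for fd in readable:
            data = os.read(fd, chunk_size)
            if data:
                buffers[fd] += data
            else:
                # An empty read means the writer closed its end of the pipe.
                open_fds.discard(fd)
                os.close(fd)
    return {fd: bytes(buf) for fd, buf in buffers.items()}


# Two pipes standing in for a child's stdout and stderr.
out_r, out_w = os.pipe()
err_r, err_w = os.pipe()
os.write(out_w, b"out")
os.write(err_w, b"err")
os.close(out_w)
os.close(err_w)
captured = drain_pipes([out_r, err_r])
```

Because a single loop services whichever pipe has data ready, neither pipe is left unread while the other is being waited on.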
msg183588 - Author: John Szakmeister (jszakmeister) * Date: 2013-03-06 12:35
Good grief... how did I miss that?  The problem has been flaky for me to induce.  I'll take a closer look at the correct section.  Thank you, Richard.
msg183593 - Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-03-06 14:35
I will close the issue then.

If you track the problem down to a bug in Python then you can open a new one.
History
Date User Action Args
2013-03-06 14:35:16 sbt set status: open -> closed; resolution: not a bug; messages: + msg183593; stage: resolved
2013-03-06 12:35:21 jszakmeister set messages: + msg183588
2013-03-06 12:31:37 sbt set messages: + msg183586
2013-03-06 12:30:24 sbt set nosy: + sbt; messages: + msg183585
2013-03-06 12:07:50 jszakmeister create