Classification
Title: subprocess is not safe from deadlocks
Type: enhancement
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3, Python 2.7

Process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: subprocess: more general (non-buffering) communication (issue1260171)
Assigned To:
Nosy List: cvrebert, rosslagerwall, sbt, techtonik, weirdink13
Priority: normal
Keywords:

Created on 2012-05-21 20:33 by techtonik, last changed 2012-05-22 08:45 by rosslagerwall. This issue is now closed.

Messages (6)
msg161294 - Author: anatoly techtonik (techtonik) Date: 2012-05-21 20:33
There is no way to write a Python program that can process large or unlimited output coming from a subprocess stream without risking a deadlock.

http://docs.python.org/library/subprocess.html#subprocess.Popen.communicate
    "Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited."

http://docs.python.org/library/subprocess.html#subprocess.Popen.stdin
http://docs.python.org/library/subprocess.html#subprocess.Popen.stdout
http://docs.python.org/library/subprocess.html#subprocess.Popen.stderr
    "Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process."


So, what should I use?
msg161303 - Author: Richard Oudkerk (sbt) (Python committer) Date: 2012-05-21 22:24
I think the note for communicate() just means that you might get MemoryError (or some other exception) if the output is too big.  But I agree it is ambiguous.

communicate() uses select() on Unix and threads on Windows, so deadlocks should not be possible.

> So, what should I use?

Use communicate() (on a machine with infinite memory;-)
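
For output that is known to fit in memory, the communicate() approach can be sketched as follows. The child command is a hypothetical stand-in invented for this illustration; only Popen, PIPE and communicate() come from the subprocess module itself.

```python
import subprocess
import sys

# Hypothetical child that writes 200000 characters to each of stdout and stderr.
child = [
    sys.executable, "-c",
    "import sys; sys.stdout.write('o' * 200000); sys.stderr.write('e' * 200000)",
]

proc = subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# communicate() drains both pipes concurrently, so neither pipe buffer can
# fill up and block the child. The cost is that the complete output is
# returned as in-memory bytes objects.
out, err = proc.communicate()
```

Because everything is accumulated before communicate() returns, memory use grows with the size of the output, which is the limitation the documentation note warns about.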
msg161312 - Author: Daniel Swanson (weirdink13) Date: 2012-05-22 01:34
What sort of machine has infinite memory?
msg161322 - Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2012-05-22 05:16
Well if you're *certain* that the process is only using one stream, then you can just use read/write on that stream.
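
A minimal sketch of the single-stream case, assuming the child only writes to stdout. The helper name count_output_lines and the child command are hypothetical; the point is that each line is handled and discarded as it arrives.

```python
import subprocess
import sys

def count_output_lines(cmd):
    """Run cmd and stream its stdout line by line without storing it all."""
    # Only stdout is a pipe. stdin and stderr are inherited from the parent,
    # so there is no second pipe whose buffer could fill and block the child.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    count = 0
    for line in proc.stdout:
        count += 1  # process each line here instead of accumulating it
    proc.stdout.close()
    proc.wait()
    return count, proc.returncode

# Hypothetical child that prints 100000 lines.
child = [sys.executable, "-c", "for i in range(100000): print(i)"]
lines, status = count_output_lines(child)
```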

If not, it probably means you have to use either threads or select/poll.
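
One possible thread-based sketch is shown below: one reader thread per pipe, each draining its stream in fixed-size chunks. The names drain, run_streaming and the callbacks are hypothetical helpers, not part of subprocess. Threads are used here rather than select() because select() on Windows only works with sockets, not pipes.

```python
import subprocess
import sys
import threading

def drain(stream, handle_chunk, chunk_size=65536):
    """Read a pipe to EOF in fixed-size chunks, passing each to handle_chunk."""
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        handle_chunk(chunk)
    stream.close()

def run_streaming(cmd, on_stdout, on_stderr):
    """Run cmd, feeding stdout and stderr chunks to callbacks as they arrive."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Each pipe gets its own reader, so both are emptied continuously and
    # the child never blocks on a full pipe buffer.
    threads = [
        threading.Thread(target=drain, args=(proc.stdout, on_stdout)),
        threading.Thread(target=drain, args=(proc.stderr, on_stderr)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return proc.wait()

# Hypothetical usage: count bytes instead of keeping them in memory.
totals = {"out": 0, "err": 0}

def count_out(chunk):
    totals["out"] += len(chunk)

def count_err(chunk):
    totals["err"] += len(chunk)

child = [
    sys.executable, "-c",
    "import sys; sys.stdout.write('x' * 1000000); sys.stderr.write('y' * 1000000)",
]
status = run_streaming(child, count_out, count_err)
```

The callbacks run on the reader threads, so anything they share beyond simple per-stream state would need its own locking.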

This is a known issue with subprocess; there are a few proposals on the tracker about this. See issue1191964 for example.
msg161325 - Author: anatoly techtonik (techtonik) Date: 2012-05-22 05:43
The memory problem is more pressing on machines with SSDs, where swap is usually turned off and /tmp is located on a memory disk. Hitting the memory limit often means a hard reset.

My process is fairly generic and uses all streams, and I don't know how to use threads or polling in a cross-platform way.

issue1191964 looks interesting.
msg161337 - Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2012-05-22 08:45
See also issue1260171.

Closing as a duplicate of that.
History
Date                 User           Action  Args
2012-05-22 08:45:11  rosslagerwall  set     status: open -> closed; superseder: subprocess: more general (non-buffering) communication; messages: + msg161337; type: enhancement; resolution: duplicate; stage: resolved
2012-05-22 05:43:46  techtonik      set     messages: + msg161325
2012-05-22 05:16:54  rosslagerwall  set     nosy: + rosslagerwall; messages: + msg161322
2012-05-22 01:34:11  weirdink13     set     nosy: + weirdink13; messages: + msg161312
2012-05-22 01:17:21  cvrebert       set     nosy: + cvrebert
2012-05-21 22:24:08  sbt            set     nosy: + sbt; messages: + msg161303
2012-05-21 20:33:12  techtonik      create