classification
Title: subprocess.Popen.communicate can lose data from output/error streams when broken input pipe occures
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.2, Python 2.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder:
Assigned To: Nosy List: Yaniv.Aknin, amaury.forgeotdarc, dwalczak, giampaolo.rodola, mcrute, rosslagerwall
Priority: normal Keywords: needs review, patch

Created on 2009-07-10 12:36 by dwalczak, last changed 2011-04-01 06:33 by rosslagerwall. This issue is now closed.

Files
File name Uploaded Description Edit
subprocess.Popen.communicate_problem_reproduction.zip dwalczak, 2009-07-10 12:36 Problem reproduction code
broken_pipe.patch amaury.forgeotdarc, 2009-07-13 08:28 review
broken_pipe-2.patch amaury.forgeotdarc, 2009-07-13 12:19 review
broken_pipe.patch mcrute, 2010-01-22 02:55 review
Messages (7)
msg90391 - (view) Author: Dariusz Walczak (dwalczak) Date: 2009-07-10 12:36
It's possible to lose data piped through standard output and/or error
streams when large ammounts of data are transfered.

Reproduction:
1) Process A spawns process B with all standard I/O pipes and transfers
large ammount of data to it (100kB in my sample).
2) Process B exits very early (doesn't process all stdin data) and puts
large amount of data into its stderr or stdout streams.
3) IOError/OSError exception if errno variable set to errno.EPIPE is
raised on process A side.

Bug:
When the exception is catched Popen object's stdout and stderr members
may not contain data put into stderr/stdout streams by process B.

Note:
Described behaviour can be observed in Python 2.6.1 on FreeBSD 7.2 as
well as on Windows XP (Pro x64, SP2, up to date).

Real life example of such early end of process B:
openssl smime -verify -CApath /some/dir with large, unsigned email.

I attach simple reproduction code (execute sender.py script).
msg90471 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-07-13 08:28
Testing on Debian with latest trunk:
- the proposed example code works very well (I get all data).
- I added "subprocess._has_poll = False", and some (sometimes all) data
is lost. It seems that select() will return stdin even if it is not
writable.

On Windows of course, communicate() uses a blocking write, and always fail.

The proposed patch ignore the errors when EPIPE is raised, and simply
stops writing.
msg90478 - (view) Author: Dariusz Walczak (dwalczak) Date: 2009-07-13 12:12
Your patch works for me with one exception:

On FreeBSD (python 2.6.1) OSError exception is raised by os.write
instead of IOError exception (raised by self.stdin.write on Windows).
The error code is same on both platforms.
msg90479 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-07-13 12:19
Right, file.write raises IOError when os.write raises OSError :-(
Updated the patch.
msg98130 - (view) Author: Mike Crute (mcrute) Date: 2010-01-22 02:55
Bump. Updated the patch against trunk.
msg102756 - (view) Author: Yaniv Aknin (Yaniv.Aknin) Date: 2010-04-09 22:26
I've updated the sample code to run with Python 3 and tested on Ubuntu 9.10 with and without setting subprocess._has_poll = False.

As expected, when using poll() the sample runs correctly.
When not using poll() the sample breaks (sometimes no data, sometimes some data; oddly, it seems the 'some data' variant appears only when I strace the process, probably because it slows things down).

I suggest this be committed against 3.2 as well, but I don't know how these things are usually done; should I produce a separate patch which ports the latest patch currently posted in this thead against py3k? Or will this be done by whomever applies the patch?
msg132729 - (view) Author: Ross Lagerwall (rosslagerwall) (Python committer) Date: 2011-04-01 06:33
Closing as duplicate of #10963. See #10963 for more discussion.
History
Date User Action Args
2011-04-01 06:33:15rosslagerwallsetstatus: open -> closed

nosy: + rosslagerwall
messages: + msg132729

resolution: duplicate
2010-06-09 18:55:08giampaolo.rodolasetnosy: + giampaolo.rodola
2010-04-09 22:26:38Yaniv.Akninsetnosy: + Yaniv.Aknin

messages: + msg102756
versions: + Python 3.2
2010-01-22 02:55:54mcrutesetfiles: + broken_pipe.patch
nosy: + mcrute
messages: + msg98130

2009-07-13 12:19:15amaury.forgeotdarcsetfiles: + broken_pipe-2.patch

messages: + msg90479
2009-07-13 12:12:15dwalczaksetmessages: + msg90478
2009-07-13 08:28:56amaury.forgeotdarcsetfiles: + broken_pipe.patch

nosy: + amaury.forgeotdarc
messages: + msg90471

keywords: + needs review, patch
stage: patch review
2009-07-10 12:36:04dwalczakcreate