Issue1606
Created on 2007-12-13 02:04 by christian.heimes, last changed 2008-08-04 01:04 by gregory.p.smith.
| Messages (9) | |||
|---|---|---|---|
| msg58514 - (view) | Author: Christian Heimes (christian.heimes) | Date: 2007-12-13 02:04 | |
The subprocess docs need a warning that code like
p = subprocess.Popen(..., stdout=PIPE)
p.wait()
p.stdout.read()
can block indefinitely if the program fills the stdout buffer. It needs an example of how to do it right, but I don't know the best way to solve the problem.
|
|||
| msg58516 - (view) | Author: Guido van Rossum (gvanrossum) | Date: 2007-12-13 02:13 | |
Why not simply reverse the wait() and read() calls? |
|||
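A minimal sketch of the ordering suggested above: read the pipe to EOF first, then call wait(). The child command and the `run_and_capture` helper are hypothetical stand-ins for illustration, not part of the subprocess module or of this report. The sketch covers only the single-pipe case.

```python
import subprocess
import sys

# Hypothetical child: writes about 1 MB, more than a typical OS pipe buffer holds.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20))"


def run_and_capture(args):
    """Read the child's stdout to EOF first, then reap the process."""
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    data = p.stdout.read()  # drains the pipe, so the child never blocks on write
    p.stdout.close()
    returncode = p.wait()   # safe now: the child has nothing left to write
    return returncode, data


returncode, data = run_and_capture([sys.executable, "-c", CHILD])
```

Calling p.wait() before p.stdout.read() in the same helper would likely hang with this child, because the child cannot exit until someone empties the pipe.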
| msg58518 - (view) | Author: Christian Heimes (christian.heimes) | Date: 2007-12-13 02:25 | |
Guido van Rossum wrote:
> Why not simply reverse the wait() and read() calls?
I don't think it is sufficient if the user uses more than one pipe. The subprocess.communicate() and _communicate() methods use threads (Windows) or select (Unix) when multiple pipes for stdin, stdout and stderr are involved.
The only safe way with multiple pipes is communicate([input]) unless the process returns *lots* of data. The subprocess module is buffering the data in memory if the user uses PIPE or STDIN.
The subprocess module also contains an XXX comment in the Unix version of _communicate():
# XXX Rewrite these to use non-blocking I/O on the
# file objects; they are no longer using C stdio!
Should I create another bug entry for it?
Christian
|
|||
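The multi-pipe concern can be illustrated with a small sketch. It assumes a hypothetical child that interleaves large writes to stdout and stderr. Reading one pipe to EOF before touching the other could stall here, so the sketch relies on communicate(), which drains both pipes and then waits for the process.

```python
import subprocess
import sys

# Hypothetical child: writes about 1 MB to stdout and 1 MB to stderr, interleaved.
CHILD = (
    "import sys\n"
    "for _ in range(1024):\n"
    "    sys.stdout.write('o' * 1024)\n"
    "    sys.stderr.write('e' * 1024)\n"
)

p = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# communicate() reads both pipes concurrently and then waits for exit.
out, err = p.communicate()
```

As noted in the message above, communicate() holds everything in memory, so this is only reasonable when the output is of manageable size.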
| msg58538 - (view) | Author: Guido van Rossum (gvanrossum) | Date: 2007-12-13 17:49 | |
> > Why not simply reverse the wait() and read() calls?
>
> I don't think it is sufficient if the user uses more than one pipe. The
> subprocess.communicate() and _communicate() methods are using threads
> (Windows) or select (Unix) when multiple pipes for stdin, stdout and
> stderr are involved.
That is done precisely to *avoid* blocking. I believe the only reason your example blocks is because you wait before reading -- you should do it the other way around, do all I/O first and *then* wait for the process to exit.
> The only safe way with multiple pipes is communicate([input]) unless the
> process returns *lots* of data. The subprocess module is buffering the
> data in memory if the user uses PIPE or STDIN.
I disagree. I don't believe it will block unless you make the mistake of waiting for the process first.
> The subprocess module also contains a XXX comment in the unix version of
> _communicate()
>
> # XXX Rewrite these to use non-blocking I/O on the
> # file objects; they are no longer using C stdio!
>
> Should I create another bug entry for it?
No, we have too many bug entries already.
|
|||
| msg58589 - (view) | Author: Christian Heimes (christian.heimes) | Date: 2007-12-13 21:14 | |
Guido van Rossum wrote:
> That is done precisely to *avoid* blocking. I believe the only reason
> your example blocks is because you wait before reading -- you should
> do it the other way around, do all I/O first and *then* wait for the
> process to exit.
I believe so, too. The subprocess docs aren't warning about the problem.
I've seen a fair share of programmers who fall for the trap - including
me a few weeks ago.
> I disagree. I don't believe it will block unless you make the mistake
> of waiting for the process first.
Consider yet another example:
>>> p = Popen(someprogram, stdin=PIPE, stdout=PIPE)
>>> p.stdin.write(10MB of data)
someprogram processes the incoming data in small blocks. Let's say 1KB
blocks and 1MB stdin and stdout buffers. It reads 1KB from stdin and writes 1KB
to stdout until the stdout buffer is full. The program stops and waits
for Python to free the stdout buffer. However, the Python code is
still writing data to the limited stdin buffer.
>>> data = p.stdout.read()
Is the scenario realistic?
I tried it.
*** This works although it is slow
$ cat img_0948.jpg | convert - png:- >test
*** This example does not work. The test file is created but no data is
written to the file.
p = subprocess.Popen(["convert", "-", "png:-"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
img = open("img_0948.jpg", "rb")
p.stdin.write(img.read())
with open("test", "wb") as f:
    f.write(p.stdout.read())
*** It works with communicate:
with open("test", "wb") as f:
    out, err = p.communicate(img.read())
    f.write(out)
Christian
|
|||
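For cases where communicate() is not suitable, one possible workaround is to feed stdin from a helper thread while the main thread drains stdout. The sketch below is an assumption-laden illustration, not what the subprocess module does internally. The child is a hypothetical stand-in for "someprogram" that copies stdin to stdout in 1KB blocks, and `pump` is a helper invented for this example.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for "someprogram": copies stdin to stdout in 1 KB blocks.
CHILD = (
    "import sys\n"
    "for block in iter(lambda: sys.stdin.buffer.read(1024), b''):\n"
    "    sys.stdout.buffer.write(block)\n"
)


def pump(proc, payload):
    """Write the payload to the child's stdin, then close it to signal EOF."""
    proc.stdin.write(payload)
    proc.stdin.close()


payload = b"x" * (10 * 1024 * 1024)  # 10 MB, as in the scenario above
p = subprocess.Popen(
    [sys.executable, "-c", CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
writer = threading.Thread(target=pump, args=(p, payload))
writer.start()
result = p.stdout.read()  # main thread drains stdout while the writer feeds stdin
writer.join()
p.stdout.close()
returncode = p.wait()     # all I/O is finished, so waiting cannot deadlock
```

Writing the whole payload from the main thread before reading, as in the failing example above, would likely stall once both pipe buffers fill up.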
| msg58591 - (view) | Author: Raghuram Devarakonda (draghuram) | Date: 2007-12-13 21:32 | |
Look at #1256 for similar report. A doc change was suggested there as well. |
|||
| msg58594 - (view) | Author: Guido van Rossum (gvanrossum) | Date: 2007-12-13 22:03 | |
> I believe so, too. The subprocess docs aren't warning about the problem.
> I've seen a fair share of programmers who fall for the trap - including
> me a few weeks ago.
Yes, the docs should definitely address this.
> Consider yet another example
>
> >>> p = Popen(someprogram, stdin=PIPE, stdout=PIPE)
> >>> p.stdin.write(10MB of data)
>
> someprogram processes the incoming data in small blocks. Let's say 1KB
> and 1MB stdin and stdout buffer. It reads 1KB from stdin and writes 1KB
> to stdout until the stdout buffer is full. The program stops and waits
> for Python to free the stdout buffer. However the python code is
> still writing data to the limited stdin buffer.
Hm. I thought this would be handled using threads or select but it doesn't seem to be quite the case. communicate() does the right thing but if you use p.stdin.write() directly you may indeed hang.
|
|||
| msg69396 - (view) | Author: Gregory P. Smith (gregory.p.smith) | Date: 2008-07-07 20:47 | |
I'll come up with something for the documentation on this. |
|||
| msg70674 - (view) | Author: Gregory P. Smith (gregory.p.smith) | Date: 2008-08-04 01:04 | |
See the documentation update in trunk r65469. It adds warnings about both common pipe-related pitfalls discussed in this bug. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2008-08-04 01:04:48 | gregory.p.smith | set | status: open -> closed resolution: accepted -> fixed messages: + msg70674 |
| 2008-07-07 20:47:39 | gregory.p.smith | set | nosy: + gregory.p.smith messages: + msg69396 resolution: accepted assignee: gregory.p.smith type: behavior |
| 2008-01-06 14:12:46 | christian.heimes | link | issue1256 superseder |
| 2007-12-13 22:03:36 | gvanrossum | set | messages: + msg58594 |
| 2007-12-13 21:32:13 | draghuram | set | nosy: + draghuram messages: + msg58591 |
| 2007-12-13 21:14:17 | christian.heimes | set | messages: + msg58589 |
| 2007-12-13 17:49:53 | gvanrossum | set | messages: + msg58538 |
| 2007-12-13 02:25:02 | christian.heimes | set | messages: + msg58518 |
| 2007-12-13 02:13:04 | gvanrossum | set | nosy: + gvanrossum messages: + msg58516 |
| 2007-12-13 02:04:38 | christian.heimes | create | |