Author josiahcarlson
Recipients Andrew.Boettcher, ajaksu2, akira, astrand, cvrebert, ericpruitt, eryksun, giampaolo.rodola, gvanrossum, janzert, josiahcarlson, martin.panter, ooooooooo, parameter, r.david.murray, rosslagerwall, sbt, techtonik, v+python, vstinner, yselivanov
Date 2015-03-26.23:39:10
Message-id <1427413151.91.0.156244400239.issue1191964@psf.upfronthosting.co.za>
In-reply-to
Content
Non-blocking IO returning None on "no data currently available" is new to me, simply because I don't recall it ever happening in Python 2.x with any socket IO that I dealt with, and socket IO is my primary source of non-blocking IO experience.

From my experience, reading from a socket would always return a string, unless the connection was closed/broken, at which point you got a bad file descriptor exception. Writing would return 0 if the buffer was full, or a bad file descriptor exception if the connection was closed/broken. The implementation of asyncore/asynchat at least until 3.0 shows this in practice, and I doubt this has changed in the intervening years since I updated those libraries for 3.0.

I chose to implement this around BrokenPipeError after reading Victor (haypo)'s comments from 2014-07-23 and 2014-07-24, in which he stated that in asyncio, when it comes to process management, .communicate() hides exceptions so that communication can finish, but direct reading and writing surface those exceptions.

I was attempting to mimic the asyncio interface simply because it made sense to me coming from the socket world, and because asyncio is where people are likely to go if the non-blocking subprocess support in subprocess is insufficient for their needs.

Note that I also initially resisted raising an exception in those methods, but Victor (haypo)'s comments changed my mind.

> Do you know a better method that would allow to distinguish between EOF (no future data, ever) and "would block" (future data possible) without calling process.poll()?

Better is subjective when it comes to API, but different? Yes, and it's already implemented in patch #8: a BrokenPipeError exception is raised when no more data is sendable/receivable, a 0 is returned when writing to a full buffer, and a b'' is returned when reading from an empty buffer. This API tries to match the asyncio.subprocess.Process() behavior when accessing stdin, stdout, and stderr directly.
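
To make the read side of that contract concrete, here is a minimal sketch, assuming a POSIX platform and Python 3.5+ (for os.set_blocking). The class name NonBlockingReader is hypothetical and is not the API from the patch; it only illustrates returning b'' when the pipe is open but empty, and raising BrokenPipeError once no more data can ever arrive.

```python
import os


class NonBlockingReader:
    """Hypothetical illustration only; not the interface from the patch."""

    def __init__(self, fd):
        self.fd = fd
        os.set_blocking(fd, False)  # POSIX; available since Python 3.5

    def read(self, bufsize=4096):
        try:
            data = os.read(self.fd, bufsize)
        except BlockingIOError:
            # The pipe is open but currently empty: data may arrive later.
            return b''
        if not data:
            # os.read() returns b'' only at EOF (every writer has closed).
            raise BrokenPipeError("no more data will ever arrive")
        return data
```

With this shape, a caller can tell "would block" (b'') from EOF (exception) without calling process.poll().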

> Returning None for non blocking I/O is standard in Python.

In some contexts, yes. In others, no. The asyncio.StreamReader() coroutines .read(), .readline(), and .readexactly() return bytes. Raw async or sync sockets don't return None on read. SSL-wrapped async or sync sockets don't return None on read. Asyncio low-level socket operations don't yield None on a (currently empty) read.


In the context of the subprocess module as it exists in 3.4 (and 3.5 as it is unpatched), the only object that returns None on read when no data is available, but where data *might* be available in the future, is an underlying posix pipe (on posix platforms) - which isn't generally used directly.
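
The difference between the two conventions can be observed directly. The following is a small sketch, assuming CPython 3.5+ on a POSIX platform; the variable names are illustrative. An unbuffered file object over a non-blocking pipe returns None on an empty read, while a non-blocking socket with nothing to read does not return None (it raises BlockingIOError instead).

```python
import os
import socket

# Raw (unbuffered) file object over a non-blocking POSIX pipe.
r, w = os.pipe()
os.set_blocking(r, False)
pipe_reader = open(r, 'rb', buffering=0)  # an io.FileIO instance
pipe_result = pipe_reader.read(1024)      # None: empty, but the writer is still open

# Non-blocking socket with nothing to read.
a, b = socket.socketpair()
a.setblocking(False)
try:
    a.recv(1024)
    sock_result = 'returned a value'
except BlockingIOError:
    sock_result = 'raised BlockingIOError'

# Clean up the descriptors used by the demonstration.
pipe_reader.close()
os.close(w)
a.close()
b.close()
```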

The purpose of this patch is to expose stdin, stdout, and stderr in a way that allows non-blocking reads and writes from the subprocess that also plays nicely with .communicate() as necessary. Directly exposing the pipes doesn't work due to API inconsistencies between Windows and posix, so we have to add a layer.

I would argue that this layer of non-blocking (Windows and posix) pipe abstraction has much more in common with how asyncio deals with process IO, or how sockets deal with IO, than it does with pipes in general (the vast majority of which are blocking), so I would contend that it should have a similar API (no returning None).

That said, considering that the expected use of non-blocking subprocess communication *does not* include using a multiplexer (select, poll, etc.), I have the sense that a few other higher-level methods might be warranted to ease use substantially. I would expect people to end up using these much more often than the lower-level direct read/write operations (different names might make more sense here):
.write_all(data, timeout=None)
.read_available(bufsize=-1) (and .read_stderr_available)
.read_exactly(bufsize, timeout=None) (and .read_stderr_exactly)
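
As a rough sketch of how two of these helpers might behave, here are free functions over non-blocking file descriptors, assuming a POSIX platform. Only the names write_all and read_available and their main parameters come from the list above; the poll_interval parameter, the 4096-byte chunk size, and the choice to return the number of bytes written are assumptions made for illustration, not part of any patch.

```python
import os
import time


def read_available(fd, bufsize=-1):
    """Return whatever can be read right now (possibly b''), up to bufsize bytes."""
    chunks = []
    remaining = bufsize
    while remaining != 0:
        want = 4096 if remaining < 0 else min(4096, remaining)
        try:
            chunk = os.read(fd, want)
        except BlockingIOError:
            break  # nothing more is available at the moment
        if not chunk:
            if not chunks:
                raise BrokenPipeError("EOF: no more data will arrive")
            break  # hand back what was read; EOF shows up on the next call
        chunks.append(chunk)
        if remaining > 0:
            remaining -= len(chunk)
    return b''.join(chunks)


def write_all(fd, data, timeout=None, poll_interval=0.01):
    """Write data, retrying while the pipe is full; return the bytes written."""
    deadline = None if timeout is None else time.monotonic() + timeout
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            sent += os.write(fd, view[sent:])
        except BlockingIOError:
            if deadline is not None and time.monotonic() >= deadline:
                break  # timed out; the caller can compare sent to len(data)
            time.sleep(poll_interval)  # no multiplexer: just wait and retry
    return sent
```

Sleeping between retries keeps the sketch free of select/poll, in line with the expected usage described above, at the cost of some latency.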
History
Date User Action Args
2015-03-26 23:39:12 josiahcarlson set recipients: + josiahcarlson, gvanrossum, astrand, parameter, vstinner, techtonik, giampaolo.rodola, ajaksu2, ooooooooo, v+python, r.david.murray, cvrebert, ericpruitt, akira, Andrew.Boettcher, rosslagerwall, sbt, martin.panter, janzert, yselivanov, eryksun
2015-03-26 23:39:11 josiahcarlson set messageid: <1427413151.91.0.156244400239.issue1191964@psf.upfronthosting.co.za>
2015-03-26 23:39:11 josiahcarlson link issue1191964 messages
2015-03-26 23:39:10 josiahcarlson create