classification
Title: run() - unified high-level interface for subprocess
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: Jeff.Hammel, barry, berker.peksag, cvrebert, ethan.furman, gregory.p.smith, martin.panter, ncoghlan, python-dev, r.david.murray, takluyver
Priority: normal Keywords: patch

Created on 2015-01-28 22:13 by takluyver, last changed 2016-05-18 05:14 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description
subprocess_run.patch takluyver, 2015-01-28 22:13 review
subprocess_run2.patch takluyver, 2015-01-28 23:35 review
subprocess_run3.patch takluyver, 2015-02-02 23:19 review
subprocess_run4.patch takluyver, 2015-02-10 02:55 review
process.py Jeff.Hammel, 2015-02-10 03:49
subprocess_run5.patch takluyver, 2015-03-20 00:31 review
subprocess_run6a.patch takluyver, 2015-04-14 17:22 review
Messages (29)
msg234918 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-01-28 22:13
This follows on from the python-ideas thread starting here: https://mail.python.org/pipermail/python-ideas/2015-January/031479.html

subprocess gains:

- A CompletedProcess class representing a process that has finished, with attributes args, returncode, stdout and stderr
- A run() function which runs a process to completion and returns a CompletedProcess instance, aiming to unify the functionality of call, check_call and check_output
- CalledProcessError and TimeoutExpired now have a stderr attribute, to avoid throwing away potentially relevant information.
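
A minimal sketch of the interface described in the list above, as it shipped in Python 3.5. The child command (a Python one-liner launched through sys.executable) is an illustrative assumption and not part of the patch.

```python
import subprocess
import sys

# Run a child process to completion and capture its stdout.
# The child command here is only an illustration.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
)

# CompletedProcess carries args, returncode, stdout and stderr.
print(completed.args)
print(completed.returncode)
print(completed.stdout)   # bytes, because no text mode was requested
print(completed.stderr)   # None, because stderr was not captured
```

Nothing is captured unless a pipe is requested explicitly, which matches the default discussed in question 1 below.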

Things I'm not sure about:

1. Should run() capture stdout/stderr by default? I opted not to, for consistency with Popen and with shells.
2. I gave run() a check_returncode parameter, but it feels quite a long name for a parameter. Is 'check' clear enough to use as the parameter name?
3. Popen has an 'args' attribute, while CalledProcessError and TimeoutExpired have 'cmd'. CompletedProcess sits between those cases, so which name should it use? For now, it's args.
msg234922 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-01-28 22:46
Another question: With this patch, CalledProcessError and TimeoutExpired exceptions now have attributes called output and stderr. It would seem less surprising for output to be called stdout, but we can't break existing code that relies on the output attribute.

Using properties, either stdout or output could be made an alias for the other, so both names work. Is this desirable?
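
As a rough illustration of the alias idea, the sketch below triggers a timeout and compares the two attribute names on the resulting subprocess.TimeoutExpired. The slow child command and the 0.5 second timeout are assumptions chosen for the example.

```python
import subprocess
import sys

# Hypothetical slow child: sleeps far longer than the timeout allows.
slow_child = [sys.executable, "-c", "import time; time.sleep(30)"]

try:
    subprocess.run(slow_child, stdout=subprocess.PIPE, timeout=0.5)
    timed_out = None
except subprocess.TimeoutExpired as exc:
    timed_out = exc

# Both names refer to the same captured data (which may be None or empty
# here, since the child printed nothing before it was killed).
if timed_out is not None:
    print(timed_out.cmd, timed_out.timeout)
    print(timed_out.output is timed_out.stdout)
```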
msg234923 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2015-01-28 23:03
A 1) Opting not to capture by default is good.  Let people explicitly request that.

A 2) "check" seems like a reasonable parameter name for the "should i raise if rc != 0" bool.  I don't have any other good bikeshed name suggestions.

A 3) Calling it args the same way Popen does is consistent.  That the attribute on the exceptions is 'cmd' is a bit of an old wart but seems reasonable.  Neither the name 'args' or 'cmd' is actually good for any use in subprocess as it is already an unfortunately multi-typed parameter.  It can either be a string or it can be a sequence of strings.  The documentation is not clear about what type(s) 'cmd' may be.

A Another) Now that they gain a stderr attribute, having a corresponding stdout one would make sense.  Implement it as a property and document it with a versionadded 3.5 as usual.
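
A short sketch of the behaviour agreed on above: check=True raises on a non-zero exit status, and the exception exposes stdout (alongside the older output name) and stderr. The failing child command is a made-up example.

```python
import subprocess
import sys

# Hypothetical failing child: writes to both streams, then exits with status 3.
failing_child = [
    sys.executable, "-c",
    "import sys; print('out'); print('err', file=sys.stderr); sys.exit(3)",
]

try:
    subprocess.run(failing_child, stdout=subprocess.PIPE,
                   stderr=subprocess.PIPE, check=True)
    error = None
except subprocess.CalledProcessError as exc:
    error = exc

if error is not None:
    print(error.returncode)
    print(error.stdout == error.output)  # stdout is an alias of output
    print(error.stderr)
```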
msg234924 - (view) Author: Ethan Furman (ethan.furman) * (Python committer) Date: 2015-01-28 23:30
I haven't checked the code, but do check_output and friends combine stdout and stderr when output=PIPE?
msg234925 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-01-28 23:35
Updated patch following Gregory's suggestions:

- The check_returncode parameter is now called check. The method on CompletedProcess is still check_returncode, though.
- Clarified the docs about args
- CalledProcessError and TimeoutExpired gain a stdout property as an alias of output

Ethan: to combine stdout and stderr in check_output, you need to pass stderr=subprocess.STDOUT - it doesn't assume you want that.

I did consider having a simplified interface so you could pass e.g. capture='combine', or capture='stdout', but I don't think the brevity is worth the loss of flexibility.
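
For reference, combining the two streams with the explicit arguments looks roughly like this. The child command is an assumption made for the example.

```python
import subprocess
import sys

# Hypothetical child that writes one line to each stream.
noisy_child = [
    sys.executable, "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]

# Redirect the child's stderr into the same pipe as its stdout.
combined = subprocess.run(noisy_child, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)

print(combined.stdout)   # contains both lines; their order depends on buffering
print(combined.stderr)   # None: nothing was captured separately
```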
msg234927 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2015-01-28 23:39
Ethan: check_output combines them when stderr=subprocess.STDOUT is passed (https://docs.python.org/3.5/library/subprocess.html#subprocess.STDOUT). Never pass stdout=PIPE or stderr=PIPE to call() or the check*() functions, as that will lead to a deadlock when a pipe buffer fills up. check_output() won't even allow you to pass in stdout, as it needs to set that to PIPE internally, but you could still do the wrong thing and pass stderr=PIPE without it warning you.

The documentation tells people not to do this. I don't recall why we haven't made it warn or raise when someone tries (but that should be a separate issue/change).
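
A sketch of the deadlock concern, assuming a child that writes more data than a typical pipe buffer holds. The risky call is left commented out; the safe variant drains the pipe with communicate().

```python
import subprocess
import sys

# Hypothetical chatty child: writes about 1 MiB, more than a typical pipe buffer.
chatty_child = [sys.executable, "-c",
                "import sys; sys.stdout.write('x' * (1 << 20))"]

# Risky pattern (not executed here): call() only waits for exit and never
# reads the pipe, so the child can block forever once the buffer fills.
#   subprocess.call(chatty_child, stdout=subprocess.PIPE)

# Safe pattern: communicate() drains the pipe while waiting for exit.
with subprocess.Popen(chatty_child, stdout=subprocess.PIPE) as proc:
    data, _ = proc.communicate()

print(len(data), proc.returncode)
```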
msg235093 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-01-31 09:46
Maybe you don’t want to touch the implementation of the “older high-level API” for fear of subtly breaking something, but for clarification, and perhaps documentation, would the old functions now be equivalent to this?

from subprocess import PIPE, run

def call(*popenargs, **kwargs):
    # Verify PIPE not in (stdout, stderr) if needed
    return run(*popenargs, **kwargs).returncode

def check_call(*popenargs, **kwargs):
    # Verify PIPE not in (stdout, stderr) if needed
    run(*popenargs, check=True, **kwargs)

def check_output(*popenargs, **kwargs):
    # Verify stderr != PIPE if needed
    return run(*popenargs, check=True, stdout=PIPE, **kwargs)

If they are largely equivalent, perhaps simplify the documentation of them in terms of run(), and move them closer to the run() documentation.

Is it worth making the CalledProcessError exception a subclass of CompletedProcess? They seem to be basically storing the same information.
msg235133 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-01-31 21:57
Yep, they are pretty much equivalent to those, except:

- check_call has a 'return 0' if it succeeds
- add '.stdout' to the end of the expression for check_output

I'll work on documenting the trio in those terms.
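
With those two adjustments applied, the equivalences could be sketched as below. The *_via_run names are hypothetical, chosen so that the real subprocess functions are not shadowed; this is not the code from the patch.

```python
import subprocess
import sys

# Hypothetical names, chosen so the real subprocess functions are not shadowed.
def call_via_run(*popenargs, **kwargs):
    return subprocess.run(*popenargs, **kwargs).returncode

def check_call_via_run(*popenargs, **kwargs):
    subprocess.run(*popenargs, check=True, **kwargs)
    return 0  # check_call returns 0 on success

def check_output_via_run(*popenargs, **kwargs):
    # stdout is forced to PIPE, so callers must not pass their own stdout.
    return subprocess.run(*popenargs, check=True,
                          stdout=subprocess.PIPE, **kwargs).stdout

# Illustrative child command.
ok_child = [sys.executable, "-c", "print('ok')"]
print(call_via_run(ok_child, stdout=subprocess.DEVNULL))
print(check_call_via_run(ok_child, stdout=subprocess.DEVNULL))
print(check_output_via_run(ok_child))
```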

If people want, some/all of the trio could also be implemented on top of run(). check_output() would be the most likely candidate for this, since I copied that code to create run(). I'd probably leave call and check_call as separate implementations to avoid subtle bugs, though.

Sharing inheritance between CalledProcessError and CompletedProcess: That would mean that either CompletedProcess is an exception class, even though it's not used as such, or CalledProcessError uses multiple inheritance. I think duplicating a few attributes is preferable to having to think about multiple inheritance, especially since the names aren't all the same (cmd vs args, output vs stdout).
msg235134 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-01-31 22:34
It’s okay to leave them as independent classes, if you don’t want multiple inheritance. I was just putting the idea out there. It is a similar pattern to the HTTPError exception and HTTPResponse return value for urlopen().
msg235300 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-02-02 23:19
Third version of the patch (subprocess_run3):

- Simplifies the documentation of the trio (call, check_call, check_output) to describe them in terms of the equivalent run() call.
- Remove a warning about using PIPE with check_output - I believe this was already incorrect, since check_output uses .communicate() internally, it shouldn't have deadlock issues.
- Replace the implementation of check_output() with a call to run().

I didn't reimplement call or check_call - as previously discussed, they are more different from the code in run(), so subtly breaking things is more possible. They are also simpler.
msg235653 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-02-10 01:36
Would anyone like to do further review of this - or commit it ;-) ?

I don't think anyone has objected to the concept since I brought it up on python-ideas, but if anyone is -1, please say so.
msg235654 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-10 01:50
Have you seen the code review comments on the Rietveld, <https://bugs.python.org/review/23342>? (Maybe check spam emails.) Many of the comments from the earlier patches still stand. In particular, I would like to see the “input” default value addressed, at least for the new run() function, if not the old check_output() function.
msg235656 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-02-10 01:56
Aha, I hadn't seen any of those. They had indeed been caught by the spam filter. I'll look over them now.
msg235659 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-02-10 02:55
Fourth version of patch, responding to review comments on Rietveld. The major changes are:

- Eliminated the corner case when passing input=None to run() - now it's a real default parameter. Added a shim in check_output to keep it behaving the old way in case anything is relying on it, but I didn't document it.
- The docstring of run() was shortened quite a bit by removing the examples.
- Added a whatsnew entry

I also made various minor fixes - thanks to everyone who found them.
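
A small sketch of the input parameter on run(), assuming a child that echoes its stdin in upper case. The child command is invented for the example.

```python
import subprocess
import sys

# Hypothetical child that upper-cases whatever arrives on stdin.
upper_child = [sys.executable, "-c",
               "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# input= feeds the bytes to the child's stdin; run() sets up the pipe itself.
result = subprocess.run(upper_child, input=b"hello", stdout=subprocess.PIPE)
print(result.stdout)

# input=None (the default) means "do not touch stdin"; the child inherits it.
```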
msg235663 - (view) Author: Jeff Hammel (Jeff.Hammel) Date: 2015-02-10 03:49
A few observations in passing.  I beg your pardon for not commenting after a more in depth study of the issue, but as someone that's written and managed several subprocess module front-ends, my general observations seem applicable.

subprocess needs easier and more robust ways of managing input and output streams.

subprocess should have easier ways of managing input: file streams are fine, but plain strings would also be nice.

For string commands, shell should always be true. For list/tuple commands, shell should be false. In fact, you'll get an error if you don't ensure this. Instead, just let the type of what is passed determine how it is executed. (For Windows, I have no idea. I'm lucky enough not to write Windows software these days.)

subprocess should always terminate processes robustly on program exit (unless asked not to). I always have a hard time figuring out how to get processes to terminate, and how to get them not to. I realize POSIX is black magic, to some degree.

I'm attaching a far from perfect front end that I currently use, for reference.
msg235726 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-02-11 04:10
Jeff: This makes it somewhat easier to handle input and output as strings instead of streams. Most of the functionality was already there, but this makes it more broadly useful. It doesn't especially address your other points, but I'm not aiming to completely overhaul subprocess.

> for string commands, shell should always be true. for list/tuple commands, shell should be false

I wondered why this is not the case before, but on Windows a subprocess is actually launched by a string, not a list. And on POSIX, a string without shell=True is interpreted like a one-element list, so you can do e.g. Popen('ls') instead of Popen(['ls']). Changing that would probably break backwards compatibility in unexpected ways.
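
The difference can be sketched as follows. The commands are illustrative, and the last part only runs on POSIX, where a string without shell=True is taken as the program name.

```python
import os
import subprocess
import sys

# A sequence of arguments is executed directly, without a shell.
listed = subprocess.run([sys.executable, "-c", "print('from list')"],
                        stdout=subprocess.PIPE)
print(listed.stdout)

# A single string combined with shell=True is handed to the shell for parsing.
shelled = subprocess.run("echo from shell", shell=True, stdout=subprocess.PIPE)
print(shelled.stdout)

# On POSIX, a string without shell=True names the program itself, so this
# looks for an executable literally called "echo from shell".
string_is_program_name = None
if os.name == "posix":
    try:
        subprocess.run("echo from shell")
        string_is_program_name = False
    except FileNotFoundError:
        string_is_program_name = True
print(string_is_program_name)
```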
msg236189 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-02-18 18:49
string vs list: see issue 6760 for some background.  Yes, I think it is an API bug, but there is no consensus for fixing it (it would require a deprecation period).

Jeff: in general your points do not seem to be apropos to this particular proposed enhancement, but are instead addressing other aspects of subprocess and should be dealt with in other targeted issues.
msg236319 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-02-20 18:57
Can I interest any of you in further review? I think I have responded to all comments so far. Thanks!
msg237079 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-03-02 20:07
Is there anything further I should be doing for this?
msg238589 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-03-20 00:11
One thing that just popped into my mind that I don’t think has been discussed: The patch adds the new run() function to subprocess.__all__, but the CompletedProcess class is still missing. Was that an oversight or a conscious decision?
msg238591 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-03-20 00:31
Thanks, that was an oversight. Patch 5 adds CompletedProcess to __all__.
msg240274 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-04-08 15:32
I am still keen for this to move forwards. I am at PyCon if anyone wants to discuss it in person.
msg240276 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2015-04-08 15:42
I'm at pycon as well, we can get this taken care of here. :)
msg240277 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-04-08 15:47
Great! I'm free after my IPython tutorial this afternoon, all of tomorrow, and I'm around for the sprints.
msg240961 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-04-14 17:22
6a following in-person review with Gregory:

- Reapplied to the updated codebase.
- Docs: mention the older functions near the top, because they'll still be important for some time.
- Docs: Be explicit that combined stdout/stderr goes in stdout attribute.
- Various improvements to code style
msg241053 - (view) Author: Roundup Robot (python-dev) Date: 2015-04-14 23:14
New changeset f0a00ee094ff by Gregory P. Smith in branch 'default':
Add a subprocess.run() function than returns a CalledProcess instance for a
https://hg.python.org/cpython/rev/f0a00ee094ff
msg241054 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2015-04-14 23:16
thanks!  i'll close this later after some buildbot runs and any post-commit reviews.
msg242032 - (view) Author: Thomas Kluyver (takluyver) * Date: 2015-04-25 23:50
I expect this can be closed now, unless there's some post-commit review somewhere that needs addressing?
msg265807 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-05-18 05:14
This change has made the subprocess docs intimidating and unapproachable again - this is a *LOWER* level swiss-army knife API than the 3 high level convenience functions.

I've filed http://bugs.python.org/issue27050 to suggest changing the way this is documented to position run() as a mid-tier API that's more flexible than the high level API, but still more convenient than accessing subprocess.Popen directly.
History
Date User Action Args
2016-05-18 05:14:24 ncoghlan set nosy: + ncoghlan; messages: + msg265807
2015-04-26 07:51:52 berker.peksag set stage: commit review -> resolved
2015-04-26 05:13:16 gregory.p.smith set status: open -> closed; resolution: fixed
2015-04-25 23:50:14 takluyver set messages: + msg242032
2015-04-14 23:16:08 gregory.p.smith set messages: + msg241054; stage: patch review -> commit review
2015-04-14 23:14:42 python-dev set nosy: + python-dev; messages: + msg241053
2015-04-14 17:22:46 takluyver set files: + subprocess_run6a.patch; messages: + msg240961
2015-04-08 15:47:40 takluyver set messages: + msg240277
2015-04-08 15:42:50 gregory.p.smith set messages: + msg240276
2015-04-08 15:32:13 takluyver set messages: + msg240274
2015-03-20 07:22:12 berker.peksag set nosy: + berker.peksag; stage: patch review
2015-03-20 00:31:58 takluyver set files: + subprocess_run5.patch; messages: + msg238591
2015-03-20 00:11:05 martin.panter set messages: + msg238589
2015-03-02 20:07:21 takluyver set messages: + msg237079
2015-02-20 18:57:07 takluyver set messages: + msg236319
2015-02-18 18:49:40 r.david.murray set messages: + msg236189
2015-02-11 04:10:29 takluyver set messages: + msg235726
2015-02-10 03:49:23 Jeff.Hammel set files: + process.py; nosy: + Jeff.Hammel; messages: + msg235663
2015-02-10 02:55:07 takluyver set files: + subprocess_run4.patch; messages: + msg235659
2015-02-10 01:56:09 takluyver set messages: + msg235656
2015-02-10 01:50:37 martin.panter set messages: + msg235654
2015-02-10 01:36:01 takluyver set messages: + msg235653
2015-02-06 17:33:51 cvrebert set nosy: + cvrebert
2015-02-02 23:19:13 takluyver set files: + subprocess_run3.patch; messages: + msg235300
2015-01-31 22:34:04 martin.panter set messages: + msg235134
2015-01-31 21:57:24 takluyver set messages: + msg235133
2015-01-31 09:46:32 martin.panter set messages: + msg235093
2015-01-29 00:04:45 martin.panter set nosy: + martin.panter
2015-01-28 23:39:47 gregory.p.smith set messages: + msg234927
2015-01-28 23:35:06 takluyver set files: + subprocess_run2.patch; messages: + msg234925
2015-01-28 23:30:59 ethan.furman set messages: + msg234924
2015-01-28 23:03:40 gregory.p.smith set messages: + msg234923
2015-01-28 22:53:50 ethan.furman set nosy: + ethan.furman
2015-01-28 22:51:49 gregory.p.smith set assignee: gregory.p.smith; nosy: + gregory.p.smith
2015-01-28 22:46:58 takluyver set messages: + msg234922
2015-01-28 22:41:10 r.david.murray set nosy: + r.david.murray
2015-01-28 22:16:04 barry set nosy: + barry
2015-01-28 22:13:38 takluyver create