run() - unified high-level interface for subprocess #67531

takluyver · 2015-01-28T22:13:39Z

BPO	23342
Nosy	@warsaw, @gpshead, @ncoghlan, @bitdancer, @ethanfurman, @takluyver, @berkerpeksag, @vadmium
Files	subprocess_run.patch subprocess_run2.patch subprocess_run3.patch subprocess_run4.patch process.py subprocess_run5.patch subprocess_run6a.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = 'https://github.com/gpshead'
closed_at = <Date 2015-04-26.05:13:16.179>
created_at = <Date 2015-01-28.22:13:38.931>
labels = ['type-feature', 'library']
title = 'run() - unified high-level interface for subprocess'
updated_at = <Date 2016-05-18.05:14:24.030>
user = 'https://github.com/takluyver'

bugs.python.org fields:

activity = <Date 2016-05-18.05:14:24.030>
actor = 'ncoghlan'
assignee = 'gregory.p.smith'
closed = True
closed_date = <Date 2015-04-26.05:13:16.179>
closer = 'gregory.p.smith'
components = ['Library (Lib)']
creation = <Date 2015-01-28.22:13:38.931>
creator = 'takluyver'
dependencies = []
files = ['37897', '37899', '37991', '38072', '38075', '38574', '38997']
hgrepos = []
issue_num = 23342
keywords = ['patch']
message_count = 29.0
messages = ['234918', '234922', '234923', '234924', '234925', '234927', '235093', '235133', '235134', '235300', '235653', '235654', '235656', '235659', '235663', '235726', '236189', '236319', '237079', '238589', '238591', '240274', '240276', '240277', '240961', '241053', '241054', '242032', '265807']
nosy_count = 11.0
nosy_names = ['barry', 'gregory.p.smith', 'ncoghlan', 'r.david.murray', 'cvrebert', 'ethan.furman', 'python-dev', 'takluyver', 'berker.peksag', 'martin.panter', 'Jeff.Hammel']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue23342'
versions = ['Python 3.5']

takluyver · 2015-01-28T22:13:39Z

This follows on from the python-ideas thread starting here: https://mail.python.org/pipermail/python-ideas/2015-January/031479.html

subprocess gains:

A CompletedProcess class representing a process that has finished, with attributes args, returncode, stdout and stderr
A run() function which runs a process to completion and returns a CompletedProcess instance, aiming to unify the functionality of call, check_call and check_output
CalledProcessError and TimeoutExpired now have a stderr attribute, to avoid throwing away potentially relevant information.
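
For readers skimming the thread, here is a minimal sketch of how the proposed interface reads in use, assuming the API as it eventually shipped in Python 3.5. The child command (a Python one-liner launched through sys.executable) is only an illustrative choice, not something taken from the patch.

```python
import subprocess
import sys

# Illustrative child process: a Python one-liner, chosen only for portability.
cmd = [sys.executable, "-c", "print('hello')"]

# Output is not captured by default, so request it explicitly with PIPE.
result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print(result.args)        # the command as it was passed in
print(result.returncode)  # exit status of the child
print(result.stdout)      # bytes captured from the child's stdout
print(result.stderr)      # bytes captured from the child's stderr
```

Leaving out the two PIPE arguments lets the child write straight to the parent's streams, in which case stdout and stderr on the result are None.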

Things I'm not sure about:

Should run() capture stdout/stderr by default? I opted not to, for consistency with Popen and with shells.
I gave run() a check_returncode parameter, but it feels quite a long name for a parameter. Is 'check' clear enough to use as the parameter name?
Popen has an 'args' attribute, while CalledProcessError and TimeoutExpired have 'cmd'. CompletedProcess sits between those cases, so which name should it use? For now, it's args.

takluyver · 2015-01-28T22:46:59Z

Another question: With this patch, CalledProcessError and TimeoutExpired exceptions now have attributes called output and stderr. It would seem less surprising for output to be called stdout, but we can't break existing code that relies on the output attribute.

Using properties, either stdout or output could be made an alias for the other, so both names work. Is this desirable?
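
The aliasing idea can be sketched roughly as follows. ProcessErrorSketch is a hypothetical stand-in written for this illustration, not the stdlib class and not the code from the patch; it only shows one way a property could make both names refer to the same data.

```python
class ProcessErrorSketch(Exception):
    """Hypothetical stand-in for CalledProcessError; not the stdlib class."""

    def __init__(self, returncode, cmd, output=None, stderr=None):
        super().__init__(returncode, cmd)
        self.returncode = returncode
        self.cmd = cmd
        self.output = output  # the existing name that old code relies on
        self.stderr = stderr

    @property
    def stdout(self):
        """Alias for output, so both names refer to the same data."""
        return self.output

    @stdout.setter
    def stdout(self, value):
        # Writing through the alias updates the original attribute.
        self.output = value


err = ProcessErrorSketch(1, ["false"], output=b"partial output")
print(err.stdout == err.output)
```

Storing the data under the old name and exposing the new one as a property keeps code that reads or assigns output working unchanged.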

gpshead · 2015-01-28T23:03:41Z

A 1) Opting not to capture by default is good. Let people explicitly request that.

A 2) "check" seems like a reasonable parameter name for the "should i raise if rc != 0" bool. I don't have any other good bikeshed name suggestions.

A 3) Calling it args the same way Popen does is consistent. That the attribute on the exceptions is 'cmd' is a bit of an old wart but seems reasonable. Neither the name 'args' nor 'cmd' is actually good for any use in subprocess as it is already an unfortunately multi-typed parameter. It can either be a string or it can be a sequence of strings. The documentation is not clear about what type(s) 'cmd' may be.

A Another) Now that they gain a stderr attribute, having a corresponding stdout one would make sense. Implement it as a property and document it with a versionadded 3.5 as usual.

ethanfurman · 2015-01-28T23:30:59Z

I haven't checked the code, but do check_output and friends combine stdout and stderr when output=PIPE?

takluyver · 2015-01-28T23:35:06Z

Updated patch following Gregory's suggestions:

The check_returncode parameter is now called check. The method on CompletedProcess is still check_returncode, though.
Clarified the docs about args
CalledProcessError and TimeoutExpired gain a stdout property as an alias of output
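
A short sketch of the renamed parameter, the method that keeps the longer name, and the stdout alias, assuming the behaviour as released in Python 3.5. The failing child command is made up for the example.

```python
import subprocess
import sys

# Illustrative child that prints something and exits with status 3.
failing = [sys.executable, "-c", "import sys; print('oops'); sys.exit(3)"]

# Without check, a non-zero exit status is simply recorded on the result.
result = subprocess.run(failing, stdout=subprocess.PIPE)
print(result.returncode)

# The method keeps the longer name and raises on a non-zero status.
try:
    result.check_returncode()
except subprocess.CalledProcessError as exc:
    method_error = exc

# check=True raises straight from run(); stdout aliases the older output.
try:
    subprocess.run(failing, stdout=subprocess.PIPE, check=True)
except subprocess.CalledProcessError as exc:
    run_error = exc
    print(exc.returncode, exc.stdout == exc.output)
```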

Ethan: to combine stdout and stderr in check_output, you need to pass stderr=subprocess.STDOUT - it doesn't assume you want that.

I did consider having a simplified interface so you could pass e.g. capture='combine', or capture='stdout', but I don't think the brevity is worth the loss of flexibility.
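
As a concrete illustration of the explicit form, the sketch below merges the two streams with stderr=subprocess.STDOUT. It uses run() as released in Python 3.5, and the child command is invented for the example.

```python
import subprocess
import sys

# Illustrative child that writes one line to each stream.
child = (
    "import sys; print('to stdout'); sys.stdout.flush(); "
    "print('to stderr', file=sys.stderr)"
)

combined = subprocess.run(
    [sys.executable, "-c", child],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # redirect stderr into the stdout pipe
)

print(combined.stdout)  # contains both lines
print(combined.stderr)  # None: nothing was captured separately
```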

gpshead · 2015-01-28T23:39:48Z

Ethan: check_output combines them when stderr=subprocess.STDOUT is passed (
https://docs.python.org/3.5/library/subprocess.html#subprocess.STDOUT).
Never pass stdout=PIPE or stderr=PIPE to call() or check*() methods as
that will lead to a deadlock when a pipe buffer fills up. check_output()
won't even allow you to pass in stdout as it needs to set that to PIPE
internally, but you could still do the wrong thing and pass stderr=PIPE
without it warning you.

the documentation tells people not to do this. i don't recall why we
haven't made it warn or raise when someone tries. (but that should be a
separate issue/change)

vadmium · 2015-01-31T09:46:33Z

Maybe you don’t want to touch the implementation of the “older high-level API” for fear of subtly breaking something, but for clarification, and perhaps documentation, would the old functions now be equivalent to this?

from subprocess import PIPE, run

def call(*args, **kwargs):
    # Verify PIPE not in (stdout, stderr) if needed
    return run(*args, **kwargs).returncode

def check_call(*args, **kwargs):
    # Verify PIPE not in (stdout, stderr) if needed
    run(*args, check=True, **kwargs)

def check_output(*args, **kwargs):
    # Verify stderr != PIPE if needed
    return run(*args, check=True, stdout=PIPE, **kwargs)

If they are largely equivalent, perhaps simplify the documentation of them in terms of run(), and move them closer to the run() documentation.

Is it worth making the CalledProcessError exception a subclass of CompletedProcess? They seem to be basically storing the same information.

takluyver · 2015-01-31T21:57:24Z

Yep, they are pretty much equivalent to those, except:

check_call has a 'return 0' if it succeeds
add '.stdout' to the end of the expression for check_output

I'll work on documenting the trio in those terms.

If people want, some/all of the trio could also be implemented on top of run(). check_output() would be the most likely candidate for this, since I copied that code to create run(). I'd probably leave call and check_call as separate implementations to avoid subtle bugs, though.
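
With those two adjustments applied, the equivalence could be sketched like this. The *_sketch names are hypothetical and exist only for this illustration; they are not the stdlib implementations, and the optional PIPE checks mentioned earlier are left out.

```python
import sys
from subprocess import PIPE, run


def call_sketch(*popenargs, **kwargs):
    """Roughly call(): run the command and return its exit status."""
    return run(*popenargs, **kwargs).returncode


def check_call_sketch(*popenargs, **kwargs):
    """Roughly check_call(): raise on failure, otherwise return 0."""
    run(*popenargs, check=True, **kwargs)
    return 0


def check_output_sketch(*popenargs, **kwargs):
    """Roughly check_output(): raise on failure, otherwise return stdout."""
    return run(*popenargs, check=True, stdout=PIPE, **kwargs).stdout


# Illustrative command, chosen only so the example is portable.
ok = [sys.executable, "-c", "print('hi')"]
print(check_output_sketch(ok))
```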

Sharing inheritance between CalledProcessError and CompletedProcess: That would mean that either CompletedProcess is an exception class, even though it's not used as such, or CalledProcessError uses multiple inheritance. I think duplicating a few attributes is preferable to having to think about multiple inheritance, especially since the names aren't all the same (cmd vs args, output vs stdout).

vadmium · 2015-01-31T22:34:05Z

It’s okay to leave them as independent classes, if you don’t want multiple inheritance. I was just putting the idea out there. It is a similar pattern to the HTTPError exception and HTTPResponse return value for urlopen().

takluyver · 2015-02-02T23:19:13Z

Third version of the patch (subprocess_run3):

Simplifies the documentation of the trio (call, check_call, check_output) to describe them in terms of the equivalent run() call.
Removes a warning about using PIPE with check_output. I believe this was already incorrect: check_output uses .communicate() internally, so it shouldn't have deadlock issues.
Replaces the implementation of check_output() with a call to run().

I didn't reimplement call or check_call - as previously discussed, they are more different from the code in run(), so subtly breaking things is more possible. They are also simpler.

takluyver · 2015-02-10T01:36:01Z

Would anyone like to do further review of this - or commit it ;-) ?

I don't think anyone has objected to the concept since I brought it up on python-ideas, but if anyone is -1, please say so.

vadmium · 2015-02-10T01:50:37Z

Have you seen the code review comments on the Rietveld, <https://bugs.python.org/review/23342>? (Maybe check spam emails.) Many of the comments from the earlier patches still stand. In particular, I would like to see the “input” default value addressed, at least for the new run() function, if not the old check_output() function.

takluyver · 2015-02-10T01:56:10Z

Aha, I hadn't seen any of those. They had indeed been caught by the spam filter. I'll look over them now.

takluyver · 2015-02-10T02:55:03Z

Fourth version of patch, responding to review comments on Rietveld. The major changes are:

Eliminated the corner case when passing input=None to run() - now it's a real default parameter. Added a shim in check_output to keep it behaving the old way in case anything is relying on it, but I didn't document it.
The docstring of run() was shortened quite a bit by removing the examples.
Added a whatsnew entry

I also made various minor fixes - thanks to everyone who found them.
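
To illustrate the input parameter, here is a small sketch assuming run() as released in Python 3.5. The upper-casing child is invented for the example.

```python
import subprocess
import sys

# Illustrative child that upper-cases whatever arrives on stdin.
upper = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

# input is sent to the child's stdin; run() sets up the stdin pipe itself.
result = subprocess.run(upper, input=b"hello", stdout=subprocess.PIPE)
print(result.stdout)

# With the default input=None nothing is sent, and stdin is whatever the
# caller chose (here redirected from DEVNULL).
quiet = subprocess.run([sys.executable, "-c", "pass"], stdin=subprocess.DEVNULL)
print(quiet.returncode)
```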

JeffHammel · 2015-02-10T03:49:22Z

A few observations in passing. I beg your pardon for not commenting after a more in depth study of the issue, but as someone that's written and managed several subprocess module front-ends, my general observations seem applicable.

subprocess needs easier and more robust ways of managing input and output streams

subprocess should have easier ways of managing input: file streams are fine, but plain strings would also be nice

for string commands, shell should always be true. for list/tuple commands, shell should be false. in fact you'll get an error if you don't ensure this. instead, just have what is passed key execution (for windows, I have no idea. I'm lucky enough not to write windows software these days)

subprocess should always terminate processes on program exit robustly (unless asked not to). I always have a hard time figuring out how to get processes to terminate, and how to have them not to. I realize POSIX is black magic, to some degree.

I'm attaching a far from perfect front end that I currently use for reference

takluyver · 2015-02-11T04:10:29Z

Jeff: This makes it somewhat easier to handle input and output as strings instead of streams. Most of the functionality was already there, but this makes it more broadly useful. It doesn't especially address your other points, but I'm not aiming to completely overhaul subprocess.

for string commands, shell should always be true. for list/tuple commands, shell should be false

I wondered why this is not the case before, but on Windows a subprocess is actually launched by a string, not a list. And on POSIX, a string without shell=True is interpreted like a one-element list, so you can do e.g. Popen('ls') instead of Popen(['ls']). Changing that would probably break backwards compatibility in unexpected ways.
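
For reference, a sketch of the two calling conventions as they work today on POSIX. It assumes a POSIX shell and uses shlex.quote, which targets POSIX quoting rules, so the second call is not meant as a Windows example. The command itself is illustrative.

```python
import shlex
import subprocess
import sys

code = "print('hello')"

# List form: each element is passed to the child as one argument, no shell.
list_result = subprocess.run(
    [sys.executable, "-c", code], stdout=subprocess.PIPE
)

# String form with shell=True: the shell parses the command line, so each
# part is quoted first (POSIX quoting assumed).
command = " ".join(shlex.quote(part) for part in [sys.executable, "-c", code])
shell_result = subprocess.run(command, shell=True, stdout=subprocess.PIPE)

print(list_result.stdout, shell_result.stdout)
```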

bitdancer · 2015-02-18T18:49:41Z

string vs list: see bpo-6760 for some background. Yes, I think it is an API bug, but there is no consensus for fixing it (it would require a deprecation period).

Jeff: in general your points do not seem to be apropos to this particular proposed enhancement, but are instead addressing other aspects of subprocess and should be dealt with in other targeted issues.

takluyver · 2015-02-20T18:57:07Z

Can I interest any of you in further review? I think I have responded to all comments so far. Thanks!

takluyver · 2015-03-02T20:07:22Z

Is there anything further I should be doing for this?

vadmium · 2015-03-20T00:11:05Z

One thing that just popped into my mind that I don’t think has been discussed: The patch adds the new run() function to subprocess.__all__, but the CompletedProcess class is still missing. Was that an oversight or a conscious decision?

takluyver · 2015-03-20T00:31:58Z

Thanks, that was an oversight. Patch 5 adds CompletedProcess to __all__.

takluyver · 2015-04-08T15:32:13Z

I am still keen for this to move forwards. I am at PyCon if anyone wants to discuss it in person.

gpshead · 2015-04-08T15:42:50Z

I'm at pycon as well, we can get this taken care of here. :)

takluyver · 2015-04-08T15:47:41Z

Great! I'm free after my IPython tutorial this afternoon, all of tomorrow, and I'm around for the sprints.

takluyver · 2015-04-14T17:22:46Z

Patch 6a, following in-person review with Gregory:

Reapplied to the updated codebase.
Docs: mention the older functions near the top, because they'll still be important for some time.
Docs: Be explicit that combined stdout/stderr goes in stdout attribute.
Various improvements to code style

python-dev · 2015-04-14T23:14:43Z

New changeset f0a00ee094ff by Gregory P. Smith in branch 'default':
Add a subprocess.run() function than returns a CalledProcess instance for a
https://hg.python.org/cpython/rev/f0a00ee094ff

gpshead · 2015-04-14T23:16:08Z

thanks! i'll close this later after some buildbot runs and any post-commit reviews.

takluyver · 2015-04-25T23:50:15Z

I expect this can be closed now, unless there's some post-commit review somewhere that needs addressing?

ncoghlan · 2016-05-18T05:14:24Z

This change has made the subprocess docs intimidating and unapproachable again - this is a *LOWER* level swiss-army knife API than the 3 high level convenience functions.

I've filed http://bugs.python.org/issue27050 to suggest changing the way this is documented to position run() as a mid-tier API that's more flexible than the high level API, but still more convenient than accessing subprocess.Popen directly.

takluyver mannequin added the stdlib (Python modules in the Lib dir) and type-feature (A feature request or enhancement) labels Jan 28, 2015

gpshead self-assigned this Jan 28, 2015

gpshead closed this as completed Apr 26, 2015

ezio-melotti transferred this issue from another repository Apr 10, 2022
