classification
Title: Popen should raise ValueError if pass a string when shell=False or a list when shell=True
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Alexander.Belopolsky, Arfrever, asvetlov, brian.curtin, csernazs, cvrebert, dstanek, eric.araujo, eric.smith, giampaolo.rodola, haypo, holdenweb, larry, ncoghlan, r.david.murray
Priority: normal Keywords:

Created on 2010-02-02 19:08 by r.david.murray, last changed 2013-11-06 19:37 by akuchling.

Files
File name Uploaded Description Edit
test_subprocess.py larry, 2010-06-08 10:49 Test case for this bug, works in Python 2 and Python 3.
Messages (31)
msg98754 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-02-02 19:08
Currently Popen accepts either a string or a list regardless of the value of 'shell'.  In the shell=False case, a string is interpreted as the command name, no matter what it actually is.  In the shell=True case, a list is interpreted as args[0] being the string to pass to sh -c, and the remaining elements are passed as additional arguments to sh.

Neither of these behaviors is particularly useful.  It is easy enough to put the command name in a single element list with shell=False, and as far as I've been able to tell there is nothing useful you can do with passing arguments to sh after the '-c [command'] argument.

Further, the current behavior leads to common mistakes and misunderstandings, especially the acceptance of a string when shell=False.  The inexperienced user will pass the complete command string, and get the confusing error message "No such file or directory".

The behavior when passing a list when shell=True is even more mysterious.  In this case, all elements of the list after args[0] appear to the programmer as if they are ignored.  This problem is made worse by the fact that the documentation for this case makes it sound as if multiple strings *can* be passed; this confusion at least will be cleared up by the patch from issue 6760.

I would like to see Popen changed so that when shell=False, passing a string for args results in a ValueError, and likewise when shell=True, passing a list results in a ValueError.

Since this is a backward incompatible change, we'd need to first deprecate the current behavior in 3.2, and then introduce the ValueError in 3.3.  While this would be annoying to those people who needed to change their programs, I think it would be worth it for the improved error feedback the revised API would provide.
msg100202 - (view) Author: Christophe Simonis (Christophe Simonis) Date: 2010-02-28 10:08
In this case, why not just depreciate the shell argument ?
msg100205 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-02-28 15:25
Because the shell argument provides important functionality.  Or are you suggesting that passing a list implies shell=False and passing a string implies shell=True?  That is a possibility, but I think it would not be a good idea, because people will 'accidentally' get shell=True (which, you will note is *not* the default), which is less secure.  It is much better, IMO, to make people ask explicitly for the less-safe behavior.
msg100206 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-02-28 16:12
I think the better design is to have 2 distinct APIs: Popen_shell and Popen_exec. I'm not wild about the name Popen_exec, suggestions welcome. Neither of these would accept a shell parameter.

For starters these could be convenience APIs that just call Popen with the correct shell= value, and check the type of args. We could ultimately deprecate Popen itself.

I don't see a use case where you programmatically compute the value of shell: it's always known as a function of how you're building up args. And even if such a rare case exists, you could select between the two APIs to call.

Whether at this point we can make this change is another issue, of course. I'm just trying to get at the best design. But if it's done over a number of releases I think it's doable.
msg100207 - (view) Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2010-02-28 16:18
That seems reasonable. We already have subprocess.call, the thin wrapper around Popen. Maybe add this as subprocess.call_shell and call_exec?
msg100209 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-02-28 16:30
Hmm.  I liked Eric's idea, and it would be easier to get in, but 'call' is actually an argument against it.  It would mean that in addition to PopenExec and PopenShell we'd need call_exec and call_shell, and check_call_exec and check_call_shell.  Proliferating interfaces to handle a boolean parameter doesn't seem minimalist.
msg107310 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2010-06-08 10:49
I noticed this a while ago.  And FWIW it's not just 3.x; I see this same behavior in 2.6.

I've whipped up a test case, attached, which runs in both Python 2 and Python 3.  The test runs "sys.interpreter -V" four times: it tries all the combinations of shell=True/False and the cmdline argument as either a list or a string.

When shell=True and cmdline is an array, Python runs as an interactive shell.  (You have to press Ctrl-D to close it; the test warns you this is happening.)
When shell=False and cmdline is a string, Python raises OSError, "[Errno 2] No such file or directory".
In the other two cases the code runs correctly, producing the version number of Python.

The test would probably be better if I used stdin=PIPE too, so you didn't have to press Ctrl-D.  But I couldn't figure out how to make it produce the interactive session stuff when I did that.  Anyway, it wouldn't be hard to make this into a proper regression test.

My two cents: I assume from the discussion that this is not fixable; that is, it's not possible to have a shell and a string, or no shell and a list.  Failing that, the right decision is what R. David discussed on 2010-02-28: ignore the shell argument passed in and infer it ourselves from whether cmdline is a list or a string.  I don't buy the security argument--since the invalid combinations of shell and cmdline mean the function doesn't work, existing callers have already made the decision whether they can live with shell=True.  We should deprecate the shell argument and update the documentation to make the behavior crystal clear.
msg107311 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2010-06-08 10:51
I realize we're down to the wire, but would it be too late to fix this in 2.7?  It is a genuine bug, and it won't break any correct code.
msg107313 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2010-06-08 11:11
Sorry for spamming updates, but here's two more things to consider.

1) What does it do on Windows?  For all I know all four combinations work fine there and we should preserve the existing functionality.

2) All four combinations work fine if you call a program without arguments.  I tried it with both "/bin/ls" and sys.executable and it worked like a champ.  The problem *only* happens if you specify command-line arguments.
msg107316 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-06-08 11:42
Unless we go the proliferating-interfaces route, it does represent a behavior change, and so if accepted would need to go through a deprecation cycle.  And if we did go that route, it would be a new feature.  So nothing can happen in 2.7, since it is already in the RC stage.

By the way, I did not suggest dropping the shell parameter, I argued against doing that.  The 'security' I referred to is that when you use a shell you are subject to shell-metacharacter-based attacks (and bugs!) if any elements of the command line come from user input and you don't sanitize them.  This problem doesn't exist with shell=False, which is why it is the default.
msg107322 - (view) Author: Zsolt Cserna (csernazs) Date: 2010-06-08 13:30
I would say that both string and list should be accepted in args, and depending on the shell parameter, the module should create a list or a string from the specified list/string.

We already have a list2cmdline function to convert a list to string, we would only need to create the inverse of this function to convert a string to a list.

Executing a program whose parameters are coming from external source (eg. user input) can result security problems if those are not specified as a list and in this case I would be really happy to specify my parameters as a list to Popen and it would do the appropriate conversions as above.
msg107353 - (view) Author: Larry Hastings (larry) * (Python committer) Date: 2010-06-08 21:45
Zsolt: an excellent idea!  That shouldn't change the semantics of the function, and therefore is strictly a bug fix.  Which means we could potentially backport it, yes?

Obviously we only need to change it between string / list if we're going to have this problem, and that only happens when there are parameters.  So the change could be very narrow in scope.  Lovely!

R. David: Why did you remove Python 2.7 from the list?  We see the behavior there too.
msg107365 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-06-08 23:57
Because we use the versions field to indicate which versions a patch will be applied to, if it is, and I created this bug as a feature request, and as such it is not a candidate for 2.7.

Changing list to string for shell=True is a behavior change (currently excess list elements are passed to the shell...and I don't *know* that that is useless, someone might be exploiting it) and hard to get right.  Changing string to list might be possible on unix using shlex, but is a distinctly non-trivial change, would need to be considered carefully with regard to its implications, and I have no idea what it would mean on Windows.

Since 2.7 is in RC status, there is no way such a fix will be accepted for 2.7.  You can try to make a case for this proposal as a bug fix for 2.7.1, but I'm not sanguine about its chances.  Personally I'm -0.5 on such a change.  I think it is better to leave the control of the command formatting in the hands of the programmer.  Doing such a conversion is too close to guessing for my taste.

If you want a wider selection of opinions you can post the issue to python-dev and request feedback.
msg107369 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-06-09 00:15
I agree with David. For the issue raised here, at most I would make (list, shell=True) and (str, shell=False) raise errors.

There's an issue (that I can't find right now) for creating functions that convert from str->list and list->str for cases such as this. shlex is sort of str->list, but it has issues on Windows. I've mentioned elsewhere that for Windows list->str is not always possible, because there is no standard for the corresponding str->list that each program is responsible for.

On Unix-like systems, when the called programs are run they get a pre-parsed list (argv). This list is created by the caller, either directly or through a shell. When a shell is doing this, at least the behavior is somewhat standard.

On Windows, the called programs get a string, and they're in charge of parsing it, if they want to. Many programs use the C library to create argv, but exactly how that parsing works changes between runtime vendors and over time within a vendor. There's no solution that will work in all cases.

Which is not to suggest that we shouldn't try, just that it will be hard.
msg107770 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-06-14 01:56
See issue 8972 for motivation for accepting this proposal.  Maybe I should start working on the patch :)
msg111337 - (view) Author: Steve Holden (holdenweb) * (Python committer) Date: 2010-07-23 14:38
The test program Larry provided does not appear to function as intended on Windows, and runs without either producing error messages or requiring interactive input. Here's a typical output, in this case from Python 3.1 on Vista:

C:\Users\sholden\Documents\issue7839>\python31\python test_subprocess.py
Testing with shell= True and array= True
  ** This is the one that runs an interactive shell.
  ** You should press Ctrl-D.
        Output:
        [] Python 3.1.1
Testing with shell= True and array= False
        Output:
        [] Python 3.1.1
Testing with shell= False and array= True
        Output:
        [] Python 3.1.1
Testing with shell= False and array= False
        Output:
        [] Python 3.1.1

The same behavior was observed on 2.6 and 2.5.2.

[I also removed "Christophe Simonis" from the nosy list as the tracker was complaining that there was no such user].
msg157217 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-03-31 18:23
I like Eric's proposition, e.g. raising error if  (list, shell=True) or (str, shell=False)

If nobody object I can try to make initial patch for that.
msg157227 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2012-03-31 20:11
While I still think raising those errors is a good thing, we should also look at Nick's shell-command:
http://shell-command.readthedocs.org/en/latest/index.html
msg157228 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-03-31 20:21
Nick's library is awesome and I +1 to include it into stdlib if Nick is ready to do.

But also I like to prevent obviously bad usage of popen.
We cannot and don't want to remove popen shell=True param, so let's add raising exception for useless parameters combination.
msg157230 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-31 21:40
Yes, Nick's library looks good, but that should be a separate issue, it isn't really relevant to this one.
msg157252 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-04-01 03:28
For the record, my shell access convenience API proposal is in issue 13238.

I'm not going to push that for 3.3 though - convenient shell access from Python is currently an area with a lot of active third party development, so I think it's worthwhile continuing to be conservative on this topic rather than rushing ahead with blessing any particular approach for stdlib inclusion.

To get back to RDM's specific proposal (that shell=True/False should constrain the acceptable types for the first positional argument to str/list respectively), I'm -1 on the idea.

The reason I'm not a fan is the fact that, with "shell=True", you can use the *executable* argument to Popen to select a non-default shell. At that point, passing a list can make sense, even if it isn't useful for the default shell. On Windows, the implicit invocation of list2cmdline can already make this a useful thing to do.

For the other way around, passing a string with "shell=False" can be a straightforward way to launch GUI applications from a script. Specifying executable directly can also have an effect on this case, too (although I forget the consequences)

Now, what may make sense is to provide separate Popen.exec and Popen.shell *class methods* that correspond to shell=False and shell=True with the stricter type checking on the first argument (i.e. Popen.exec only accepts a list, POpen.shell only accepts a string).

Another advantage of adding the two new class methods is that it would give us the opportunity to *change some of the defaults* (e.g making close_fds=True the default behaviour).
msg157259 - (view) Author: Chris Rebert (cvrebert) * Date: 2012-04-01 06:58
> The reason I'm not a fan is the fact that, with "shell=True", you can use the *executable* argument to Popen to select a non-default shell. At that point, passing a list can make sense, even if it isn't useful for the default shell.

Modulo Windows, at that point, why not just run the shell explicitly (i.e. shell=False, args=['my_sh', ...])? Also, a '-c' argument will be forcibly prepended to the passed-in `args` in your case, which may frustrate such use-cases.

> For the other way around, passing a string with "shell=False" can be a straightforward way to launch GUI applications from a script.

And they may need to eventually pass it (an) argument(s), and when that happens, cue confusion! There's a reason the `subprocess` docs feature the not-strictly-necessary clause "[a str `args` w/ shell=False] will only work if the program is being given no arguments". The distinction regularly trips/tripped users up.
msg157269 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-04-01 10:06
After trying to make a patch I found what current test suite itself has calls like (str, shell=False), (bytes, shell=True) and (['shell command'], shell=True).

We can:
1. Implement Eric's suggestion with fixing/removing broken tests.
2. Add Pope.shell and Popen.exec as Nick propose.

I don't like former because it changes already existing contracts fixed and published by explicit unittests without strong reason. Also Nick pointed reasons why user may need existing functionality. Adding new 'safe' shell and exec classmethods as prime entrypoints for Popen make sense.
msg157294 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-04-01 14:40
Which just goes to show that using Popen correctly is not obvious, I suppose.

Given that adding these errors *would* break backward compatibility, there would have to be a deprecation if it was done.

Personally I don't see the point in adding new class methods (complicating the interface) unless we are going to deprecate the current methods...in which case why not just deprecate the incorrect way of calling it?  (Note: I say incorrect because the 'call a different shell' variant *doesn't work* unless you write a custom shell that ignores '-c', and as Chris points out there's no functionality *gained* by the windows variation, and there is confusion introduced as to exactly what is going on).

So, if we aren't going to deprecate the "incorrect" calls, I vote for just closing this issue as "won't fix".

(I suppose that a third alternative exists: make the "incorrect" calls actually do something useful while preserving backward compatibility with the existing behavior.  That may be difficult, especially doing it in a way that logically consistent cross-platform.)
msg157322 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-04-01 18:10
BTW we need to drop win9x and win2000 support, see #14470
msg157323 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-04-01 18:12
I'm +1 for going though deprecation process for Popen args to make parameters combination clean and obvious.
msg157332 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-04-01 22:48
Everyone missed my other argument in favour of alternate constructor
methods: fixing the currently wrong default arguments.

There is no good reason to break working code when beginner confusion can
be better addressed by telling them to avoid calling the Popen constructor
directly in favour of newer APIs that have benefitted from years of
experience with the current API.
msg157608 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-04-05 18:23
What's about test which pass bytes to Popen?
Should it be deprecated and should Popen accept only unicode strings only — I mean `str` type?
As I know the trend of py3k to get rid of bytes in filesystem names.
msg157613 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-04-05 20:02
No, you need to be able to pass bytes to Popen, just like you do to the os.exec[xx] functions.  When the OS doesn't fully support unicode, that is sometimes the only option.  As for filenames; again, as long as the underlying systems use bytes filenames we need to support it.  Currently we encode them when received using surrogateescape and decode them back to bytes when used. 

I am not sure what os.exec[xx] does with strings containing non-ascii.  Presumably it uses some default encoding or other, which seems to be utf-8 on my system.  (It doesn't seem to be explicitly documented where those functions are discussed in the os module.)
msg157800 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2012-04-08 17:50
If fylesystem doesn't support unicode (only Windows directly does as I know) str filename should to be converted by `sys.getfilesystemencoding()`. `os.exec` family already does it as well as other fs functions — but they are supports bytes also.
Mayby deprecation bytes for high-level Popen is good idea.
msg160587 - (view) Author: Chris Rebert (cvrebert) * Date: 2012-05-14 00:22
Re: Nick's comments:

Besides `close_fds`, what other Popen parameters currently have suboptimal defaults?
History
Date User Action Args
2013-11-06 19:37:27akuchlingsetkeywords: - easy
2012-05-14 00:22:33cvrebertsetmessages: + msg160587
2012-04-08 17:50:20asvetlovsetmessages: + msg157800
2012-04-05 20:02:39r.david.murraysetmessages: + msg157613
2012-04-05 18:23:37asvetlovsetmessages: + msg157608
2012-04-01 22:48:22ncoghlansetmessages: + msg157332
2012-04-01 18:12:33asvetlovsetmessages: + msg157323
2012-04-01 18:10:06asvetlovsetmessages: + msg157322
2012-04-01 14:40:49r.david.murraysetmessages: + msg157294
2012-04-01 10:06:49asvetlovsetmessages: + msg157269
2012-04-01 06:58:08cvrebertsetmessages: + msg157259
2012-04-01 03:28:05ncoghlansetmessages: + msg157252
2012-03-31 21:40:33r.david.murraysetmessages: + msg157230
2012-03-31 20:21:17asvetlovsetnosy: + ncoghlan
messages: + msg157228
2012-03-31 20:11:12eric.smithsetmessages: + msg157227
2012-03-31 18:45:36cvrebertsetnosy: + cvrebert
2012-03-31 18:23:35asvetlovsetmessages: + msg157217
2011-10-17 15:50:24eric.araujosetnosy: + eric.araujo

versions: + Python 3.3, - Python 3.2
2011-10-17 13:58:17Arfreversetnosy: + Arfrever
2011-03-24 05:00:15asvetlovsetnosy: + asvetlov
2011-03-03 14:42:48hayposetnosy: + haypo
2010-07-31 18:23:59dstaneksetnosy: + dstanek
2010-07-23 14:38:20holdenwebsetnosy: + holdenweb, - Christophe Simonis
messages: + msg111337
2010-06-14 01:56:49r.david.murraysetmessages: + msg107770
2010-06-14 01:55:21r.david.murraylinkissue8972 superseder
2010-06-09 18:49:30giampaolo.rodolasetnosy: + giampaolo.rodola
2010-06-09 00:15:36eric.smithsetmessages: + msg107369
2010-06-08 23:57:06r.david.murraysetmessages: + msg107365
2010-06-08 21:45:37larrysetmessages: + msg107353
2010-06-08 13:30:26csernazssetnosy: + csernazs
messages: + msg107322
2010-06-08 11:42:35r.david.murraysetmessages: + msg107316
versions: - Python 2.7
2010-06-08 11:11:45larrysetmessages: + msg107313
2010-06-08 10:51:25larrysetversions: + Python 2.7
2010-06-08 10:51:03larrysetmessages: + msg107311
2010-06-08 10:49:24larrysetfiles: + test_subprocess.py
nosy: + larry
messages: + msg107310

2010-02-28 16:30:18r.david.murraysetmessages: + msg100209
2010-02-28 16:18:21brian.curtinsetnosy: + brian.curtin
messages: + msg100207
2010-02-28 16:12:48eric.smithsetmessages: + msg100206
2010-02-28 15:25:55r.david.murraysetmessages: + msg100205
2010-02-28 10:08:50Christophe Simonissetnosy: + Christophe Simonis
messages: + msg100202
2010-02-25 00:49:14Alexander.Belopolskysetnosy: + Alexander.Belopolsky
2010-02-02 20:49:04eric.smithsetnosy: + eric.smith
2010-02-02 19:08:03r.david.murraycreate