Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subprocess.check_output() docs misrepresent what shell=True does #64543

Open
klausman mannequin opened this issue Jan 22, 2014 · 12 comments
Open

subprocess.check_output() docs misrepresent what shell=True does #64543

klausman mannequin opened this issue Jan 22, 2014 · 12 comments
Labels
docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@klausman
Copy link
Mannequin

klausman mannequin commented Jan 22, 2014

BPO 20344
Nosy @bitdancer, @vadmium
Files
  • issue20344.patch: Patch to issue 20344
  • documentation20344.patch
  • 20344_3.patch
  • 20344_4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2014-01-22.14:25:04.959>
    labels = ['type-bug', 'library', 'docs']
    title = 'subprocess.check_output() docs misrepresent what shell=True does'
    updated_at = <Date 2015-06-20.06:25:51.399>
    user = 'https://bugs.python.org/klausman'

    bugs.python.org fields:

    activity = <Date 2015-06-20.06:25:51.399>
    actor = 'martin.panter'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation', 'Library (Lib)']
    creation = <Date 2014-01-22.14:25:04.959>
    creator = 'klausman'
    dependencies = []
    files = ['34543', '34619', '34635', '39747']
    hgrepos = []
    issue_num = 20344
    keywords = ['patch']
    message_count = 12.0
    messages = ['208811', '208814', '214332', '214364', '214845', '214943', '214949', '214950', '243632', '243638', '243642', '245542']
    nosy_count = 5.0
    nosy_names = ['r.david.murray', 'docs@python', 'klausman', 'martin.panter', 'Tuomas.Savolainen']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue20344'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5', 'Python 3.6']

    @klausman
    Copy link
    Mannequin Author

    klausman mannequin commented Jan 22, 2014

    The subprocess docs state that the first argument can be either a string or an iterable that contains the program and arguments to run. It also points out that using shell=True allows for shell constructs. It does not mention a subtlety that is introduced by Python's use of sh -c (on Unix):

    Using a string that contains the full command line with args and shell=False breaks as expected:

    >>> subprocess.check_output("ls foo", shell=False)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python3.3/subprocess.py", line 576, in check_output
        with Popen(*popenargs, stdout=PIPE, **kwargs) as process:
      File "/usr/lib64/python3.3/subprocess.py", line 824, in __init__
        restore_signals, start_new_session)
      File "/usr/lib64/python3.3/subprocess.py", line 1448, in _execute_child
        raise child_exception_type(errno_num, err_msg)
    FileNotFoundError: [Errno 2] No such file or directory: 'ls foo'

    Using shell=True instead works as expected (since foo does not exist, ls exits with non-0 and a different Exception is raised. This, too is to spec and exactly what the docs describe.

    >>> subprocess.check_output("ls foo", shell=True)
    ls: cannot access foo: No such file or directory
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python3.3/subprocess.py", line 589, in check_output
        raise CalledProcessError(retcode, process.args, output=output)
    subprocess.CalledProcessError: Command 'ls foo' returned non-zero exit status 2

    Using shell=False and making the command a list of command and args does the same thing:

    >>> subprocess.check_output(["ls", "foo"], shell=False)
    ls: cannot access foo: No such file or directory
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python3.3/subprocess.py", line 589, in check_output
        raise CalledProcessError(retcode, process.args, output=output)
    subprocess.CalledProcessError: Command '['ls', 'foo']' returned non-zero exit status 2

    But using an iterable _and_ requesting a shell results in something very broken:

    >>> subprocess.check_output(["ls", "foo"], shell=True)
    [contents of my homedir are returned]
    >>>

    Note that the argument "foo" completely disappears, apparently. strace() reveals that "foo" is never added to the call to "ls". Instead, it becomes $0 in the shell that ls runs in. This is exactly to spec (i.e. bash is not broken). "bash -c foo bar baz" will start a shell that sets $0 to "bar", $1 to "baz" and then executes "foo". Whereas "bash -c 'foo bar baz'" will run 'foo bar baz'.

    I think there are three possible fixes here:
    1- Make check_output(list, shell=True) run something like "bash -c '%s'" % " ".join(list)
    2- Change the docs to warn of the unintended consequences of combining lists with shell=True
    3- Make check_output() throw an Exception if the first argument is a list and shell=True

    The first option would make it easiest for people to "just use it", but I fear the string manipulation may become the source of bugs with security implications in the future.

    The second option keeps the status quo, but people tend to not read docs, so it may not be as effective as one would like, especially since shell-True already has warnings and people tend to go "yeah I know about unverified input, but it's not a problem for me" and then ignore both warnings.

    Option 3 has the disadvantage that existing scripts that _expect_ the behavior may break.

    @klausman klausman mannequin assigned docspython Jan 22, 2014
    @klausman klausman mannequin added docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jan 22, 2014
    @bitdancer
    Copy link
    Member

    This was discussed in bpo-6760. There I advocated (and still advocate) option (3). Unfortunately it looks like our doc update in that issue actually *lost* the documentation for your option (2) that used to be there, though it wasn't exactly clear what it meant when it was there.

    So IMO we should do a doc update to add that info back in a clearer form, and also do a deprecation cycle to disallow shell=True with a sequence. I haven't ever gotten consensus from the other devs on the latter, though.

    @TuomasSavolainen
    Copy link
    Mannequin

    TuomasSavolainen mannequin commented Mar 21, 2014

    Made a patch that throws exception as suggested:
    "3- Make check_output() throw an Exception if the first argument is a list and shell=True"

    @bitdancer
    Copy link
    Member

    Thanks Tuomas, but we don't have any consensus that that kind of change will be accepted. It's just my opinion that it should be...and if it was, it would have to start with a deprecation, not raising an exception.

    What we need as a patch for this issue is a documentation patch explaining why using shell=True with a list is a bad idea in the 'frequently used arguments' section. The paragraph about 'args' in the frequently used arguments section should say something like 'using shell=True with a list containing more than one element does not currently raise an error, but it will not do what you expect; only the first element of the list will be executed on unix, and on windows the quoting is likely to be incorrect. See the Popen constructor documentation below for a more precise description.'

    (You can see why I think this usage should be deprecated, I hope ;)

    @TuomasSavolainen
    Copy link
    Mannequin

    TuomasSavolainen mannequin commented Mar 25, 2014

    Created a patch that adds notice of using shell=True and iterable to the documentation. Please do comment if the formatting is wrong (this my first documentation patch).

    @klausman
    Copy link
    Mannequin Author

    klausman mannequin commented Mar 27, 2014

    Hi!

    On Tue, 25 Mar 2014, Tuomas Savolainen wrote:

    Created a patch that adds notice of using shell=True and iterable to the documentation. Please do comment if the formatting is wrong (this my first documentation patch).

    I'd use articles, i.e. "and a list" and "does not raise an error"

    Also, s/except/expect/

    Regards,
    Tobias

    @bitdancer
    Copy link
    Member

    Also, the "see below" sentence is missing.

    @TuomasSavolainen
    Copy link
    Mannequin

    TuomasSavolainen mannequin commented Mar 27, 2014

    Corrected the spelling of the patch.

    @vadmium
    Copy link
    Member

    vadmium commented May 19, 2015

    “Only the first element . . . will be executed on Unix” is a bit simplistic. The behaviour is already described more fully in the “shell” documentation under <https://docs.python.org/dev/library/subprocess.html#subprocess.Popen\>.

    Also, “args” is allowed to be a _sequence_, but an _iterable_ is a more general concept.

    @bitdancer
    Copy link
    Member

    Can you suggest a better phrasing that is still succinct? (Maybe just 'appear to be executed' would make it accurate enough?) But, that's why there needs to be a reference of *some* sort to the full explanation.

    I'm not sure what the issue is with iterable/sequence. check_output/Popen accepts an iterable.

    @vadmium
    Copy link
    Member

    vadmium commented May 20, 2015

    Actually David I didn’t notice your suggested wording <https://bugs.python.org/issue20344#msg214364\> before. Adding that last sentence, pointing to the full description of the Popen constructor, would be fine.

    My complaint about mentioning “iterable” is that iterables are not mentioned anywhere else in the documentation. I would rather leave it out or change it to “sequence”, unless the rest of the documentation is made consistent, e.g. “args . . . should be a string, or a sequence of program arguments” would also need fixing.

    @vadmium
    Copy link
    Member

    vadmium commented Jun 20, 2015

    I think it might be better to leave the platform-specific details in the full Popen description, not under Frequently Used Arguments. I suggest to use 20344_4.patch:

    • Move existing pointer to Popen constructor details up to top of section
    • Explain the escaping and quoting of a sequence only applies to Windows
    • Remove claim that a string simply names a program without arguments; appears to be only half true under Windows
    • Clarify shell=True command is normally a string, not a sequence
    • Drop new description of Unix shell argument sequence in favour of existing description under Popen
    • Move warning about Windows shell argument sequence under Popen

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    docs Documentation in the Doc dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants