classification
Title: list2cmdline function in subprocess module handles \" sequence wrong
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.3, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Arve.Knudsen, exarkun, piotr.dobrogost, r.david.murray, sbt
Priority: normal Keywords:

Created on 2013-08-03 22:00 by piotr.dobrogost, last changed 2013-08-04 18:51 by piotr.dobrogost. This issue is now closed.

Messages (10)
msg194307 - Author: Piotr Dobrogost (piotr.dobrogost) Date: 2013-08-03 22:00
According to the docstring of the list2cmdline function in the subprocess module, a backslash followed by a double quote mark should denote a double quote mark in the output string. However, that is not the case:

    Python 2.7.4 (default, Apr  6 2013, 19:55:15) [MSC v.1500 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import subprocess
    >>> print subprocess.list2cmdline(r'\"1|2\"')
    \ \" 1 | 2 \ \"

Python 3.3.1 behaves the same way.

See the Stack Overflow question "On Windows, how can I protect arguments to shell scripts using Python 2.7 subprocess?" (http://stackoverflow.com/q/4970194/95735).
msg194376 - Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-08-04 13:55
Firstly, list2cmdline() takes a list as its argument, not a string:

  >>> import subprocess
  >>> print subprocess.list2cmdline([r'\"1|2\"'])
  \\\"1|2\\\"
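A minimal sketch of why the string form produces the spaced-out output, written in Python 3 syntax: list2cmdline() iterates over its argument, so a bare string is treated as a sequence of one-character arguments.

```python
import subprocess

arg = r'\"1|2\"'

# A string is iterated character by character, so each character
# becomes its own argument and the results are joined with spaces.
as_string = subprocess.list2cmdline(arg)
as_chars = subprocess.list2cmdline(list(arg))

# Wrapping the string in a list passes it as a single argument.
as_list = subprocess.list2cmdline([arg])

print(as_string == as_chars)
print(as_string)
print(as_list)
```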

But the problem with passing arguments to a batch file is that cmd.exe parses arguments differently from how normal executables do.  In particular, "|" is treated specially and "^" is used as an escape character.

If you define test.bat as

  @echo off
  echo "%1"

then

  subprocess.call(['test.bat', '1^|2'])

prints

  "1|2"

as expected.

This is a duplicate of http://bugs.python.org/issue1300.
msg194381 - Author: Piotr Dobrogost (piotr.dobrogost) Date: 2013-08-04 14:39
I think you're missing the point. The implementation is wrong as it does not do what documentation says which is "A double quotation mark preceded by a backslash is interpreted as a literal double quotation mark." How the output of list2cmdline interacts with the cmd.exe is another issue (It just happens here that if implementation of list2cmdline were in line with its documentation then there wouldn't be any subsequent problem with cmd.exe). Also issue 1300 is about escaping a pipe character (|) on the basis of how it's treated by cmd.exe and does not even refer to the docstring of list2cmdline function.
msg194389 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-08-04 16:12
This is only a duplicate of issue 1300 in the sense that that issue points out that list2cmdline has nothing to do with passing/quoting strings for cmd.exe.

list2cmdline is an internal function of the subprocess module.  Its docstring documents the MS C quoting rules, *not* the input quoting rules.  So its output is correct according to its doc string.  If you pass ["test.bat", r'\"1|2\"'] to Popen using Richard's version of test.bat, you should get

  \"1|2\"

as the output, which would be correct, since that is what you passed in as the argument to test.bat in the Popen call.  The point is that the arguments specified in the list (shell=False) Popen call are supposed to be exactly the arguments that get passed to the called program, and list2cmdline takes care of the MS C quoting to make that happen. (I don't use Windows much, so it is a bit of a pain for me to confirm the above example, but I'm nearly certain it will work as I say, modulo whatever quoting rule 'echo' uses for output.)
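The round trip described above can be sketched without a Windows machine. Note that parse_msc_cmdline below is a hypothetical helper written only for this illustration (it is not part of subprocess); it implements just the five rules listed in the list2cmdline docstring and is not the real MS C runtime parser.

```python
import subprocess


def parse_msc_cmdline(cmdline):
    """Hypothetical parser: split a command line using only the
    rules quoted in the list2cmdline docstring."""
    args, current = [], []
    in_quotes = False
    have_arg = False
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == '\\':
            # Count the run of backslashes starting here.
            j = i
            while j < n and cmdline[j] == '\\':
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                # Each pair of backslashes yields one literal backslash.
                current.append('\\' * (count // 2))
                if count % 2:
                    # An odd backslash escapes the quote.
                    current.append('"')
                    j += 1
            else:
                # Not followed by a quote: backslashes are literal.
                current.append('\\' * count)
            have_arg = True
            i = j
        elif ch == '"':
            # An unescaped quote toggles quoted mode.
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in ' \t' and not in_quotes:
            # Unquoted whitespace ends the current argument.
            if have_arg:
                args.append(''.join(current))
                current, have_arg = [], False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append(''.join(current))
    return args


original = ['test.bat', r'\"1|2\"']
cmdline = subprocess.list2cmdline(original)
print(cmdline)
print(parse_msc_cmdline(cmdline) == original)
```

Under these assumptions the parsed arguments match the list that was passed in, which is the property the quoting is meant to guarantee.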
msg194390 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-08-04 16:18
The first line above is incomplete.  I meant that issue 1300 is only a duplicate in the sense that it points out that list2cmdline implements the MS C quoting rules, *not* the cmd.exe quoting rules.
msg194391 - Author: Richard Oudkerk (sbt) * (Python committer) Date: 2013-08-04 16:21
> I think you're missing the point. The implementation is wrong as it 
> does not do what documentation says which is "A double quotation mark 
> preceded by a backslash is interpreted as a literal double quotation 
> mark."

That docstring describes how the string returned by list2cmdline() is interpreted by the MS C runtime.  I assume you mean this bit:

    3) A double quotation mark preceded by a backslash is
       interpreted as a literal double quotation mark.

This looks correct to me: it implies that list2cmdline() must convert a double quotation mark to a double quotation mark preceded by a backslash.  e.g.

  >>> print(subprocess.list2cmdline(['"']))
  \"
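A few more inputs may make the direction of the rules clearer. This is only an illustrative sketch in Python 3 syntax; the sample arguments are chosen for this example and do not come from the docstring.

```python
import subprocess

# Sample arguments: a lone quote, a plain backslash, a backslash
# before a quote, embedded whitespace, a trailing backslash with
# whitespace, and the empty string.
examples = ['"', r'a\b', r'a\"b', 'a b', 'a b\\', '']

quoted = {arg: subprocess.list2cmdline([arg]) for arg in examples}

for arg, result in quoted.items():
    print(repr(arg), '->', result)
```

Backslashes are doubled only where they precede a double quote (including the closing quote added around an argument containing whitespace); elsewhere they pass through unchanged.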

> How the output of list2cmdline interacts with the cmd.exe is another 
> issue (It just happens here that if implementation of list2cmdline were 
> in line with its documentation then there wouldn't be any subsequent 
> problem with cmd.exe).

As I said, list2cmdline() behaves as expected.  Whatever else happens, "|" must be escaped with "^" or else cmd will interpret it specially.
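As a rough sketch of the "^" escaping mentioned above: caret_escape is a hypothetical helper invented for this illustration, not a stdlib function. It assumes the argument contains no whitespace or double quotes, and it deliberately ignores "%" and "!" expansion, so it is not a complete cmd.exe quoting function.

```python
# Characters that cmd.exe treats specially outside of double quotes.
# This set is an assumption for the sketch, not an exhaustive list.
CMD_META = '^&|<>()'


def caret_escape(arg):
    """Hypothetical helper: prefix each cmd.exe metacharacter with '^'."""
    return ''.join('^' + ch if ch in CMD_META else ch for ch in arg)


print(caret_escape('1|2'))

# On Windows, the result could then be passed to the batch file, e.g.:
#   subprocess.call(['test.bat', caret_escape('1|2')])
```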
msg194393 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-08-04 16:39
The list form of Popen should never be used with shell=True.

It would be very good if someone would propose a 'cmd.exe quote' function for the stdlib.

But neither of these points has anything to do with this issue, as far as I can see :)
msg194405 - Author: Piotr Dobrogost (piotr.dobrogost) Date: 2013-08-04 18:25
The docstring starts with this statement
"Translate a sequence of arguments into a command line string, using the same rules as the MS C runtime:"
which clearly gives the impression that the list2cmdline function itself uses the same rules as the MS C runtime. However, after reading the comments in this issue I believe I misunderstood the true meaning, as the docstring is highly misleading. According to your comments, the word "using" was meant to pertain to the "command line string" (which is the output of the list2cmdline function) and _not_ to the translation phase itself. This makes sense taking into account the flow of events, which is: a list of arguments -> list2cmdline -> CreateProcess.
msg194407 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-08-04 18:32
"Using the same rules as the MS C runtime" means that, given a sequence (list) of arguments, create a string that uses the same quoting that the MS C runtime uses.  That is, if you have a sequence of arguments in a C program, and you want to call another Windows program, you quote that sequence into a string using the same rules that list2cmdline uses.

Can you suggest a wording for the docstring that would make this clearer?
msg194408 - Author: Piotr Dobrogost (piotr.dobrogost) Date: 2013-08-04 18:51
Sure, something like
"The purpose of this function is to construct a string which will later be interpreted by the MS C runtime as denoting a sequence of arguments. Because of this, the string is built in such a way as to preserve the original characters when interpreted by the MS C runtime. For example, a double quotation mark is preceded by a backslash so that the MS C runtime would translate this back to the original double quotation mark according to its rules, which are given below as a reference:"
History
Date                 User             Action  Args
2013-08-04 18:51:09  piotr.dobrogost  set     messages: + msg194408
2013-08-04 18:32:12  r.david.murray   set     messages: + msg194407
2013-08-04 18:26:42  piotr.dobrogost  set     status: open -> closed; resolution: not a bug
2013-08-04 18:25:41  piotr.dobrogost  set     status: closed -> open; resolution: not a bug -> (no value); messages: + msg194405; components: + Library (Lib), - Benchmarks
2013-08-04 16:39:09  r.david.murray   set     messages: + msg194393
2013-08-04 16:21:56  sbt              set     messages: + msg194391
2013-08-04 16:18:29  r.david.murray   set     messages: + msg194390
2013-08-04 16:12:55  r.david.murray   set     status: open -> closed; resolution: not a bug; messages: + msg194389; components: + Benchmarks, - Library (Lib)
2013-08-04 14:39:06  piotr.dobrogost  set     status: closed -> open; resolution: not a bug -> (no value); messages: + msg194381
2013-08-04 13:55:36  sbt              set     status: open -> closed; type: behavior; nosy: + sbt; messages: + msg194376; resolution: not a bug; stage: resolved
2013-08-03 22:00:42  piotr.dobrogost  create