classification
Title: Add shlex.quote
Type: enhancement Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: eric.araujo Nosy List: anacrolix, brandon-rhodes, eric.araujo, eric.smith, ezio.melotti, georg.brandl, haypo, python-dev, r.david.murray, xuanji
Priority: normal Keywords: patch

Created on 2010-08-31 13:40 by brandon-rhodes, last changed 2012-04-18 17:13 by eric.araujo. This issue is now closed.

Files
File name Uploaded Description Edit
shlex.quote.diff eric.araujo, 2011-07-18 16:22 review
Messages (32)
msg115263 - (view) Author: Brandon Rhodes (brandon-rhodes) * Date: 2010-08-31 13:40
The only way to safely build shell command lines from inside of Python — which is necessary when sending commands across SSH, since that behaves like os.system() rather than like subprocess.call() — is to use the wonderful pipes.call() method to turn possibly-dangerous arguments, like filenames that might have spaces, special characters, and embedded "rm -r" calls, into perfectly quoted strings for an "sh"-like shell (say, bash or zsh).

This call is already recommended on mailing lists, blog posts, and Stack Overflow — and since it doesn't start with a "_", I think its public use is fair game. But the "pipes" documentation itself doesn't officially mention or support it. I think it should be added to the Standard Library documentation for "pipes". So. Yeah.
msg115270 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2010-08-31 17:38
I think you mean pipe.quote in your message, not pipe.call. The subject looks correct.

I'm not sure pipes is the best place for this, but I agree it should probably be documented in older versions. It seems to me we've had this discussion before about quoting command lines, how it applies differently between Windows and various shells, and which functions to expose. But having said that, I can't find a previous issue that discusses it.

Not the least of my concerns is that pipes says it's available on Unix systems, despite the fact that I have it on a Windows machine. And I might need the functionality of passing a bash command from a Windows machine to a Unix machine, so we definitely need this cross platform.
msg115463 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-03 16:33
Eric referred to this thread: http://mail.python.org/pipermail/stdlib-sig/2010-May/thread.html
msg126674 - (view) Author: Matt Joiner (anacrolix) Date: 2011-01-21 01:32
I agree, I discovered this function (pipes.quote) only through recommendation here:

http://stackoverflow.com/questions/4748344/whats-the-reverse-of-shlex-split

I suggest that it be added to shlex, perhaps as shlex.quote. While the quoting style it performs, and the module it's found in are specific to Unix tools, the shlex module is available on non-Unix platforms. To this end, adding the function to shlex would make it available elsewhere, as well as be appropriately shell-related.
msg126768 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-21 18:51
Why do you want to move quote from pipes to shlex?  The function is available, the issue here is lack of documentation.
msg126830 - (view) Author: Matt Joiner (anacrolix) Date: 2011-01-22 11:01
Two reasons: The pipes module is Unix only, but pipes.quote is useful on all platforms. Secondly pipes.quote pertains to shell command-lines, this is also the domain of shlex which already cross platform. In pipes, an import shlex.quote would more than sufficient.

If this belongs in another separate bug I shall submit one. Please advise.
msg126850 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-22 18:42
Even if quote does not start with an underscore, its absence from the docs and from the module’s __all__ make it a private function.

I propose to make it public as shlex.quote in 3.3 and deprecate pipes.quote for people that relied on it (i.e. move code from pipes/test_pipes to shlex/test_shlex, add doc, write a pipes.quote function that sends a DeprecationWarning and calls shlex.quote).
msg126863 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-22 23:03
Rather than doing a code deprecation you can just do 'from shlex import quote' in pipes (with a comment about backward compatibility).  I don't think there is any real harm in leaving that kind of backward compatibility in place indefinitely.
msg126936 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-01-24 17:19
> you can just do 'from shlex import quote' in pipes

We would have had to do that anyway, since pipes needs to use the function.  Agreed that a deprecation is not necessary.
msg126939 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-01-24 18:03
Yes, I know you have to do it anyway, that's why I used the adverb 'just' (as in 'just that, and nothing more')  (I note that 'just' as an adverb tends to get overused by English speakers, and so loses some of its semantic value... :)
msg126950 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-01-24 19:55
Putting quote() into shlex sounds good to me.
msg127957 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-02-05 00:57
A note from Ian Bicking in http://mail.python.org/pipermail/stdlib-sig/2010-May/000948.html:

> I've had to do this sort of thing for similar reasons, like calling remote
> ssh commands, where a shell command is embedded in a positional argument.  I
> implemented my own thing, but the way pipes.quote works would have been
> nicer (the implementation could use a couple regexes instead of iterating
> over strings though).
msg134638 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-04-28 03:04
Someone on Stack Overflow suggested subprocess.list2cmdline.  It’s one of those functions undocumented and absent from __all__ that somehow get found and used by some people.  So when we make the patch, let’s include links to the new function doc from subprocess and pipes, and reply to the Stack Overflow thread.
msg134834 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-04-30 02:54
See also #11827 about subprocess.list2cmdline.
msg140601 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-18 16:22
Here’s the patch, please review.
msg141350 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-29 12:35
New changeset 5966eeb0457d by Éric Araujo in branch 'default':
Add shlex.quote function, to escape filenames and command lines (#9723).
http://hg.python.org/cpython/rev/5966eeb0457d
msg141368 - (view) Author: Roundup Robot (python-dev) Date: 2011-07-29 13:10
New changeset 43c41e19527a by Éric Araujo in branch 'default':
Expand shlex.quote example (#9723)
http://hg.python.org/cpython/rev/43c41e19527a
msg141374 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-29 13:35
I suck at regexes, but the one I came up with lets quote pass the tests that existed for pipes.quote, so I figure it’s good.  I did not change the string operations at the end of the function (+ and replace) as it was simple and working.

In the absence of review or opposition here, I have committed my patch.  Feedback welcome.
msg141379 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-29 13:47
Sorry I didn't look at the patch earlier.  I thought we were just *moving* pipes.quote.  Why is a new regex involved?
msg141380 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-29 13:51
Because I choose to follow Ian’s remark in msg127957:
> the implementation could use a couple regexes instead of iterating
> over strings though

I have not touched the tests, so I felt confident with my regex.  FWIW, I’ve also changed the argument name, since the function is not limited to file names (as I hopefully made clear in the doc).
msg141383 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-29 14:44
Well, it's a micro-optimization (it would be interesting to benchmark, but not worth it).  I find the original code much more readable than the regex, but it doesn't matter all that much.  (And as far as optimization goes, using translate might be even faster.)
msg141386 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-29 15:11
str.translate is not an option, as the code does not replace but add characters (quotes).

Out of sheer curiosity, I may actually benchmark this.
msg141387 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-29 15:20
You aren't using a regex to replace the quotes, either.

>>> len('abcd'.translate(str.maketrans('', '', string.ascii_letters ))) > 0
False

I don't know if this is faster than the corresponding search regex, but depending on how much overhead regex has, it might be.  Note that the translate table can be pre-built, just like the compiled regex.
msg141430 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-07-30 07:20
+_find_unsafe = re.compile(r'[^\w\d@%_\-\+=:,\./]').search

\w already includes both \d and _, so (unless you really want to be explicit about it) they are redundant.  Also keep in mind that they match non-ASCII letters/numbers on Python 3.
'+' and '.' don't need to be escaped in a character class (i.e. [...]), because they lose their meta-characters meaning there.
'-' is correctly escaped there, but it's common practice to place it at the end of the character class, where it doesn't need escaping.

r'[^\w\d@%_\-\+=:,\./]' and r'[^\w@%+=:,./-]' should therefore be equivalent.
msg141439 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-07-30 13:16
> \w already includes both \d and _, so (unless you really want to be
> explicit about it) they are redundant. [snip]
I just started from the previous list and turned it into a regex.  Eliminating redundancy sounds good, I’ll use your version.

> Also keep in mind that they match non-ASCII letters/numbers on
> Python 3.
This was not covered by the previous tests, but I think it’s fine: the shell is concerned about some metacharacters and spaces, not Unicode letters.  To be 100% sure, I will add tests with Unicode characters for pipes.quote in 3.2, and when merging with 3.3 I’ll know if the new code still complies.
msg141963 - (view) Author: Roundup Robot (python-dev) Date: 2011-08-12 16:03
New changeset 8032ea4c3619 by Éric Araujo in branch '3.2':
Test pipes.quote with a few non-ASCII characters (see #9723).
http://hg.python.org/cpython/rev/8032ea4c3619

New changeset 6ae0345a7e29 by Éric Araujo in branch 'default':
Avoid unwanted behavior change in shlex.quote (see #9723).
http://hg.python.org/cpython/rev/6ae0345a7e29
msg141964 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-08-12 16:05
I have restored compatibility (see commit messages).
msg141997 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-08-12 23:21
-_find_unsafe = re.compile(r'[^\w\d@%_\-\+=:,\./]').search
+_find_unsafe = re.compile(r'[^\w@%\-\+=:,\./]', re.ASCII).search

FWIW there are still unnecessary escapes before '+' and '.', and possibly '-' ('-' doesn't need escaping only when it's at the end (or beginning) of the regex).
msg142006 - (view) Author: Matt Joiner (anacrolix) Date: 2011-08-13 03:00
Why can't pipes.quote can't be moved to shlex.quote verbatim as I originally proposed?

Is there justification to also change it as part of the relocation? I think any changes to its behaviour should be a separate issue.
msg142203 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-08-16 15:54
> FWIW there are still unnecessary escapes before '+' and '.', and
> possibly '-'

This is IMO cosmetic and not as “important” as the duplicate characters already implied by the character class.  Feel free to change it.

> Why can't pipes.quote can't be moved to shlex.quote verbatim as I
> originally proposed?

I took the opportunity of changing some convoluted code.
msg142204 - (view) Author: Roundup Robot (python-dev) Date: 2011-08-16 16:03
New changeset 5d4438001069 by Ezio Melotti in branch 'default':
#9723: refactor regex.
http://hg.python.org/cpython/rev/5d4438001069
msg158642 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-04-18 17:13
See #14616 for a doc edition related to this.
History
Date User Action Args
2012-04-18 17:13:06eric.araujosetmessages: + msg158642
2011-08-16 16:03:54python-devsetmessages: + msg142204
2011-08-16 15:54:15eric.araujosetmessages: + msg142203
2011-08-13 03:00:36anacrolixsetmessages: + msg142006
2011-08-12 23:21:01ezio.melottisetmessages: + msg141997
2011-08-12 16:05:02eric.araujosetstatus: open -> closed

messages: + msg141964
stage: test needed -> committed/rejected
2011-08-12 16:03:54python-devsetmessages: + msg141963
2011-07-30 13:16:21eric.araujosetstatus: closed -> open

messages: + msg141439
stage: committed/rejected -> test needed
2011-07-30 07:20:17ezio.melottisetmessages: + msg141430
2011-07-29 15:20:13r.david.murraysetmessages: + msg141387
2011-07-29 15:11:49eric.araujosetmessages: + msg141386
2011-07-29 14:44:25r.david.murraysetmessages: + msg141383
2011-07-29 13:51:58eric.araujosetmessages: + msg141380
2011-07-29 13:47:49r.david.murraysetmessages: + msg141379
2011-07-29 13:35:57eric.araujosetstatus: open -> closed
resolution: fixed
messages: + msg141374

stage: patch review -> committed/rejected
2011-07-29 13:10:19python-devsetmessages: + msg141368
2011-07-29 12:35:19python-devsetnosy: + python-dev
messages: + msg141350
2011-07-18 16:22:15eric.araujosetfiles: + shlex.quote.diff
messages: + msg140601

assignee: eric.araujo
keywords: + patch
stage: needs patch -> patch review
2011-04-30 02:54:16ezio.melottisetnosy: + ezio.melotti
messages: + msg134834
2011-04-28 03:04:02eric.araujosetmessages: + msg134638
2011-04-27 15:30:17xuanjisetnosy: + xuanji
2011-03-03 14:36:10hayposetnosy: + haypo
2011-02-05 00:57:18eric.araujosetnosy: georg.brandl, eric.smith, eric.araujo, r.david.murray, brandon-rhodes, anacrolix
messages: + msg127957
2011-01-24 19:55:14georg.brandlsetnosy: + georg.brandl
messages: + msg126950
2011-01-24 18:03:53r.david.murraysetnosy: eric.smith, eric.araujo, r.david.murray, brandon-rhodes, anacrolix
messages: + msg126939
2011-01-24 17:19:16eric.araujosetnosy: eric.smith, eric.araujo, r.david.murray, brandon-rhodes, anacrolix
messages: + msg126936
2011-01-22 23:03:20r.david.murraysetnosy: + r.david.murray
messages: + msg126863
2011-01-22 18:42:03eric.araujosetassignee: docs@python -> (no value)
type: enhancement

components: + Library (Lib), - Documentation
title: pipes.quote() needs to be documented -> Add shlex.quote
nosy: - docs@python
versions: + Python 3.3, - Python 3.1, Python 2.7, Python 3.2
messages: + msg126850
2011-01-22 11:01:36anacrolixsetnosy: eric.smith, eric.araujo, brandon-rhodes, anacrolix, docs@python
messages: + msg126830
2011-01-21 18:52:38eric.araujosetnosy: eric.smith, eric.araujo, brandon-rhodes, anacrolix, docs@python
stage: test needed -> needs patch
2011-01-21 18:51:09eric.araujosetnosy: eric.smith, eric.araujo, brandon-rhodes, anacrolix, docs@python
type: enhancement -> (no value)
messages: + msg126768
stage: test needed
2011-01-21 01:32:44anacrolixsetnosy: + anacrolix
messages: + msg126674
2010-09-03 16:33:03eric.araujosetnosy: + eric.araujo

messages: + msg115463
versions: - Python 2.6, Python 2.5, Python 3.3
2010-08-31 17:38:41eric.smithsetnosy: + eric.smith
messages: + msg115270
2010-08-31 13:40:13brandon-rhodescreate