classification
Title: shlex.split() does not tokenize like the shell
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: vinay.sajip Nosy List: Andrey.Kislyuk, cvrebert, eric.araujo, eric.smith, ezio.melotti, python-dev, r.david.murray, robodan, vinay.sajip
Priority: normal Keywords: patch

Created on 2006-07-13 17:44 by robodan, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
ref_shlex.py robodan, 2011-11-25 21:03
test_shlex.diff robodan, 2011-11-25 21:03
changes-tests-docs.diff vinay.sajip, 2012-01-06 18:57 Patch showing changes, tests and docs review
changes-after-feedback.diff vinay.sajip, 2012-02-21 17:16 Changes after feedback from Éric review
changes-after-more-feedback.diff vinay.sajip, 2012-04-25 15:53 Changes following feedback from R. David Murray review
changes-after-yet-more-feedback.diff vinay.sajip, 2012-06-03 17:50 Changes following more feedback from R. David Murray review
incorporating-issue-21999.diff vinay.sajip, 2014-10-01 22:30 review
refresh-2016.diff vinay.sajip, 2016-07-22 11:02 Updated patch for 3.6 and incorporating SilentGhost's comments. review
Repositories containing patches
http://hg.python.org/sandbox/vsajip#fix1521950
Messages (32)
msg60940 - (view) Author: Dan Christian (robodan) Date: 2006-07-13 17:44
When shlex.split defines tokens, it doesn't properly
interpret ';', '&', and '&&'.  These should always be
placed in a separate token (unless inside a string).

The shell treats the following as identical cases, but
shlex.split doesn't:

echo hi&&echo bye
echo hi && echo bye

echo hi;echo bye
echo hi ; echo bye

echo hi&echo bye
echo hi & echo bye

shlex.split makes these cases ambiguous:

echo 'foo&'
echo foo&

echo '&&exit'
echo &&exit
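
The cases above can be reproduced with nothing but the standard library. A minimal sketch, assuming the default behaviour of shlex.split (POSIX mode, splitting on whitespace only); the variable names are just for illustration:

```python
import shlex

# Without surrounding whitespace, the operator stays glued to its neighbours.
glued = shlex.split("echo hi&&echo bye")
spaced = shlex.split("echo hi && echo bye")
print(glued)   # ['echo', 'hi&&echo', 'bye']
print(spaced)  # ['echo', 'hi', '&&', 'echo', 'bye']

# Quoting information is dropped, so these two lines become indistinguishable.
quoted = shlex.split("echo 'foo&'")
unquoted = shlex.split("echo foo&")
print(quoted == unquoted)  # True: both are ['echo', 'foo&']
```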
msg115462 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-03 16:27
Thanks for the report. Would you like to work on a patch, or translate your examples into unit tests?

The docs do not mention “&” at all, and platform discrepancies have to be taken into account too, so I really don’t know if this is a bug fix for the normal mode, the POSIX mode, or a feature request requiring a new argument to the shlex function to preserve compatibility.
msg115482 - (view) Author: Dan Christian (robodan) Date: 2010-09-03 18:42
It's been a while since I looked at this.  I'm not really in a
position to contribute code/tests right now; but I can comment.

I don't think POSIX mode existed when I first reported this, but
that's where it makes sense.  I think all POSIX shells (Bourne, C,
Korn) will behave the same way for the issues mentioned.

There are really two cases in one bug.

The first part is that the shell will split tokens at characters that
shlex doesn't.  The handling of &, |, ;, >, and < could be done by
adjusting the definition of shlex.wordchars.  The shell may also
understand things like: &&, ||, |&, and >&.  The exact definition of
these depends on the shell, so maybe it's best to just split them out
as separate tokens and let the user figure out the compound meanings.

The proper handling of quotes/escapes requires some kind of new
interface.  You need to distinguish between tokens that were modified
by the quote/escape rules and those that were not.  One suggestion is
to add a new method as such:

shlex.get_token2()
   Return a tuple of the token and the original text of the token
(including quotes and escapes).  Otherwise, this is the same as
shlex.get_token().

Comparing the two values for equality (or maybe identity) would tell
you if something special was going on.  You can always pass the second
value to a reconstructed command line without losing any of the
original parsing information.

-Dan
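
One rough way to approximate the proposed get_token2() with the existing module is to tokenize the same line twice, once in POSIX mode and once in non-POSIX mode, since non-POSIX mode keeps the quote characters in the token. This is only a sketch: split_with_raw is a hypothetical helper, not part of shlex, and it gives up when the two modes disagree on the number of tokens (as they do for quotes embedded inside a word).

```python
import shlex

def split_with_raw(line):
    """Hypothetical stand-in for the proposed get_token2(), applied to a whole line."""
    cooked = shlex.split(line, posix=True)   # quotes and escapes processed
    raw = shlex.split(line, posix=False)     # quote characters kept in the token
    if len(cooked) != len(raw):
        # The two modes disagree on lines such as a"b c", so give up there.
        raise ValueError("POSIX and non-POSIX tokenizations differ")
    return list(zip(cooked, raw))

quoted_pairs = split_with_raw("echo 'foo&'")
plain_pairs = split_with_raw("echo foo&")
print(quoted_pairs)  # [('echo', 'echo'), ('foo&', "'foo&'")]
print(plain_pairs)   # [('echo', 'echo'), ('foo&', 'foo&')]
```

A pair whose two values differ marks a token that was changed by the quoting rules.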

msg148272 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-24 16:20
Thanks for the comments.

> There are really two cases in one bug.
> The first part is that the shell will split tokens at characters that shlex doesn't.  The handling
> of &, |, ;, >, and < could be done by adjusting the definition of shlex.wordchars.  The shell may
> also understand things like: &&, ||, |&, and >&.  The exact definition of these depends on the
> shell, so maybe it's best to just split them out as separate tokens and let the user figure out the
> compound meanings.
Yes.  I think that the main use of shlex is really to parse a line into chunks with a way to embed spaces; it’s intended to parse a program command line (“prog --blah "value stillthesamevalue" "arg samearg"”), but not necessarily a full shell line (with & and | and whatnot).  When people have a line containing & and |, then they need a shell to execute it, so they would not call shlex.split but just pass the full line to os.system or subprocess.Popen.  Do you remember what use cases you had when you opened this report?

> The proper handling of quotes/escapes requires some kind of new interface.  You need to distinguish
> between tokens that were modified by the quote/escape rules and those that were not.
I don’t see why I would care about quotes in the result of shlex.split.

See also #7611.
msg148298 - (view) Author: Dan Christian (robodan) Date: 2011-11-25 01:39
Of course, that's how it's used.  That's all it can do right now.

I was splitting and combining commands (using ;, &&, and ||) and
then running the resulting (mega) one liners over ssh.  It still gets
run by a shell, but I was specifying the control flow.

 It's kind of like a makefile command block.  You want to be able to
specify if a failure aborts the sequence, or is ignored (&& vs ;).
Sometimes there are fallback commands (via ||).  Of course, you can
also group using ().

Once things are split properly, then understanding the shell control
characters is straightforward.  In my mind, shlex.split() should
either be as close to shell syntax as possible, or have a clear
explanation of what is different (and why).

I ended up doing my own parsing.  I'm not actually at that company
anymore, so I can't pull up the code.

I'll see if I can come up with a reference case and maybe a unittest
this weekend (that's really the only time I'll have to dig into it).

-Dan
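
The combining half of this use case can be sketched with shlex.quote (available since Python 3.3). The helper name join_commands and the example commands are assumptions made for illustration; the control operator is deliberately left unquoted so that the remote shell interprets it.

```python
import shlex

def join_commands(commands, operator="&&"):
    """Join argument lists into one shell line, quoting each argument.

    `commands` is a list of argv-style lists; `operator` is a shell control
    operator such as '&&', '||' or ';' and is inserted as-is.
    """
    quoted = [" ".join(shlex.quote(arg) for arg in argv) for argv in commands]
    return f" {operator} ".join(quoted)

line = join_commands([["cd", "/tmp/my dir"], ["make", "all"]])
print(line)  # cd '/tmp/my dir' && make all
```

The resulting string could then be passed as the single command argument to ssh.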

msg148338 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-25 17:01
> Of course, that's how it's used.  That's all it can do right now.
:) What I meant is that it is *meant* to be used in this way.

> I was splitting and combining commands (using ;, &&, and ||) and then running the resulting
> (mega) one liners over ssh.  It still gets run by a shell, but I was specifying the control flow.
Thank you for the reply.  It is indeed a valuable use case to pass a command line as one string to ssh, and the split/quote combo should round-trip and be useful for this usage.

> I'll see if I can come up with a reference case and maybe a unittest this weekend
Great!  A new argument (with a default value which gets us the previous behavior) will probably be needed, to preserve backward compatibility.
msg148352 - (view) Author: Dan Christian (robodan) Date: 2011-11-25 19:25
I've attached a diff to test_shlex.py and a script that I used to
verify what the shells actually do.
Both are relative to Python-3.Lib/test

I'm completely ignoring the quotes issue for now.  That should
probably be an enhancement.  I don't think it really matters until the
parsing issues are resolved.

ref_shlex is python 2 syntax.  python -3 shows that it should convert cleanly.
./ref_shlex.py
It will run by default against /bin/*sh
If you don't want that, do something like: export SHELLS='/bin/sh,/bin/csh'
It runs as a unittest.  So you will only see dots if all shells do
what it expects.  Some shells are flaky (e.g. zsh, tcsh), so you may
need to run it multiple times.

Getting this into the mainline will be interesting.  I would think it
would take some community discussion.  I may be able to convince
people that the current behaviour is wrong, but I can't tell you what
will break if it is "fixed".  And should the fix be the default?  As
you mentioned, it depends on what people expect it to do and how it is
currently being used.  I see the first step as presenting a clear case
of how it should work.

-Dan

msg148360 - (view) Author: Dan Christian (robodan) Date: 2011-11-25 21:03
I just realized that I left out a major case.  The shell will also
split ().  I think this is now complete.  If you do "man bash" and
skip down to DEFINITIONS it lists all the control characters.

I've attached updated versions of ref_shlex.py and test_shlex.diff.
They replace the previous ones.

-Dan

msg148405 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-26 14:12
Thanks for the diff and test.  (I removed the older versions; there are “edit” links in the list of files leading to pages where it’s possible to remove them, if one has the required permissions.)

Your script passes with dash, which is probably the most POSIX-compliant shell we can find.  (bash has extensions, zsh/csh don’t use the POSIX shell language, so I think the behavior of dash should be our reference, not the bash man page.)

> I may be able to convince people that the current behaviour is wrong, but I can't tell you what will
> break if it is "fixed".  And should the fix be the default?  As you mentioned, it depends on what
> people expect it to do and how it is currently being used.

python-dev takes compatibility seriously.  Some things are clearly bugs and we fix them, even if it will break buggy code out there.  For example, we recently fixed bugs in HTML parsing: We had a specification to decide that they were really bugs, and we judged that no sane program could be relying on the exact behavior of the parser.  shlex is another case; in my opinion, it’s been used for years to implement parsing similar, but not identical in all cases, to the shell’s, and as there is code out there that depends on the current behavior of shlex and does not need to support && || ; ( ), if we add support for these tokens we should not break the existing code.  Given that we can’t test all programs that use shlex, I think we’ll have to add a new parameter, with a default value which gets us the previous behavior, as I said in my previous message.

(BTW, would you mind editing the quoted section when you reply by email?  Otherwise we get unhelpful, distracting walls of quoted texts.  Thanks in advance.)
msg148410 - (view) Author: Dan Christian (robodan) Date: 2011-11-26 14:40
On Sat, Nov 26, 2011 at 7:12 AM, Éric Araujo <report@bugs.python.org> wrote:
> Your script passes with dash, which is probably the most POSIX-compliant shell we can find.  (bash has extensions, zsh/csh don’t use the POSIX shell language, so I think the behavior of dash should be our reference, not the bash man page.)

I was just looking for a reference where I didn't have to sift through
tons of documentation.  Most systems have bash.  Before that I was
just working from experience (I've done a lot of shell scripting).

> there is code out there that depends on the current behavior of shlex and does not need to support && || ; ( ), if we add support for these tokens we should not break the existing code.

Here's a thought on how that might work (just brainstorming).  shlex
uses a series of character strings to drive its parsing:  whitespace,
escape, quotes.  Add another one: control = '();<>|&'.  If it is unset
(by default?), then the behavior is as before.  If it is set, then
shlex will output any character in control as a separate token.

There might be a shell specific script (or maybe it's left to the
user) that decides that certain tokens can be recombined:  '&&', '||',
'|&', '>>', etc.  This code is pretty simple:  walk the token
sequence, if you see a two token pair, pop the second and combine it
into the first.

-Dan
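
The recombination step described above could look roughly like this. It is a sketch under stated assumptions: recombine and its default compound set are hypothetical, and the input is assumed to be a token list in which every control character has already been split out on its own.

```python
def recombine(tokens, compounds=("&&", "||", "|&", ">>")):
    """Merge adjacent single-character control tokens into compound operators."""
    result = []
    for token in tokens:
        # If the previous token plus this one forms a known compound, merge them.
        if result and result[-1] + token in compounds:
            result[-1] += token
        else:
            result.append(token)
    return result

merged = recombine(["echo", "hi", "&", "&", "echo", "bye"])
print(merged)  # ['echo', 'hi', '&&', 'echo', 'bye']
```

Note that a pass like this cannot tell "& &" from "&&", nor a quoted '&' from an operator, because that information is already gone from the token list.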
msg148413 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011-11-26 15:25
> I was just looking for a reference where I didn't have to sift through tons of documentation.
Sure :)  That’s why I suggest using dash for quick tests and rely on the work of other people who did read the POSIX spec.  I’ll have to check it too before committing a patch.

> shlex uses a series of character strings to drive its parsing:  whitespace, escape, quotes.
> Add another one: control = '();<>|&'.  If it is unset (by default?), then the behavior is as
> before.
So we would need to add a Shlex subclass to the module to provide the new behavior.  I think I prefer a new argument, because we can just extend the existing class and functions instead of adding subtly differing duplicates.

> If it is set, then shlex will output any character in control as a separate token.
Unless it is part of a quoted segment, right?  (See #7611 for 'foo#bar' vs. 'foo #bar').

> There might be a shell specific script (or maybe it's left to the user)
> that decides that certain tokens can be recombined:
That seems like too much complexity.  I'd really prefer that we agree on one command parsing behavior (POSIX, i.e. dash) and improve shlex to support that.  People wanting zsh rules can write their own subclass.

> '&&', '||', '|&', '>>', etc.
Wouldn’t it be more correct to consider them different tokens?  I don’t have formal training in CS or programming, so I’m not sure that my definition is correct at all, but in my mind a token is a unit, and thus & and && are two different things.
msg148417 - (view) Author: Dan Christian (robodan) Date: 2011-11-26 17:46
> Sure :)  That’s why I suggest using dash for quick tests and rely on the work of other people who did read the POSIX spec.  I’ll have to check it too before committing a patch.

The point of ref_shlex.py is that all shells act the same for common
cases and shlex doesn't match any of them.  The only real split is
that csh-based shells do some things differently than sh-based shells
('2>' vs '&>').

>> shlex uses a series of character strings to drive its parsing:  whitespace, escape, quotes.
>> Add another one: control = '();<>|&'.  If it is unset (by default?), then the behavior is as
>> before.
> So we would need to add a Shlex subclass to the module to provide the new behavior.  I think I prefer a new argument, because we can just extend the existing class and functions instead of adding subtly differing duplicates.

You don't have to do a subclass (although that might have some
advantages).  You could do something like:
def shlex(s, comments=False, posix=True, control=False):
...
  if control:
    if control is True:
      self.control = '();<>|&'
    else:
      self.control = control  # let user specify their own control set

>> If it is set, then shlex will output any character in control as a separate token.
> Unless it is part of a quoted segment, right?  (See #7611 for 'foo#bar' vs. 'foo #bar').

Correct, quotes wouldn't change.

>> There might be a shell specific script (or maybe it's left to the user)
>> that decides that certain tokens can be recombined:
> Seems too much complexity.  I really prefer if we agree on one command parsing behavior (POSIX, i.e. dash) and improve shlex to support that.  People wanting zsh rules can write their own subclass.

shlex is a pretty simple lexer (as lexers go), and I wouldn't want it
to get complicated.  It's easier in the current code structure to
split everything and then re-join as needed.  This also allows you to
select sh vs csh joining rules (e.g. '|&' means different things in sh
vs csh).  Every shell that I've seen follows one of those two flavors
for syntax.

>> '&&', '||', '|&', '>>', etc.
> Wouldn’t it be more correct to consider them different tokens?  I don’t have formal training in CS or programming, so I’m not sure that my definition is correct at all, but in my mind a token is a unit, and thus & and && are two different things.

Ideally, the final tokens have exact meanings.  It's easier to write
handler code for '&&' than ('&', '&').  This is just a case of whether
the parser joins them together or it's done in a second step.  The
current code doesn't do much look ahead, so it's hard for the lexer to
produce things like '&&' directly.

-Dan
msg150761 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-01-06 18:57
I've made a patch which implements this functionality, together with docs and tests. Please review.
msg153823 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-21 00:10
This time you should have received an email from Rietveld, I made sure that your ID was expanded to an email address.

I like all the suggestions you made in reply to my comments.
msg153882 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-02-21 17:27
I updated the patch to reflect Éric's comments on Rietveld, but there are also some other changes:

Previously when punctuation chars were set, wordchars was being augmented by '-'. This was incomplete, so the augmentation is now with '~-./*?=' which allows for wildcards, filename chars and argument flags.

I added a token_type attribute whose value is 'a' for alphanumeric tokens and 'c' for punctuation tokens. This token type is internally tracked anyway - we just expose it now. It is needed for when multiple punctuation tokens need to be disambiguated, because we might return two logically separate punctuation tokens as one if they are not separated by whitespace in the source being tokenised.

New attributes and the changes to wordchars have been documented, and a test added for token_type return values.
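
For reference, in the shlex module as shipped since Python 3.6 this behaviour is requested through the punctuation_chars argument of shlex.shlex. The sketch below only uses that argument (it does not rely on token_type); the input strings are illustrative, and the printed results assume Python 3.6 or later.

```python
import shlex

# Runs of punctuation characters come out as their own tokens.
lexer = shlex.shlex("echo hi&&echo bye", posix=True, punctuation_chars=True)
tokens = list(lexer)
print(tokens)  # ['echo', 'hi', '&&', 'echo', 'bye']

# With punctuation_chars set, '~-./*?=' count as word characters, so paths,
# wildcards and option flags stay in one piece.
lexer = shlex.shlex("~/a && b-c --color=auto || d *.py?", punctuation_chars=True)
words = list(lexer)
print(words)  # ['~/a', '&&', 'b-c', '--color=auto', '||', 'd', '*.py?']

# Adjacent punctuation is returned as a single run, even when the shell would
# treat it as two separate operators.
lexer = shlex.shlex("(a)&&b", posix=True, punctuation_chars=True)
run = list(lexer)
print(run)  # ['(', 'a', ')&&', 'b']
```

The last case is the ambiguity mentioned above: callers still have to split a run such as ')&&' themselves.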
msg153883 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-02-21 17:37
Plus I also changed a few instances of the anachronism

a = a + b

to

a += b
msg154031 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-23 01:33
> Previously when punctuation chars were set, wordchars was being augmented by '-'. This was
> incomplete, so the augmentation is now with '~-./*?=' which allows for wildcards, filename
> chars and argument flags.
I did not fully get what you meant here, but the example you added to the doc made it clear.  Is this covered by tests?

Overall great patch!  Dan, do you have time to test it (or read the new examples in the patch) to tell us if it meets what you wanted?
msg154056 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-02-23 10:08
>Éric Araujo <merwok@netwok.org> added the comment:

>I did not fully get what you meant here, but the example you added to the doc made it clear.  Is this covered by tests?

Yes, I believe that testSyntaxSplitCustom covers this.

>Overall great patch!  Dan, do you have time to test it (or read the new examples in the patch) to tell us if it meets what you wanted?

Thanks! It was a bit fiddly, shlex is somewhat difficult to extend cleanly. I developed this functionality for a subprocess ease-of-use-wrapper module called sarge, and I had to basically copy and modify the whole read_token method :-(
msg154064 - (view) Author: Dan Christian (robodan) Date: 2012-02-23 14:47
I haven't been following this much.  Sorry.  My day job isn't in this area any more (and I'm stuck using 2.4 :-().

Looking at the docs, I notice the "old" is different from what it used to be.  Notably: 'e;' gets split into two tokens; and ">'abc';" gets split into 3.  I'm pretty sure that baseline code doesn't split those at all.  So there is a question of whether "old" is fully backward compatible.

The "new" functionality looks great.  That's what I was looking for when I filed the bug.

Thank you!
-Dan
msg158932 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-04-21 23:04
I've received no comments on the latest revision of my patch (incorporating comments on the previous version); is it OK to commit this?
msg158934 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-04-21 23:15
I'd like to take a look at this (I wasn't aware of it before).  I'll try to do that some time in the next 24 hours, and if I don't you shouldn't wait for me :)

Did you address Dan's concern about 'old' possibly not matching the old behavior completely?
msg158956 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-04-22 11:50
I believe Dan meant that the behaviour of shlex.split() now is different from what it was when he first raised the issue (in July 2006). Looking at the default branch of CPython, this is what I see:

Python 3.3.0a2+ (default:ff6593aa8376, Apr 22 2012, 12:39:08) 
[GCC 4.3.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> list(shlex.shlex('e;'))
['e', ';']
>>> list(shlex.shlex(">'abc';"))
['>', "'abc'", ';']

Likewise, on the 2.6 branch:

Python 2.6.8+ (unknown, Apr 22 2012, 12:44:43) 
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> list(shlex.shlex('e;'))
['e', ';']
>>> list(shlex.shlex(">'abc';"))
['>', "'abc'", ';']

So it would seem Dan is saying that shlex behaviour (before my patch is applied) is now different from how he remembers it - not that the patch introduces any incompatibility.

Still, another set of eyeballs on the patch would be good.
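
Part of the confusion may come from the two entry points tokenizing differently. A small sketch comparing them on the same inputs, assuming default settings for both: iterating a shlex.shlex instance (non-POSIX, whitespace_split off) versus calling shlex.split (POSIX, whitespace_split on).

```python
import shlex

# Iterating a default shlex instance: non-word characters become their own
# tokens and quotes are kept.
lexed_e = list(shlex.shlex("e;"))
lexed_abc = list(shlex.shlex(">'abc';"))
print(lexed_e)    # ['e', ';']
print(lexed_abc)  # ['>', "'abc'", ';']

# shlex.split: only whitespace separates tokens and quotes are stripped.
split_e = shlex.split("e;")
split_abc = shlex.split(">'abc';")
print(split_e)    # ['e;']
print(split_abc)  # ['>abc;']
```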
msg159016 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-04-23 13:48
> I'd like to take a look at this (I wasn't aware of it before).
Are you interested in shlex in general or only this bug?  If the former, then I’ll try to remember to make you nosy on future issues.

BTW, what is the shlex unicode bug you mentioned a few times on Rietveld?  The one I know is fixed now.
msg159019 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-04-23 14:04
I am interested in shell stuff in general.

The unicode bug is issue 1170.
msg162157 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2012-06-02 17:43
I've updated the patch following comments by RDM - it probably could do with a code review (now that I've addressed RDM's comments on the docs).
msg162170 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-06-02 20:01
Review, including a code-but-not-algorithm review :), posted.
msg207023 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2013-12-28 08:33
Let's hope we can get this into 3.5. I updated my patch a while ago to address RDM's comments.
msg266823 - (view) Author: Andrey Kislyuk (Andrey.Kislyuk) * Date: 2016-06-01 16:35
Is there any chance of getting this into 3.6? We are still in a situation where the shlex module misleads developers into believing that it has functionality to parse things the way the shell does. I've had to vendor the copy of shlex with patches from this bug applied (thanks Vinay!)
msg270949 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2016-07-21 17:41
This has been knocking around since 3.3, but never got enough attention to make it in. Barring objections from anyone, I'd like to commit this patch once I check that it applies cleanly against 3.6, before we get into 3.6 beta.
msg270952 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-07-21 18:12
No objection from me.  I'm not likely to have the time to give it the kind of thorough review I'd *like* to, but I don't think it is really needed.
msg271456 - (view) Author: Vinay Sajip (vinay.sajip) * (Python committer) Date: 2016-07-27 14:31
Okay, I've updated with a new patch addressing SilentGhost's comments, and addressed the comments on that patch. If I don't hear any objections by Friday, I plan to commit this change.
msg271651 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-07-29 21:35
New changeset ea99e2f0b829 by Vinay Sajip in branch 'default':
Closes #1521950: Made shlex parsing more shell-like.
https://hg.python.org/cpython/rev/ea99e2f0b829
History
Date User Action Args
2022-04-11 14:56:18 admin set github: 43667
2019-01-04 17:25:03 cheryl.sabella link issue19120 superseder
2016-07-29 21:35:50 python-dev set status: open -> closed
nosy: + python-dev
messages: + msg271651
resolution: fixed
stage: patch review -> resolved
2016-07-27 14:33:00 vinay.sajip set assignee: vinay.sajip
2016-07-27 14:31:28 vinay.sajip set messages: + msg271456
2016-07-22 11:02:15 vinay.sajip set files: + refresh-2016.diff
2016-07-21 18:12:36 r.david.murray set messages: + msg270952
2016-07-21 17:41:13 vinay.sajip set messages: + msg270949
versions: + Python 3.6, - Python 3.5
2016-06-01 16:35:40 Andrey.Kislyuk set nosy: + Andrey.Kislyuk
messages: + msg266823
2014-10-01 22:30:04 vinay.sajip set files: + incorporating-issue-21999.diff
2014-05-16 05:51:44 cvrebert set nosy: + cvrebert
2013-12-28 08:33:20 vinay.sajip set messages: + msg207023
versions: + Python 3.5, - Python 3.4
2012-08-01 22:24:04 vinay.sajip set versions: + Python 3.4, - Python 3.3
2012-06-03 17:50:37 vinay.sajip set files: + changes-after-yet-more-feedback.diff
2012-06-02 20:01:10 r.david.murray set messages: + msg162170
2012-06-02 17:43:58 vinay.sajip set messages: + msg162157
2012-04-27 07:03:39 ezio.melotti set nosy: + ezio.melotti
2012-04-25 15:53:48 vinay.sajip set files: + changes-after-more-feedback.diff
2012-04-23 14:04:36 r.david.murray set messages: + msg159019
2012-04-23 13:48:34 eric.araujo set messages: + msg159016
2012-04-22 11:50:45 vinay.sajip set messages: + msg158956
2012-04-21 23:15:08 r.david.murray set nosy: + r.david.murray
messages: + msg158934
2012-04-21 23:04:31 vinay.sajip set messages: + msg158932
2012-02-23 14:47:35 robodan set messages: + msg154064
2012-02-23 10:08:33 vinay.sajip set messages: + msg154056
2012-02-23 01:40:55 niemeyer set nosy: - niemeyer
2012-02-23 01:33:43 eric.araujo set messages: + msg154031
2012-02-21 17:37:03 vinay.sajip set messages: + msg153883
2012-02-21 17:27:35 vinay.sajip set messages: + msg153882
2012-02-21 17:16:55 vinay.sajip set files: + changes-after-feedback.diff
2012-02-21 00:10:25 eric.araujo set messages: + msg153823
2012-01-06 18:57:51 vinay.sajip set files: + changes-tests-docs.diff
2012-01-06 18:57:03 vinay.sajip set nosy: + vinay.sajip
messages: + msg150761
hgrepos: + hgrepo99
stage: test needed -> patch review
2011-11-26 17:46:37 robodan set messages: + msg148417
2011-11-26 15:25:00 eric.araujo set messages: + msg148413
2011-11-26 14:40:23 robodan set messages: + msg148410
2011-11-26 14:12:54 eric.araujo set messages: + msg148405
2011-11-26 13:56:27 eric.araujo set files: - test_shlex.diff
2011-11-26 13:56:25 eric.araujo set files: - ref_shlex.py
2011-11-25 21:03:38 robodan set files: + ref_shlex.py, test_shlex.diff
messages: + msg148360
2011-11-25 19:25:34 robodan set files: + ref_shlex.py, test_shlex.diff
keywords: + patch
messages: + msg148352
2011-11-25 17:01:19 eric.araujo set nosy: + niemeyer
messages: + msg148338
versions: + Python 3.3, - Python 3.2
2011-11-25 01:39:40 robodan set messages: + msg148298
2011-11-24 16:20:41 eric.araujo set messages: + msg148272
2010-10-15 16:13:04 georg.brandl unlink issue1699594 dependencies
2010-09-03 18:42:27 robodan set messages: + msg115482
2010-09-03 16:27:33 eric.araujo set nosy: + eric.araujo, eric.smith
messages: + msg115462
2010-07-10 09:42:54 BreamoreBoy set versions: + Python 3.2, - Python 2.7
2009-03-30 17:04:08 ajaksu2 link issue1699594 dependencies
2009-03-30 05:06:57 ajaksu2 set stage: test needed
type: enhancement
versions: + Python 2.7, - Python 2.3
2006-07-13 17:44:33 robodan create