shlex bug?
Created on 2012-10-10 13:47 by jwfang, last changed 2012-10-15 02:57 by roger.serwy. This issue is now closed.

Author: jamesf (jwfang) Date: 2012-10-10 13:47
In [112]: def test_ws(s):
   .....:     sl = shlex.shlex(s)
   .....:     sl.whitespace = "|"
   .....:     sl.whitespace_split = True
   .....:     print list(sl)

In [114]: test_ws("h w")   # works fine
['h w']

In [115]: test_ws("'h' w")  # I expected ["'h' w"] here; why is it split?
["'h'", ' w']
Author: jamesf (jwfang) Date: 2012-10-10 13:51
But if I do this, it works again:

In [121]: test_ws(" 'h' w")  # prepend a whitespace at the beginning
[" 'h' w"]
Author: Roger Serwy (roger.serwy) Date: 2012-10-10 16:43
I verified that the problem also occurs with 3.3 and 3.4.

Adding "sl.posix = True" causes the "h w" test to enter an infinite loop.
Author: Roger Serwy (roger.serwy) Date: 2012-10-11 19:23
The .posix = True bug is a separate issue, now in #16200.
Author: Roger Serwy (roger.serwy) Date: 2012-10-15 02:57
Upon further reading about the non-POSIX mode of shlex, this behavior is not a bug. See the parsing rules in the shlex documentation.

The "'h' w" test case parses correctly according to:
* Closing quotes separate words ("Do"Separate is parsed as "Do" and Separate);

The " 'h' w" test case parses correctly, since the quote is now within a word. The leading space is not recognized as whitespace, as it is not a "|", so it begins a word. This follows the rule:
* Quote characters are not recognized within words (Do"Not"Separate is parsed as the single word Do"Not"Separate);
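
The two rules can be checked directly. This is a minimal sketch, written for Python 3 rather than the 2.x session in the original report; the helper name split_on_pipe is hypothetical and simply repackages test_ws so that it returns the token list instead of printing it. It stays in non-POSIX mode and does not set posix = True.

```python
import shlex


def split_on_pipe(s):
    """Tokenize s in non-POSIX mode, treating only '|' as whitespace.

    Hypothetical helper mirroring test_ws from the report.
    """
    lexer = shlex.shlex(s)          # posix defaults to False
    lexer.whitespace = "|"
    lexer.whitespace_split = True
    return list(lexer)


# Rule 1: a closing quote ends the word, so the quoted part is split off.
print(split_on_pipe("'h' w"))                 # ["'h'", ' w']

# Rule 2: a quote inside a word is not special. The leading space is not
# whitespace here, so it starts a word and the quotes are absorbed into it.
print(split_on_pipe(" 'h' w"))                # [" 'h' w"]

# The documentation's own examples, using the default lexer settings.
print(list(shlex.shlex('"Do"Separate')))      # ['"Do"', 'Separate']
print(list(shlex.shlex('Do"Not"Separate')))   # ['Do"Not"Separate']
```

If a quoted segment at the start of the input must stay attached to what follows, the input has to begin with a non-quote character, as in the " 'h' w" case.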

I'm closing this issue as invalid. Feel free to reopen if there is an error in my analysis.
