
classification
Title: shlex bug?
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.3, Python 3.4, Python 2.7

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, jwfang, roger.serwy
Priority: normal
Keywords:

Created on 2012-10-10 13:47 by jwfang, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg172573 - (view) Author: jamesf (jwfang) Date: 2012-10-10 13:47
In [112]: def test_ws(s):
   .....:     sl = shlex.shlex(s)
   .....:     sl.whitespace = "|"
   .....:     sl.whitespace_split = True
   .....:     print(list(sl))  # prints the same list on Python 2.7 and 3.x


In [114]: test_ws("h w")   # works fine
['h w']

In [115]: test_ws("'h' w")  # I expected ["'h' w"] here; why is it split?
["'h'", ' w']
msg172574 - (view) Author: jamesf (jwfang) Date: 2012-10-10 13:51
But if I do this, it works again:

In [121]: test_ws(" 'h' w")  # prepend a space at the beginning
[" 'h' w"]
msg172594 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-10-10 16:43
I verified that the problem also occurs with 3.3 and 3.4.

Adding "sl.posix = True" causes the "h w" test to enter an infinite loop.
msg172681 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-10-11 19:23
The .posix = True bug is a separate issue, now tracked in #16200.
msg172942 - (view) Author: Roger Serwy (roger.serwy) * (Python committer) Date: 2012-10-15 02:57
After reading further about the non-POSIX mode of shlex, I conclude that this behavior is not a bug. See http://docs.python.org/py3k/library/shlex.html?highlight=shlex#parsing-rules

The "'h' w" test case parses correctly according to:
* Closing quotes separate words ("Do"Separate is parsed as "Do" and Separate);

The " 'h' w" test case parses correctly, since the quote is now within a word. The literal space at the beginning is not treated as whitespace, because the only whitespace character configured is "|", so the space starts a word instead. This follows the rule:
* Quote characters are not recognized within words (Do"Not"Separate is parsed as the single word Do"Not"Separate);
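
For anyone who wants to check the two rules side by side, here is a minimal Python 3 sketch. The helper name split_on_pipe is hypothetical (it mirrors test_ws from the report), and the last two calls use a default shlex object with the examples quoted from the docs. It is an illustration of the non-POSIX rules as described above, not a statement about how shlex is implemented.

```python
import shlex


def split_on_pipe(s):
    """Tokenize s in non-POSIX mode with "|" as the only whitespace.

    Hypothetical helper that mirrors test_ws from the original report.
    """
    lexer = shlex.shlex(s)          # posix=False is the default
    lexer.whitespace = "|"          # a space is no longer a separator
    lexer.whitespace_split = True   # split only on whitespace characters
    return list(lexer)


# The token starts with a letter, so the space is accumulated into the word.
print(split_on_pipe("h w"))      # ['h w']

# The token starts with a quote, so the closing quote ends it.
print(split_on_pipe("'h' w"))    # ["'h'", ' w']

# The leading space starts a word, so the quotes inside it are not special.
print(split_on_pipe(" 'h' w"))   # [" 'h' w"]

# The two documentation examples, using a default non-POSIX lexer.
print(list(shlex.shlex('"Do"Separate')))
print(list(shlex.shlex('Do"Not"Separate')))
```

The first three results match the outputs posted earlier in this thread; the only difference between the second and third inputs is whether the quote appears at the start of a token or inside a word that has already begun.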


I'm closing this issue as invalid. Feel free to reopen if there is an error in my analysis.
History
Date | User | Action | Args
2022-04-11 14:57:37 | admin | set | github: 60390
2012-10-15 02:57:59 | roger.serwy | set | status: open -> closed; resolution: not a bug
2012-10-15 02:57:44 | roger.serwy | set | messages: + msg172942
2012-10-11 19:23:10 | roger.serwy | set | messages: + msg172681
2012-10-11 11:49:13 | ezio.melotti | set | nosy: + ezio.melotti; components: + Library (Lib); stage: needs patch
2012-10-10 16:43:09 | roger.serwy | set | nosy: + roger.serwy; messages: + msg172594; versions: + Python 3.3, Python 3.4
2012-10-10 13:51:02 | jwfang | set | messages: + msg172574
2012-10-10 13:47:04 | jwfang | create |