Issue 19120: shlex.shlex.lineno reports a different number depending on the previous token

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/63319

classification

Title:	shlex.shlex.lineno reports a different number depending on the previous token
Type:	behavior	Stage:	resolved
Components:		Versions:	Python 3.6, Python 3.2, Python 3.3, Python 2.7, Python 2.6

process

Status:	closed	Resolution:	duplicate
Dependencies:		Superseder:	shlex.split() does not tokenize like the shell (issue 1521950)
Assigned To:		Nosy List:	cheryl.sabella, daniel-s, hoadlck
Priority:	normal	Keywords:

Created on 2013-09-29 02:35 by daniel-s, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name	Uploaded	Description
shlex_line.py	daniel-s, 2013-09-29 02:35	The code example from the comment.

Messages (3)
msg198561	Author: Daniel (daniel-s)	Date: 2013-09-29 02:35
See the example below (also attached).

First example: The lineno reported just after "word2" is pulled is 2.
Second example: The lineno reported just after "," is pulled is still 1.

This behaviour seems inconsistent. The lineno should increment either when the last token of a line is pulled, or after the first token from the next line (in my opinion preferably the former). It should not have different behaviour depending on what type of token that is (alpha vs. comma).

I have repeated this on

Also, does Issue 16121 relate to this?

#!/usr/bin/env python
import shlex

first = shlex.shlex("word1 word2\nword3")
print (first.get_token())
print (first.get_token())
print ("line no", first.lineno)
print ("")

second = shlex.shlex("word1 word2,\nword3")
print (second.get_token())
print (second.get_token())
print (second.get_token())
print ("line no", second.lineno)

# Output:
# word1
# word2
# line no 2
#
# word1
# word2
# ,
# line no 1
msg198562	Author: Daniel (daniel-s)	Date: 2013-09-29 02:38
From the unfinished sentence: I have repeated this on all versions of shlex that I have tried, including Python 2.6, 2.7, 3.2 and 3.3.
msg332986	Author: Cheryl Sabella (cheryl.sabella) *	Date: 2019-01-04 17:25
There was a parameter `punctuation_chars` added to the shlex.shlex class with issue 1521950 (implemented for 3.6). Although the comma is not one of the default punctuation characters (setting the parameter to punctuation_chars=True won't change the behavior), you can use `punctuation_chars=","` to see the results reported in this issue.

>>> second = shlex.shlex('word1 word2,\nword3', punctuation_chars=',')
>>> second.get_token()
'word1'
>>> second.lineno
1
>>> second.get_token()
'word2'
>>> second.lineno
1
>>> second.get_token()
','
>>> second.lineno
2
>>>

Closing this as a duplicate of #1521950.
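
The three cases discussed in this thread can be compared side by side with a short sketch. It assumes Python 3.6 or later (for `punctuation_chars`), and `tokens_with_lineno` is a hypothetical helper written for this illustration, not part of shlex. As far as I can tell from the observed results, lineno changes when the lexer actually reads the newline character, so the number seen after a token depends on whether that token was ended by the newline itself or by a character that was pushed back.

```python
import shlex


def tokens_with_lineno(text, **shlex_kwargs):
    """Return (token, lineno) pairs, sampling lineno right after each token.

    Hypothetical helper for illustration; not part of the shlex module.
    """
    lexer = shlex.shlex(text, **shlex_kwargs)
    pairs = []
    while True:
        token = lexer.get_token()
        if token == lexer.eof:  # '' in the default non-POSIX mode
            break
        pairs.append((token, lexer.lineno))
    return pairs


# "word2" is ended by the newline, so the newline is consumed with it.
print(tokens_with_lineno("word1 word2\nword3"))

# "word2" is ended by the comma; the newline has not been read yet
# when "," is returned.
print(tokens_with_lineno("word1 word2,\nword3"))

# With "," declared as punctuation (3.6+), the newline is read while
# the "," token is being completed.
print(tokens_with_lineno("word1 word2,\nword3", punctuation_chars=","))
```

Sampling lineno immediately after each get_token() call, as done here, is what both the original report and the interactive session above do by hand.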

History
Date	User	Action	Args
2022-04-11 14:57:51	admin	set	github: 63319
2019-01-04 17:25:03	cheryl.sabella	set	status: open -> closed; superseder: shlex.split() does not tokenize like the shell; nosy: + cheryl.sabella; messages: + msg332986; resolution: duplicate; stage: resolved
2016-12-27 13:26:30	hoadlck	set	nosy: + hoadlck; versions: + Python 3.6
2013-09-29 02:38:09	daniel-s	set	messages: + msg198562
2013-09-29 02:35:13	daniel-s	create