classification
Title: shlex.shlex.error_leader() reports incorrect line number
Type: behavior Stage: needs patch
Components: Library (Lib) Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, birknilson, ezio.melotti, hoadlck, python-dev, r.david.murray, zmedico
Priority: normal Keywords: patch

Created on 2012-10-03 20:10 by Arfrever, last changed 2016-12-28 09:31 by petri.lehtinen.

Files
File name Uploaded Description Edit
shlex_test.py Arfrever, 2012-10-03 20:10
issue16121.patch birknilson, 2013-02-23 20:54 review
Messages (12)
msg171905 - (view) Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) * (Python triager) Date: 2012-10-03 20:10
shlex.shlex.error_leader() reports incorrect line number with posix=True or when last token is not quoted.
This bug occurs in all versions of Python.

The attached script shows this bug:
$ ./shlex_test.py
### text1 posix=False
('var1', '"None", line 1: ')
('=', '"None", line 1: ')
('"x"', '"None", line 1: ')
('var2', '"None", line 2: ')
('=', '"None", line 2: ')
('"y"', '"None", line 2: ')
('var3', '"None", line 3: ')
('=', '"None", line 3: ')
('"z"', '"None", line 3: ')
### text1 posix=True
('var1', '"None", line 1: ')
('=', '"None", line 1: ')
('x', '"None", line 2: ')
('var2', '"None", line 2: ')
('=', '"None", line 2: ')
('y', '"None", line 3: ')
('var3', '"None", line 3: ')
('=', '"None", line 3: ')
('z', '"None", line 3: ')
### text2 posix=False
('var1', '"None", line 1: ')
('=', '"None", line 1: ')
('x', '"None", line 2: ')
('var2', '"None", line 2: ')
('=', '"None", line 2: ')
('y', '"None", line 3: ')
('var3', '"None", line 3: ')
('=', '"None", line 3: ')
('z', '"None", line 3: ')
### text2 posix=True
('var1', '"None", line 1: ')
('=', '"None", line 1: ')
('x', '"None", line 2: ')
('var2', '"None", line 2: ')
('=', '"None", line 2: ')
('y', '"None", line 3: ')
('var3', '"None", line 3: ')
('=', '"None", line 3: ')
('z', '"None", line 3: ')
### text3 posix=False
('"x"', '"None", line 1: ')
('"y"', '"None", line 2: ')
('"z"', '"None", line 3: ')
### text3 posix=True
('x', '"None", line 2: ')
('y', '"None", line 3: ')
('z', '"None", line 3: ')
### text4 posix=False
('x', '"None", line 2: ')
('y', '"None", line 3: ')
('z', '"None", line 3: ')
### text4 posix=True
('x', '"None", line 2: ')
('y', '"None", line 3: ')
('z', '"None", line 3: ')

Only "text1 posix=False" and "text3 posix=False" have all correct line numbers.
msg182820 - (view) Author: Birk Nilson (birknilson) * Date: 2013-02-23 20:54
The implementation incremented the line number immediately when a newline was detected, even before the token had been processed completely - causing the issue Arfrever posted.

This also caused the unexpected behavior of a tokens line number including the amount of lines within the token itself. In other words '"a \n b \n c" \n d' would result in the token "a \n b \n c" to have the line number #3, followed by "d" being on the expected line #4. In my patch, the expected behavior of seeing the first token on line #1 and the second on #4 is introduced.

I also added the testLineNumbers method in the test suite - included in the patch.
msg182823 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-23 21:15
New changeset e54ee8d2c16b by Petri Lehtinen in branch '2.7':
Issue #16121: Fix line number accounting in shlex
http://hg.python.org/cpython/rev/e54ee8d2c16b

New changeset f1d19fdb254f by Petri Lehtinen in branch '3.2':
Issue #16121: Fix line number accounting in shlex
http://hg.python.org/cpython/rev/f1d19fdb254f

New changeset 560e53fcf2b0 by Petri Lehtinen in branch '3.3':
Issue #16121: Fix line number accounting in shlex
http://hg.python.org/cpython/rev/560e53fcf2b0

New changeset f48c3c7a3205 by Petri Lehtinen in branch 'default':
Issue #16121: Fix line number accounting in shlex
http://hg.python.org/cpython/rev/f48c3c7a3205
msg182824 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2013-02-23 21:16
Applied, thanks!
msg182826 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-02-23 21:58
Something in this change seems to have broken netrc:

http://buildbot.python.org/all/builders/x86%20OpenIndiana%203.3/builds/520
msg182827 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013-02-23 22:15
New changeset 34f759fa5484 by Petri Lehtinen in branch '2.7':
Revert "Issue #16121: Fix line number accounting in shlex"
http://hg.python.org/cpython/rev/34f759fa5484

New changeset cda4a9dc415a by Petri Lehtinen in branch '3.2':
Revert "Issue #16121: Fix line number accounting in shlex"
http://hg.python.org/cpython/rev/cda4a9dc415a

New changeset 15f3fd6070b7 by Petri Lehtinen in branch '3.3':
Revert "Issue #16121: Fix line number accounting in shlex"
http://hg.python.org/cpython/rev/15f3fd6070b7

New changeset 1b0a6c1f8a08 by Petri Lehtinen in branch 'default':
Revert "Issue #16121: Fix line number accounting in shlex"
http://hg.python.org/cpython/rev/1b0a6c1f8a08
msg182828 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2013-02-23 22:16
Reverted the patch because of the broken netrc tests. This has to be investigated further.
msg182833 - (view) Author: Birk Nilson (birknilson) * Date: 2013-02-23 23:30
Sorry about that. Rookie mistake. I'm investigating it now and will get back with a revised & fully tested patch.
msg182857 - (view) Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2013-02-24 08:23
> I'm investigating it now and will get back with a revised &
> fully tested patch.

Sounds good. FWIW, it only broke on 3.x and default branches, and you can probably reproduce it on my own machine, too, by applying the patch and then running test_netrc.

This is what shlex documentation says about lineno:

> shlex.lineno
>     Source line number (count of newlines seen so far plus one).

This is a bit vague, as it doesn't really state whether newlines should be treated "greedily" (consumbed before the next token is read) or "lazily" (consumed only when the next token is read).
msg182906 - (view) Author: Birk Nilson (birknilson) * Date: 2013-02-24 23:55
After investigating the issue I have a couple of proposals.

Although a bit vague, the documentation of shlex.lineo seems to suggest that it should be incremented immediately on finding a newline character. Changing this to allow wrapped lines within a token without incrementing the line number changes the existing shlex API. Something I believe should be avoided since netrc relies on the existing behavior and third-party modules might too.

Instead I recommend the following steps.
Step #1: Fix the immediate issue of getting different line numbers for the same input depending on whether posix=(True|False), but keep the current - greedy - behavior of shlex.lineo.
Step #2: A separate patch introduces shlex.wrapped_lineo which does not increment the lineno immediately, but prior to reading the next token - as introduced in my previous patch.

Step #2 should arguably be introduced in a separate issue - if at all - since it is a new feature to the shlex API.

I will provide a patch for #1 within the next day or two along with one for #2 if you guys think it is a good idea.
msg182907 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-02-25 00:04
Is it necessarily a bug if the behavior is different with posix=True or False?  If I understand correctly, non-posix mode is a backward compatibility mode, and really nobody "should" be using non-posix mode any more :)
msg227697 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-09-27 16:33
I think I'll leave it up to whoever works on this whether they want to tackle making posix mode and non-posix mode return the same values or turn this into an enhancement ticket for the proposed wrapped_lineno.  Or, if no one is interested, we can just close this.  (Or, third option, turn it into a doc issue and document exactly what shlex actually does with lineno).
History
Date User Action Args
2016-12-28 09:31:40petri.lehtinensetnosy: - petri.lehtinen
2016-12-27 13:21:48hoadlcksetnosy: + hoadlck

versions: + Python 3.6
2014-09-27 16:33:29r.david.murraysetstage: commit review -> needs patch
messages: + msg227697
versions: + Python 3.5, - Python 3.2, Python 3.3
2013-02-25 00:04:34r.david.murraysetmessages: + msg182907
2013-02-24 23:55:32birknilsonsetmessages: + msg182906
2013-02-24 08:23:50petri.lehtinensetmessages: + msg182857
2013-02-23 23:30:37birknilsonsetmessages: + msg182833
2013-02-23 22:16:51petri.lehtinensetresolution: fixed ->
messages: + msg182828
stage: resolved -> commit review
2013-02-23 22:15:11python-devsetmessages: + msg182827
2013-02-23 21:58:49r.david.murraysetstatus: closed -> open
nosy: + r.david.murray
messages: + msg182826

2013-02-23 21:16:40petri.lehtinensetstatus: open -> closed

nosy: + petri.lehtinen
messages: + msg182824

resolution: fixed
stage: needs patch -> resolved
2013-02-23 21:15:42python-devsetnosy: + python-dev
messages: + msg182823
2013-02-23 20:54:12birknilsonsetfiles: + issue16121.patch

nosy: + birknilson
messages: + msg182820

keywords: + patch
2012-11-09 14:42:05ezio.melottisetnosy: + ezio.melotti

type: behavior
stage: needs patch
2012-10-03 20:10:18Arfrevercreate