Author evan_
Recipients Andrey.Kislyuk, Gustavo Goretkin, cvrebert, eric.araujo, eric.smith, evan_, ezio.melotti, ned.deily, python-dev, r.david.murray, robodan, vinay.sajip
Date 2017-01-21.02:13:29
Message-id <1484964810.47.0.215851768915.issue28595@psf.upfronthosting.co.za>
Content
Unfortunately, shlex.shlex's defaults will probably stay as they are for a long time, to avoid breaking backwards compatibility. Presumably shlex.split was added so that you don't have to remember to set posix and whitespace_split yourself.
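
As a minimal sketch of that difference, the snippet below compares a default shlex.shlex instance with shlex.split and with a hand-configured lexer. The sample string and variable names are illustrative assumptions, not taken from the issue.

```python
import shlex

# Illustrative input (an assumption, not from the issue).
text = "echo 'hello world' a,b"

# Default shlex.shlex: non-POSIX mode, whitespace_split disabled.
# Quotes are kept in the token and ',' becomes a token of its own.
default_tokens = list(shlex.shlex(text))

# shlex.split: POSIX mode with whitespace_split enabled.
split_tokens = shlex.split(text)

# Roughly the same configuration applied by hand.
lexer = shlex.shlex(text, posix=True)
lexer.whitespace_split = True
manual_tokens = list(lexer)

print(default_tokens)  # ['echo', "'hello world'", 'a', ',', 'b']
print(split_tokens)    # ['echo', 'hello world', 'a,b']
print(manual_tokens)   # ['echo', 'hello world', 'a,b']
```

shlex.split also clears the comment characters unless comments=True is passed, which the hand-configured lexer above does not do; the sample string contains no '#', so the results match here.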

The particular problem I'm addressing in this issue is that the new punctuation_chars argument doesn't currently work with whitespace_split.

>>> import shlex
>>> def split(text, ws=False, pc=False):
...     s = shlex.shlex(text, posix=True, punctuation_chars=pc)
...     s.whitespace_split = ws
...     return list(s)
...
>>> split('foo,bar>baz')
['foo', ',', 'bar', '>', 'baz']
>>> split('foo,bar>baz', ws=True)
['foo,bar>baz']
>>> split('foo,bar>baz', pc=True)
['foo', ',', 'bar', '>', 'baz']
>>> split('foo,bar>baz', ws=True, pc=True)
['foo,bar>baz']

With my patch, the last example outputs ['foo,bar', '>', 'baz'].
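
A sketch of how the combined behavior might be used on a command line with a redirection. It assumes an interpreter in which punctuation_chars and whitespace_split work together (the shlex documentation lists this as a change in Python 3.8); on 3.6 the redirection operator would stay glued to the surrounding word. The helper name and sample command are hypothetical.

```python
import shlex

def split_keeping_operators(text):
    """Hypothetical helper: split on whitespace, but keep shell
    operators such as '>' and '|' as separate tokens."""
    lexer = shlex.shlex(text, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)

# Assumes Python 3.8 or later; see the note above.
tokens = split_keeping_operators("cat a,b.txt>out.log")
print(tokens)  # ['cat', 'a,b.txt', '>', 'out.log']
```

The comma stays inside the word because it is not one of the punctuation characters, while '>' is split out because it is.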

Before the release of 3.6, I argued that punctuation_chars should not augment wordchars at all, since the idea of wordchars is inherently incorrect, as you point out. I now think it's too late to change that, so my patch treats this as a new feature in 3.7.
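
To make the wordchars augmentation concrete, here is a small sketch that inspects the attribute with and without punctuation_chars. The set of extra characters checked is the one listed in the shlex documentation; the sample input and variable names are assumptions.

```python
import shlex

# Characters the shlex docs say are added to wordchars
# when punctuation_chars is set.
extra = set("~-./*?=")

plain = shlex.shlex("a-b")
augmented = shlex.shlex("a-b", punctuation_chars=True)

plain_has_extra = extra & set(plain.wordchars)
augmented_has_all = extra <= set(augmented.wordchars)

plain_tokens = list(plain)          # '-' is not a word character
augmented_tokens = list(augmented)  # '-' is now part of the word

print(plain_tokens)      # ['a', '-', 'b']
print(augmented_tokens)  # ['a-b']
```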