
classification
Title: In the re's token example OP and SKIP regexes can be improved
Type: performance
Stage: resolved
Components: Documentation, Regular Expressions
Versions: Python 3.4, Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: rhettinger
Nosy List: docs@python, ezio.melotti, mrabarnett, py.user, python-dev, rhettinger
Priority: low
Keywords: patch

Created on 2014-07-14 07:23 by py.user, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name: re_ex_tok.diff
Uploaded: py.user, 2014-07-14 07:23
Description: clear OP and fast SKIP
Messages (3)
msg223000 - Author: py.user (py.user) Date: 2014-07-14 07:23
https://docs.python.org/3/library/re.html#writing-a-tokenizer

There are redundant escapes in the regex:

('OP',      r'[+*\/\-]'),    # Arithmetic operators

The sequence -+*/ inside the brackets is sufficient, because a hyphen placed first in a character class is taken literally.
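As a quick check of that claim, the following sketch compares the escaped class from the docs with the unescaped form over all ASCII characters. The variable names are illustrative only and do not come from the patch.

```python
import re

escaped = re.compile(r'[+*\/\-]')  # pattern as it appears in the docs example
plain = re.compile(r'[-+*/]')      # unescaped form, hyphen placed first

# Collect every ASCII character that each class accepts.
chars = [chr(i) for i in range(128)]
matched_escaped = {c for c in chars if escaped.fullmatch(c)}
matched_plain = {c for c in chars if plain.fullmatch(c)}

print(sorted(matched_escaped))           # ['*', '+', '-', '/']
print(matched_escaped == matched_plain)  # True
```

Both classes accept exactly the four arithmetic operators, so the escapes change readability only, not behavior.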

The following pattern makes the tokenizer loop run a full iteration for each whitespace character, for example four iterations for four spaces:

('SKIP',    r'[ \t]'),       # Skip over spaces and tabs

The pattern [ \t]+ is faster, because it consumes a whole run of spaces and tabs in a single match.
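A minimal sketch of the difference, counting how many matches each SKIP pattern produces on a short input. The sample text and names are assumptions for illustration, and the count of matches is only a proxy for loop iterations in the docs tokenizer, not a timing measurement.

```python
import re

# Two runs of four spaces each, built explicitly so the count is unambiguous.
text = "x" + " " * 4 + "=" + " " * 4 + "1"

one_at_a_time = re.compile(r'[ \t]')  # original SKIP pattern
whole_run = re.compile(r'[ \t]+')     # proposed SKIP pattern

single = [m.group() for m in one_at_a_time.finditer(text)]
runs = [m.group() for m in whole_run.finditer(text)]

print(len(single))  # 8: one match per space
print(len(runs))    # 2: one match per run of spaces
```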


Patch attached.
msg223003 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2014-07-14 08:38
I will keep the \- because the - at the front of the character range is a non-obvious special case.  The other changes look reasonable.
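The special case mentioned here can be illustrated with a short sketch: a hyphen at the front of a character class is literal, while a hyphen between two characters defines a range. The pattern `[+-/]` below is a hypothetical mistake chosen for illustration; it does not appear in the docs or the patch.

```python
import re

leading = re.compile(r'[-+*/]')  # hyphen first: a literal '-'
middle = re.compile(r'[+-/]')    # hyphen between '+' and '/': the range U+002B..U+002F

print(bool(leading.fullmatch(',')))  # False: ',' is not an operator
print(bool(middle.fullmatch(',')))   # True: ',' (U+002C) falls inside the range
print(bool(middle.fullmatch('*')))   # False: '*' (U+002A) is outside the range
```

Writing `\-` avoids depending on the position of the hyphen, which seems to be the motivation for keeping the escape.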
msg223004 - Author: Roundup Robot (python-dev) (Python triager) Date: 2014-07-14 08:52
New changeset bb28542af060 by Raymond Hettinger in branch '3.4':
Issue 21977:  Minor improvements to the regexes in the tokenizer example.
http://hg.python.org/cpython/rev/bb28542af060
History
Date                 User           Action  Args
2022-04-11 14:58:05  admin          set     github: 66176
2014-07-14 15:23:42  berker.peksag  set     stage: commit review -> resolved
2014-07-14 08:53:36  rhettinger     set     status: open -> closed; resolution: fixed; versions: - Python 2.7
2014-07-14 08:52:51  python-dev     set     nosy: + python-dev; messages: + msg223004
2014-07-14 08:38:21  rhettinger     set     priority: normal -> low; versions: + Python 2.7, Python 3.4; messages: + msg223003; type: enhancement -> performance; stage: commit review
2014-07-14 08:32:21  rhettinger     set     assignee: docs@python -> rhettinger; nosy: + rhettinger
2014-07-14 07:23:21  py.user        create