This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Unnecessary complexity in tokenize.py: comments and newlines
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Albert-Jan Nijburg, meador.inge, serhiy.storchaka, terry.reedy
Priority: normal Keywords:

Created on 2017-05-16 13:08 by Albert-Jan Nijburg, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1607 merged python-dev, 2017-05-16 13:21
Messages (9)
msg293760 - (view) Author: Albert-Jan Nijburg (Albert-Jan Nijburg) * Date: 2017-05-16 13:08
While porting tokenize.py to javascript I stumbled upon this. The bit of code that checks if it's a newline or a comment, checks for comment twice. These can be split up, this way the code is a bit more readable. 

https://github.com/python/cpython/blob/master/Lib/tokenize.py#L560

It's not broken, it's just a bit more complex then it has to be.
msg293770 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-16 16:08
The code is correct, it just can be made cleaner. No need to backport the change to other versions.

The line "if line[pos] in '#\r\n':" looks a kind of optimization. In common case (not a newline and not a comment) there is only one check. The expression "(NL, COMMENT)[line[pos] == '#']" is redundant of course, it can be replaced by just "NL". Note that now the line that yields yield TokenInfo(NL, ...) is almost the same for comments and newlines. If rename nl_pos to pos the same line can be used in both cases.
msg293772 - (view) Author: Albert-Jan Nijburg (Albert-Jan Nijburg) * Date: 2017-05-16 16:27
Oh yes you're right! I've updated the code on github. Even cleaner this way :).
msg293841 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-17 06:04
This is simple change, and I would write the same code, but just for the case could you please sign CLA?
msg293844 - (view) Author: Albert-Jan Nijburg (Albert-Jan Nijburg) * Date: 2017-05-17 06:23
I did yesterday, should be coming through today right?
msg293845 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-17 06:35
Don't worry, CLA is accepted manually, and this takes some time.
msg293952 - (view) Author: Albert-Jan Nijburg (Albert-Jan Nijburg) * Date: 2017-05-19 08:14
Still no CLA, I checked my username on the pdf, and it's correct, hope someone looks at it soon :)
msg293973 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-05-19 20:17
This is pycon week, so staff person is busy with that.
msg294346 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-24 11:31
New changeset c471ca448cf336d7eb4a7cbe14d0012baf122d1f by Serhiy Storchaka (Albert-Jan Nijburg) in branch 'master':
bpo-30377: Simplify handling of COMMENT and NL in tokenize.py (#1607)
https://github.com/python/cpython/commit/c471ca448cf336d7eb4a7cbe14d0012baf122d1f
History
Date User Action Args
2022-04-11 14:58:46adminsetgithub: 74562
2017-05-24 11:54:15serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2017-05-24 11:31:59serhiy.storchakasetmessages: + msg294346
2017-05-19 20:17:53terry.reedysetnosy: + terry.reedy

messages: + msg293973
title: Unnecessary complexity in tokenize.py around handling of comments and newlines -> Unnecessary complexity in tokenize.py: comments and newlines
2017-05-19 08:14:19Albert-Jan Nijburgsetmessages: + msg293952
2017-05-17 06:35:25serhiy.storchakasetmessages: + msg293845
2017-05-17 06:23:52Albert-Jan Nijburgsetmessages: + msg293844
2017-05-17 06:04:20serhiy.storchakasetmessages: + msg293841
2017-05-16 16:27:06Albert-Jan Nijburgsetmessages: + msg293772
2017-05-16 16:08:31serhiy.storchakasetassignee: serhiy.storchaka
stage: patch review
2017-05-16 16:08:10serhiy.storchakasetnosy: + serhiy.storchaka

messages: + msg293770
versions: - Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6
2017-05-16 13:21:15python-devsetpull_requests: + pull_request1697
2017-05-16 13:08:09Albert-Jan Nijburgcreate