
Author takluyver
Recipients takluyver
Date 2013-01-28.11:14:28
Message-id <1359371668.59.0.445577508897.issue17061@psf.upfronthosting.co.za>
In-reply-to
Content
The docs describe the NL token as "Token value used to indicate a non-terminating newline. The NEWLINE token indicates the end of a logical line of Python code; NL tokens are generated when a logical line of code is continued over multiple physical lines."

However, after a comment or a blank line, tokenize emits NL, even when it's not inside a multi-line statement. For example:

In [15]: for tok in tokenize.generate_tokens(StringIO('#comment\n').readline):  print(tok)
TokenInfo(type=54 (COMMENT), string='#comment', start=(1, 0), end=(1, 8), line='#comment\n')
TokenInfo(type=55 (NL), string='\n', start=(1, 8), end=(1, 9), line='#comment\n')
TokenInfo(type=0 (ENDMARKER), string='', start=(2, 0), end=(2, 0), line='')
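
The same observation can be reproduced outside IPython with a short script. This is only a sketch: the helper name token_names is made up for illustration, and it prints token type names via tokenize.tok_name rather than the numeric types, since the numbers differ between Python versions.

```python
import io
import tokenize

def token_names(source):
    """Return the names of the token types that tokenize emits for source.

    token_names is a hypothetical helper, not part of the tokenize module.
    """
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]

# A lone comment: the line ends with NL, not NEWLINE.
print(token_names("#comment\n"))

# A blank line: also NL.
print(token_names("\n"))

# A statement continued inside parentheses: NL marks the inner line break,
# NEWLINE ends the logical line.
print(token_names("x = (1,\n     2)\n"))
```

In the first two cases NL appears even though no statement is being continued, which is the mismatch with the documented description.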

This makes it difficult to use tokenize to detect multi-line statements, as we want to do in IPython.
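
One possible workaround, without changing tokenize itself, is to ignore NL entirely and instead compare the row of the first significant token of each logical line with the row of its NEWLINE token. The sketch below assumes complete, valid input (incomplete input may raise tokenize.TokenError, which is not handled here); logical_line_spans is a hypothetical name, not something tokenize or IPython provides.

```python
import io
import tokenize

# Token types that do not start a statement on their own.
_INSIGNIFICANT = {
    tokenize.COMMENT,
    tokenize.NL,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}

def logical_line_spans(source):
    """Yield (first_row, last_row) for each logical line in source.

    Hypothetical helper: rows are 1-based physical line numbers. Lines that
    hold only comments or whitespace produce no span.
    """
    first_row = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NEWLINE:
            # NEWLINE closes the current logical line.
            if first_row is not None:
                yield (first_row, tok.start[0])
            first_row = None
        elif tok.type not in _INSIGNIFICANT and first_row is None:
            # First real token of a new logical line.
            first_row = tok.start[0]

source = "#comment\n\nx = (1,\n     2)\ny = 3\n"
print(list(logical_line_spans(source)))  # [(3, 4), (5, 5)]
```

A span whose two rows differ is a multi-line statement. The comment and the blank line produce no span at all, so the NL tokens they generate do not get in the way.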

In my tests so far, changing two instances of NL to NEWLINE in this block (lines 530 & 533) makes it behave as I expect:
http://hg.python.org/cpython/file/a375c3d88c7e/Lib/tokenize.py#l524
History
Date                 User       Action  Args
2013-01-28 11:14:28  takluyver  set     recipients: + takluyver
2013-01-28 11:14:28  takluyver  set     messageid: <1359371668.59.0.445577508897.issue17061@psf.upfronthosting.co.za>
2013-01-28 11:14:28  takluyver  link    issue17061 messages
2013-01-28 11:14:28  takluyver  create