Author csernazs
Recipients csernazs, docs@python
Date 2018-11-22.21:25:02
Message-id <1542921902.22.0.788709270274.issue35297@psf.upfronthosting.co.za>
In-reply-to
Content
untokenize documentation (https://docs.python.org/3/library/tokenize.html#tokenize.untokenize) states the following:

"""
Converts tokens back into Python source code. The iterable must return sequences with at least two elements, the token type and the token string. Any additional sequence elements are ignored.
"""

The last sentence is not accurate, as the code here shows:
https://github.com/python/cpython/blob/master/Lib/tokenize.py#L242

The code checks the length of each input token there and behaves differently, in terms of whitespace, depending on whether it is given an iterable of 2-tuples or of tuples with more elements. When the tuples have more elements, the function reproduces the whitespace as it appeared in the original source.

So this code:
tokenize.untokenize(tokenize.tokenize(source.readline))

And this:
tokenize.untokenize([x[:2] for x in tokenize.tokenize(source.readline)])

Have different results.
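
A minimal, self-contained way to reproduce the difference is sketched below. The sample `source` bytes are made up for illustration (any input with irregular spacing should do), and `io.BytesIO` is used only to provide a `readline` callable:

```python
import io
import tokenize

# Hypothetical sample input with irregular spacing (an assumption for this demo).
source = b"x  =  1 +   2\n"

# tokenize.tokenize() expects a readline callable that returns bytes.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))

# Full 5-tuples: start/end positions are available, so spacing is reproduced.
full = tokenize.untokenize(tokens)

# Only (type, string) pairs: positions are dropped before untokenize sees them.
compat = tokenize.untokenize([tok[:2] for tok in tokens])

print(repr(full))
print(repr(compat))
```

With this input, the first call should return the original bytes unchanged, while the second returns source with different spacing, which suggests the extra tuple elements are not simply ignored.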

I don't know whether this is a documentation issue or a bug in the module itself, so I created this bug report to ask for help in deciding.
History
Date User Action Args
2018-11-22 21:25:02 csernazs set recipients: + csernazs, docs@python
2018-11-22 21:25:02 csernazs set messageid: <1542921902.22.0.788709270274.issue35297@psf.upfronthosting.co.za>
2018-11-22 21:25:02 csernazs link issue35297 messages
2018-11-22 21:25:02 csernazs create