Author berker.peksag
Recipients Pod, berker.peksag, bignose, docs@python, r.david.murray, vstinner
Date 2019-04-28.13:07:55
Content
The original problem has already been solved by making tokenize.generate_tokens() public in issue 12486.

However, the same exception can be raised when tokenize.open() is used together with tokenize.tokenize(), because tokenize.open() returns a text stream:

    https://github.com/python/cpython/blob/da63b321f63b697f75e7ab2f88f55d907f56c187/Lib/tokenize.py#L396

hello.py
--------

def say_hello():
    print("Hello, World!")

say_hello()


text.py
-------

import tokenize

with tokenize.open('hello.py') as f:
    # Fails here: tokenize.tokenize() immediately calls
    # detect_encoding(), which expects readline() to return bytes,
    # but the text stream's readline() returns str.
    token_gen = tokenize.tokenize(f.readline)
    for token in token_gen:
        print(token)

When we pass f.readline to tokenize.tokenize(), the second call to detect_encoding() (the first happens inside tokenize.open() itself) fails, because f.readline() returns str instead of the bytes that detect_encoding() expects.

In Lib/test/test_tokenize.py, tokenize.open() only seems to be tested for opening a file; its result is never passed to tokenize.tokenize(). Most of the tests pass the readline() method of either open(..., 'rb') or io.BytesIO() to tokenize.tokenize(), as sketched below.
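
For comparison, here is a minimal sketch of the byte-based pattern those tests rely on; both forms give tokenize.tokenize() a readline that returns bytes:

import io
import tokenize

# readline() of a file opened in binary mode returns bytes,
# which is what tokenize.tokenize() expects.
with open('hello.py', 'rb') as f:
    for token in tokenize.tokenize(f.readline):
        print(token)

# The same pattern with an in-memory buffer.
source = io.BytesIO(b'def say_hello():\n    print("Hello, World!")\n')
for token in tokenize.tokenize(source.readline):
    print(token)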

I will submit a documentation PR that suggests using tokenize.generate_tokens() with tokenize.open().
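
For reference, the suggested pairing would look something like this; tokenize.generate_tokens() expects a readline that returns str, so it matches the text stream returned by tokenize.open():

import tokenize

# tokenize.open() detects the file's encoding and returns a text
# stream, so its readline() returns str, which is exactly what
# tokenize.generate_tokens() expects.
with tokenize.open('hello.py') as f:
    for token in tokenize.generate_tokens(f.readline):
        print(token)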