Author berker.peksag
Recipients Pod, berker.peksag, bignose, docs@python, r.david.murray, vstinner
Date 2019-04-28.13:07:55
Content
The original problem has already been solved by making tokenize.generate_tokens() public in issue 12486.

However, the same exception can be raised when tokenize.open() is used together with tokenize.tokenize(), because tokenize.open() returns a text stream:

    https://github.com/python/cpython/blob/da63b321f63b697f75e7ab2f88f55d907f56c187/Lib/tokenize.py#L396

hello.py
--------

def say_hello():
    print("Hello, World!")

say_hello()


text.py
-------

import tokenize

with tokenize.open('hello.py') as f:
    # Fails here: tokenize.tokenize() immediately calls
    # detect_encoding(), which expects readline() to return bytes,
    # but the text stream's readline() returns str.
    token_gen = tokenize.tokenize(f.readline)
    for token in token_gen:
        print(token)

When we pass f.readline to tokenize.tokenize(), the second call to detect_encoding() (the first happens inside tokenize.open() itself) fails, because f.readline() returns str instead of the bytes that detect_encoding() expects.

In Lib/test/test_tokenize.py, tokenize.open() only seems to be tested for opening a file; its result is never passed to tokenize.tokenize(). Most of the tests pass the readline() method of either open(..., 'rb') or io.BytesIO() to tokenize.tokenize(), as sketched below.
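
For comparison, here is a minimal sketch of the byte-based pattern those tests rely on; both forms give tokenize.tokenize() a readline that returns bytes:

import io
import tokenize

# readline() of a file opened in binary mode returns bytes,
# which is what tokenize.tokenize() expects.
with open('hello.py', 'rb') as f:
    for token in tokenize.tokenize(f.readline):
        print(token)

# The same pattern with an in-memory buffer.
source = io.BytesIO(b'def say_hello():\n    print("Hello, World!")\n')
for token in tokenize.tokenize(source.readline):
    print(token)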

I will submit a documentation PR that suggests using tokenize.generate_tokens() with tokenize.open().
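
For reference, the suggested pairing would look something like this; tokenize.generate_tokens() expects a readline that returns str, so it matches the text stream returned by tokenize.open():

import tokenize

# tokenize.open() detects the file's encoding and returns a text
# stream, so its readline() returns str, which is exactly what
# tokenize.generate_tokens() expects.
with tokenize.open('hello.py') as f:
    for token in tokenize.generate_tokens(f.readline):
        print(token)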