
Author bignose
Recipients bignose
Date 2015-01-22.04:40:26
Message-id <1421901626.89.0.392776519963.issue23297@psf.upfronthosting.co.za>
In-reply-to
Content
In `tokenize.detect_encoding` is the following code::

    first = read_or_stop()
    if first.startswith(BOM_UTF8):
        # …

The `read_or_stop` function is defined as::

    def read_or_stop():
        try:
            return readline()
        except StopIteration:
            return b''

So, on catching ``StopIteration``, the return value will be a byte string. The `detect_encoding` code then immediately calls `startswith`, which fails::

      File "/usr/lib/python3.4/tokenize.py", line 409, in detect_encoding
        if first.startswith(BOM_UTF8):
    TypeError: startswith first arg must be str or a tuple of str, not bytes
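
For reference, a minimal sketch of one way to get this exact error. It assumes the `readline` callable handed to `detect_encoding` returns `str` (for example, from a file opened in text mode); the report does not show the calling code, so that is a guess. The `io.StringIO` and `io.BytesIO` objects are stand-ins for a real file.

```python
import io
import tokenize

# Assumption: the caller's readline yields str, as a text-mode file would.
text_source = io.StringIO("x = 1\n")
try:
    tokenize.detect_encoding(text_source.readline)
except TypeError as exc:
    # str.startswith() was handed the bytes constant BOM_UTF8.
    error = exc
else:
    error = None

# With a bytes readline and empty input, `first` is an empty byte string
# and the startswith(BOM_UTF8) call goes through without error.
empty_source = io.BytesIO(b"")
encoding, lines = tokenize.detect_encoding(empty_source.readline)
```

If that assumption holds, the message is about the type of `first` (a `str`), since the argument it complains about is the bytes constant `BOM_UTF8`.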

One or both of those locations in the code are wrong: either `read_or_stop` should never return a byte string, or `detect_encoding` should not assume it can call `startswith` on the result.
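As an illustration of the second option, here is a rough caller-side sketch that checks the type of each line before `detect_encoding` gets to call `startswith` on it. The `bytes_only` wrapper and its error text are hypothetical, not part of `tokenize`, and this is not a proposed patch to the module itself.

```python
import io
import tokenize


def bytes_only(readline):
    """Hypothetical wrapper: reject non-bytes lines with a clearer error."""
    def checked():
        # StopIteration from readline() propagates unchanged, so the
        # read_or_stop fallback inside detect_encoding still applies.
        line = readline()
        if not isinstance(line, bytes):
            raise TypeError(
                "readline() must return bytes, got %s" % type(line).__name__)
        return line
    return checked


# A bytes source passes through the wrapper untouched.
source = io.BytesIO(b"x = 1\n")
encoding, consumed = tokenize.detect_encoding(bytes_only(source.readline))
```

A `readline` that returns `str` would then fail inside `checked` with a message naming the actual type, rather than at the `startswith` call.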
History
Date                 User     Action  Args
2015-01-22 04:40:26  bignose  set     recipients: + bignose
2015-01-22 04:40:26  bignose  set     messageid: <1421901626.89.0.392776519963.issue23297@psf.upfronthosting.co.za>
2015-01-22 04:40:26  bignose  link    issue23297 messages
2015-01-22 04:40:26  bignose  create