
Title: utf_8_sig decode fails with buffer input
Type: Stage:
Components: Library (Lib) Versions: Python 2.5
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: bazwal, doerwalter
Priority: normal Keywords:

Created on 2006-11-23 02:38 by bazwal, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg30655 - Author: bazwal (bazwal) Date: 2006-11-23 02:38
When the decode function in encodings.utf_8_sig receives a buffer object, it fails because it checks for a BOM by calling startswith(), which buffer objects do not provide:

>>> unicode('\xef\xbb\xbf', 'utf_8_sig')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/encodings/", line 19, in decode
    if input.startswith(codecs.BOM_UTF8):
AttributeError: 'buffer' object has no attribute 'startswith'

the test should be changed to:

if input[:3] == codecs.BOM_UTF8:
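
A minimal sketch of why the slice comparison is more tolerant than startswith(). It is written for Python 3, where the buffer type no longer exists, so memoryview stands in for it here as an assumption; both types support slicing but lack startswith(). The helper names strip_utf8_bom and decode_utf8_sig are hypothetical and are not the code in encodings.utf_8_sig.

```python
import codecs


def strip_utf8_bom(data):
    """Hypothetical helper: drop a leading UTF-8 BOM from a bytes-like object."""
    # Slicing works on bytes, bytearray, and memoryview alike,
    # whereas startswith() is only defined on bytes and bytearray.
    if data[:3] == codecs.BOM_UTF8:
        data = data[3:]
    # Normalize to bytes so callers get one concrete type back.
    return bytes(data)


def decode_utf8_sig(data, errors="strict"):
    """Hypothetical helper: decode UTF-8 after removing an optional BOM."""
    return codecs.decode(strip_utf8_bom(data), "utf-8", errors)


# memoryview plays the role of the Python 2 buffer object (an assumption).
view = memoryview(b"\xef\xbb\xbfabc")
print(hasattr(view, "startswith"))
print(decode_utf8_sig(view))
```

Input shorter than three bytes is handled without a special case, because a short slice simply compares unequal to the BOM.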

msg30656 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-11-23 04:00
Can you provide a test that fails?

msg30657 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-11-23 05:04
Oops, I missed your stacktrace. Fixed in r52826.

(A better fix might be to add startswith() to buffer).

msg30658 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-11-23 05:06
Fixed in r52827 for the 2.5 branch.

Date                 User    Action  Args
2022-04-11 14:56:21  admin   set     github: 44266
2006-11-23 02:38:36  bazwal  create