
Author hroncok
Recipients ezio.melotti, hroncok, vstinner
Date 2019-06-24.14:10:51
Content
I was just bitten by specifying a nonexistent error handler for bytes.decode() without noticing.

Consider this code:

>>> 'a'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
'a'

Nobody notices that the error handler doesn't exist.

However:

>>> 'ž'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown error handler name 'Boom, Shaka Laka, Boom!'


The error is only noticeable once there is an error in the data.
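
The two interactive sessions above can be folded into one small script. This is a minimal sketch: `try_decode` and `BOGUS` are hypothetical names introduced only for illustration. The result for the clean input may differ by interpreter configuration (as far as I know, Python 3.9+ checks the errors argument eagerly in Development Mode and debug builds), so no expected output is shown for it.

```python
def try_decode(data: bytes, errors: str) -> str:
    """Decode data as UTF-8; return the text, or the exception class name on failure."""
    try:
        return data.decode("utf-8", errors=errors)
    except (LookupError, UnicodeDecodeError) as exc:
        return type(exc).__name__


# Hypothetical handler name; nothing is registered under it.
BOGUS = "no-such-handler"

clean = "a".encode("cp1250")       # b'a', which is also valid UTF-8
dirty = "\u017e".encode("cp1250")  # 'ž' becomes b'\x9e', which is not valid UTF-8

# The handler is only looked up when the decoder actually hits bad data,
# so the first call may succeed silently while the second one fails.
print(try_decode(clean, BOGUS))
print(try_decode(dirty, BOGUS))
```

The second call reports LookupError rather than UnicodeDecodeError, because the decoder fails while resolving the handler name, before any handler gets a chance to run.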

While nobody could possibly mistake 'Boom, Shaka Laka, Boom!' for a valid error handler, I was bitten by this:

>>> b.decode('utf-8', errors='surrogate')

Which in fact should have been

>>> b.decode('utf-8', errors='surrogateescape')

Yet I wasn't notified, because the bytes in question were actually decodable as valid UTF-8.

I suggest that an unknown error handler should raise an exception immediately, like this:

>>> 'b'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown error handler name 'Boom, Shaka Laka, Boom!'
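
Until something like that exists, the same eager check can be approximated in user code. The following is a sketch of a workaround, not the proposed fix itself: `decode_checked` is a hypothetical helper name, and it relies on `codecs.lookup_error()`, which raises LookupError when no handler is registered under the given name.

```python
import codecs


def decode_checked(data: bytes, encoding: str = "utf-8", errors: str = "strict") -> str:
    """Decode data, rejecting an unknown error handler even if the data is clean.

    Hypothetical helper for illustration only.
    """
    # Resolve the handler up front; an unregistered name raises LookupError here,
    # regardless of whether the bytes would have decoded without errors.
    codecs.lookup_error(errors)
    return data.decode(encoding, errors)


print(decode_checked(b"a", errors="surrogateescape"))

try:
    decode_checked(b"a", errors="surrogate")  # the typo from above
except LookupError as exc:
    print("rejected:", exc)
```

The extra lookup costs one registry access per call, so for hot paths it may be preferable to validate the handler name once, where the configuration is read.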
History
Date                 User     Action  Args
2019-06-24 14:10:52  hroncok  set     recipients: + hroncok, vstinner, ezio.melotti
2019-06-24 14:10:52  hroncok  set     messageid: <1561385452.16.0.600106929179.issue37388@roundup.psfhosted.org>
2019-06-24 14:10:52  hroncok  link    issue37388 messages
2019-06-24 14:10:51  hroncok  create