Message346407
I was just bitten by specifying a nonexistent error handler for bytes.decode() without noticing.
Consider this code:
>>> 'a'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
'a'
Nobody notices that the error handler doesn't exist.
However:
>>> 'ž'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown error handler name 'Boom, Shaka Laka, Boom!'
The error is only noticeable once there is an error in the data.
While nobody could possibly mistake 'Boom, Shaka Laka, Boom!' for a valid error handler, I was bitten by this:
>>> b.decode('utf-8', errors='surrogate')
Which in fact should have been
>>> b.decode('utf-8', errors='surrogateescape')
Yet I wasn't notified, because the bytes in question were actually decodable as valid UTF-8.
I suggest that an unknown error handler should raise an exception immediately, like this:
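As a possible workaround, a caller can catch this kind of typo early by validating the handler name with codecs.lookup_error() before decoding, since that function raises LookupError for any unregistered name. A minimal sketch, where checked_decode is a hypothetical helper name and not part of the standard library:

```python
import codecs


def checked_decode(data: bytes, encoding: str = "utf-8", errors: str = "strict") -> str:
    """Decode data, rejecting unknown error handler names up front.

    checked_decode is a hypothetical helper for illustration only.
    """
    # codecs.lookup_error() raises LookupError for an unregistered handler
    # name, whether or not the data would ever trigger the handler.
    codecs.lookup_error(errors)
    return data.decode(encoding, errors=errors)
```

With this helper, checked_decode(b'a', errors='surrogate') fails with LookupError even though the data is valid UTF-8, while errors='surrogateescape' decodes normally.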
>>> 'b'.encode('cp1250').decode('utf-8', errors='Boom, Shaka Laka, Boom!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown error handler name 'Boom, Shaka Laka, Boom!'

Date | User | Action | Args
2019-06-24 14:10:52 | hroncok | set | recipients: + hroncok, vstinner, ezio.melotti
2019-06-24 14:10:52 | hroncok | set | messageid: <1561385452.16.0.600106929179.issue37388@roundup.psfhosted.org>
2019-06-24 14:10:52 | hroncok | link | issue37388 messages
2019-06-24 14:10:51 | hroncok | create |