Message 360004
bpo-37751 changed codecs.lookup() in a subtle way: non-ASCII characters are now ignored, whereas they were copied unmodified previously.
I would prefer that codecs.lookup() and encodings.normalize_encoding() behave the same: either both always ignore non-ASCII characters, or both always copy them.
Moreover, there seems to be no test of how encoding names are normalized in codecs.register(). I recall that using codecs.register() in a unit test causes trouble, since there is no API to unregister a search function. Maybe we should just add a private function for tests in _testcapi.
Serhiy Storchaka wrote an example on my PR:
https://github.com/python/cpython/pull/17997/files
> There are other differences. For example, normalize_encoding("КОИ-8") returns "кои_8", but codecs.lookup normalizes it to "8".
> The comment in the sources is also not correct.
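The difference Serhiy describes can be observed from pure Python with a rough sketch like the one below. It registers a search function that only records the name codecs.lookup() hands to it, then compares that with encodings.normalize_encoding(). The helper name observe_lookup_name is hypothetical and only for illustration. The cleanup step assumes codecs.unregister(), which exists only on Python 3.10 and later; on older versions the recorder simply stays registered, which is the very problem mentioned above. The exact strings printed depend on the Python version, so no expected output is shown.

```python
import codecs
import encodings


def observe_lookup_name(name):
    """Return the names that codecs.lookup(name) passes to search functions.

    Hypothetical helper for illustration only; not part of CPython.
    """
    seen = []

    def recorder(normalized):
        # Record the already-normalized name, then decline the lookup
        # so that no codec is ever returned by this function.
        seen.append(normalized)
        return None

    codecs.register(recorder)
    try:
        codecs.lookup(name)
    except LookupError:
        # Expected for an unknown encoding name.
        pass
    finally:
        # codecs.unregister() is only available on Python 3.10+.
        unregister = getattr(codecs, "unregister", None)
        if unregister is not None:
            unregister(recorder)
    return seen


if __name__ == "__main__":
    name = "КОИ-8"
    print("codecs.lookup() passed:", observe_lookup_name(name))
    print("normalize_encoding():  ", repr(encodings.normalize_encoding(name)))
```

The recorder is called after the standard encodings search function because search functions are tried in registration order, so it only sees names that the standard function did not resolve.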
Date | User | Action | Args
2020-01-14 21:54:42 | vstinner | set | recipients: + vstinner, lemburg, serhiy.storchaka
2020-01-14 21:54:42 | vstinner | set | messageid: <1579038882.16.0.589810918272.issue39337@roundup.psfhosted.org>
2020-01-14 21:54:42 | vstinner | link | issue39337 messages
2020-01-14 21:54:41 | vstinner | create |