Message 360004 - Python tracker


Author	vstinner
Recipients	lemburg, serhiy.storchaka, vstinner
Date	2020-01-14.21:54:41
Message-id	<1579038882.16.0.589810918272.issue39337@roundup.psfhosted.org>

Content
bpo-37751 changed codecs.lookup() in a subtle way: non-ASCII characters are now ignored, whereas they were copied unmodified previously.

I would prefer that codecs.lookup() and encodings.normalize_encoding() behave the same: either both always ignore non-ASCII characters, or both always copy them.

Moreover, there seems to be no test for how encoding names are normalized before they reach a search function registered with codecs.register(). I recall that using codecs.register() in a unit test causes trouble, since there is no API to unregister a search function. Maybe we should just add a private function for tests in _testcapi.
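One way to observe the name that a search function receives is to register a function that only records its argument. The following is a minimal sketch, not a proposed test: `recording_search` and the codec name "Demo Codec 39337" are made up for illustration. It assumes the name is unknown to every other registered search function, and it only cleans up when codecs.unregister() is available (that function was added later, in Python 3.10).

```python
import codecs

# Names passed to our search function by codecs.lookup().
seen = []


def recording_search(name):
    """Hypothetical search function: record the name, never find a codec."""
    seen.append(name)
    return None


codecs.register(recording_search)
try:
    try:
        # Made-up name, so the lookup is expected to fail after every
        # registered search function has been tried.
        codecs.lookup("Demo Codec 39337")
    except LookupError:
        pass
finally:
    # codecs.unregister() only exists on Python 3.10 and later.
    unregister = getattr(codecs, "unregister", None)
    if unregister is not None:
        unregister(recording_search)

# The exact separator handling depends on the Python version.
print(seen)
```

The recorded name is lowercased, but how spaces and hyphens are rewritten differs between Python versions, which is exactly the kind of behavior a dedicated test would pin down.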

Serhiy Storchaka wrote an example on my PR:
https://github.com/python/cpython/pull/17997/files

> There are other differences. For example, normalize_encoding("КОИ-8") returns "кои_8", but codecs.lookup normalizes it to "8".

> The comment in the sources is also not correct.
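To make the two policies concrete, here is a small sketch of a normalizer with a switch between "copy" and "ignore". The `normalize` helper and its `keep_non_ascii` parameter are hypothetical and are not CPython's code; what the real codecs.lookup() and encodings.normalize_encoding() return depends on the Python version.

```python
def normalize(name: str, keep_non_ascii: bool) -> str:
    """Hypothetical normalizer: collapse runs of punctuation into one '_'.

    keep_non_ascii=True models the "copy" policy,
    keep_non_ascii=False models the "ignore" policy.
    """
    chars = []
    pending_separator = False
    for ch in name:
        if ch.isalnum() or ch == ".":
            # Emit a single separator between two kept runs of characters.
            if pending_separator and chars:
                chars.append("_")
            if keep_non_ascii or ch.isascii():
                chars.append(ch)
            pending_separator = False
        else:
            pending_separator = True
    return "".join(chars)


name = "КОИ-8"
copied = normalize(name, keep_non_ascii=True).lower()
ignored = normalize(name, keep_non_ascii=False).lower()
print(copied)   # кои_8
print(ignored)  # 8
```

For a pure-ASCII name such as "UTF-8" both policies give the same result, so the mismatch only shows up for names containing non-ASCII characters.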

History
Date	User	Action	Args
2020-01-14 21:54:42	vstinner	set	recipients: + vstinner, lemburg, serhiy.storchaka
2020-01-14 21:54:42	vstinner	set	messageid: <1579038882.16.0.589810918272.issue39337@roundup.psfhosted.org>
2020-01-14 21:54:42	vstinner	link	issue39337 messages
2020-01-14 21:54:41	vstinner	create