
classification
Title: codec name acceptance became way too lenient in 3.9
Type: behavior
Stage: needs patch
Components:
Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: gregory.p.smith
Priority: normal
Keywords: 3.9regression

Created on 2022-01-25 00:12 by gregory.p.smith, last changed 2022-04-11 14:59 by admin.

Messages (2)
msg411535 - Author: Gregory P. Smith (gregory.p.smith) (Python committer) Date: 2022-01-25 00:12
In 3.8 this was not a valid codec name: "เ_เ_เ_iDnA"
In 3.9 it is treated as idna and triggers the punycode decoder when passed to bytes.decode(codec).
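
A minimal sketch for checking how a given interpreter resolves this name. The helper `resolve_codec` is hypothetical (it is not part of the report); it only wraps `codecs.lookup()` and reports the canonical codec name, or `None` when the lookup raises LookupError. The result for the odd name depends on the Python version, so no expected output is shown for it.

```python
import codecs


def resolve_codec(name):
    """Return the canonical codec name `name` resolves to, or None.

    Hypothetical helper for illustration: None means codecs.lookup()
    raised LookupError, i.e. the name was rejected.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


# The name from the report: Thai characters and underscores before "iDnA".
suspicious = "เ_เ_เ_iDnA"

print("idna ->", resolve_codec("idna"))
# Per the report: rejected on 3.8, resolved to idna on 3.9.
print(repr(suspicious), "->", resolve_codec(suspicious))
print("definitely-not-a-codec ->", resolve_codec("definitely-not-a-codec"))
```

If the second line prints `idna` rather than `None`, the interpreter shows the lenient behavior described here.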

Discovered by oss-fuzz.

_Likely_ a consequence of https://bugs.python.org/issue37751

The consequence of this change is that anyone can stuff heinous strings into codec names and get something other than a LookupError out of them. Anywhere a codec name can come from user input, this has many interesting potential negative consequences.

Versions <= 3.8 raised `LookupError("unknown encoding: ...`
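
For code that accepts codec names from user input, one possible defensive pattern (a sketch under stated assumptions, not a fix proposed in this issue) is to reject names containing anything outside a conservative ASCII pattern and then compare the canonical name against an allowlist. The names `safe_decode`, `_CODEC_NAME_RE`, and `ALLOWED_CODECS`, as well as the choice of allowed codecs, are hypothetical.

```python
import codecs
import re

# Assumption: acceptable user-supplied names are plain ASCII letters,
# digits, '.', '_' and '-', starting with a letter or digit.
_CODEC_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")

# Hypothetical allowlist, expressed as canonical names reported by
# codecs.lookup(...).name ("latin-1" resolves to "iso8859-1").
ALLOWED_CODECS = frozenset({"utf-8", "ascii", "iso8859-1"})


def safe_decode(data: bytes, codec_name: str) -> str:
    """Decode `data` only if `codec_name` passes both checks."""
    if not _CODEC_NAME_RE.fullmatch(codec_name):
        raise LookupError(f"unknown encoding: {codec_name!r}")
    canonical = codecs.lookup(codec_name).name  # may raise LookupError
    if canonical not in ALLOWED_CODECS:
        raise LookupError(f"encoding not allowed: {codec_name!r}")
    return data.decode(canonical)


print(safe_decode(b"abc", "UTF8"))
```

With this wrapper, both the name from this report and a plain `idna` raise LookupError, regardless of how the interpreter normalizes codec names.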
msg411540 - Author: Gregory P. Smith (gregory.p.smith) (Python committer) Date: 2022-01-25 00:37
While figuring this issue out, it may also make sense to address https://bugs.python.org/issue44723.
History
Date User Action Args
2022-04-11 14:59:55 admin set github: 90666
2022-01-25 00:37:48 gregory.p.smith set messages: + msg411540
2022-01-25 00:12:22 gregory.p.smith create