classification
Title: In codecs, function 'normalizestring' should convert both spaces and hyphens to underscores.
Type: behavior Stage: resolved
Components: Unicode Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, lemburg, qigangxu, shihai1991, vstinner
Priority: normal Keywords: patch

Created on 2019-08-03 11:34 by qigangxu, last changed 2019-08-22 04:42 by qigangxu. This issue is now closed.

Pull Requests
URL Status Linked
PR 15092 merged qigangxu, 2019-08-03 12:45
Messages (8)
msg348953 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-03 11:34
In codecs.c, when _PyCodec_Lookup() calls normalizestring(), both spaces and hyphens should be converted to underscores, rather than spaces being converted to hyphens.

See https://github.com/python/peps/blob/master/pep-0100.txt, section "Codecs (Coder/Decoders) Lookup".
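
To illustrate the normalization rule requested above, here is a minimal pure-Python sketch. The function name `normalize_codec_name` is hypothetical and used only for illustration; it is not the C implementation in codecs.c, and the merged fix may differ in detail.

```python
def normalize_codec_name(name: str) -> str:
    """Hypothetical sketch: lower-case the name and map both
    spaces and hyphens to underscores, as the issue proposes."""
    return name.lower().replace(" ", "_").replace("-", "_")


if __name__ == "__main__":
    for raw in ("UTF-8", "ISO 8859-1", "latin_1"):
        print(repr(raw), "->", repr(normalize_codec_name(raw)))
```

With this rule, spellings that differ only in case, spaces, or hyphens collapse to the same lookup key.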
msg348954 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-03 11:55
I will try to fix it.
msg348956 - (view) Author: hai shi (shihai1991) * Date: 2019-08-03 12:57
Hm, there is a mismatch between the description (https://github.com/python/cpython/blob/master/Python/codecs.c#L53) and the code (https://github.com/python/cpython/blob/master/Python/codecs.c#L74).
msg348959 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-03 13:13
The design and code in the following four places need to be consistent:

No.1 https://github.com/python/peps/blob/master/pep-0100.txt#L292
No.2 https://github.com/python/cpython/blob/master/Python/codecs.c#L113
No.3 https://github.com/python/cpython/blob/master/Python/codecs.c#L53
No.4 https://github.com/python/cpython/blob/master/Python/codecs.c#L74
msg349448 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2019-08-12 08:37
Jordon is right. Conversion has to be to underscores, not hyphens. I guess this bug was introduced when the normalization function was converted to C.
msg350086 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-21 13:26
New changeset 20f59fe1f7748ae899aceee4cb560e5e1f528a1f by Victor Stinner (Jordon Xu) in branch 'master':
bpo-37751: Fix codecs.lookup() normalization (GH-15092)
https://github.com/python/cpython/commit/20f59fe1f7748ae899aceee4cb560e5e1f528a1f
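
As a rough illustration of the user-visible behavior that this normalization supports, the sketch below looks up one codec under several spellings using the public `codecs.lookup()` API. The spellings chosen are illustrative assumptions and are not taken from the changeset or its tests.

```python
import codecs

# Spellings that differ only in case, hyphen, underscore, or space.
spellings = ["utf-8", "UTF_8", "Utf 8"]

# Each lookup returns a CodecInfo object; collect the canonical names.
names = {codecs.lookup(spelling).name for spelling in spellings}
print(names)
```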
msg350087 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-21 13:27
Thanks for the fix Jordon Xu.

IMHO this change is not strictly a bugfix, but more like an enhancement. I'm closing the issue.

If you consider that a backport to Python 3.7 and 3.8 is needed, please say so.
msg350155 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-22 04:42
Thanks vstinner. I also don't think it's necessary to backport to the older versions. Closing this issue is fine.
History
Date User Action Args
2019-08-22 04:42:11 qigangxu set messages: + msg350155
2019-08-21 13:27:23 vstinner set status: open -> closed; resolution: fixed; messages: + msg350087; stage: patch review -> resolved
2019-08-21 13:26:33 vstinner set messages: + msg350086
2019-08-12 08:37:08 lemburg set nosy: + lemburg; messages: + msg349448
2019-08-03 13:13:33 qigangxu set messages: + msg348959
2019-08-03 12:57:14 shihai1991 set nosy: + shihai1991; messages: + msg348956
2019-08-03 12:45:33 qigangxu set keywords: + patch; stage: patch review; pull_requests: + pull_request14838
2019-08-03 11:55:54 qigangxu set messages: + msg348954
2019-08-03 11:34:13 qigangxu create