classification
Title: Redundant Py_CHARMASK called in some files
Type: performance Stage: resolved
Components: Unicode Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, malin, qigangxu, shihai1991, steve.dower, vstinner
Priority: normal Keywords: patch, patch

Created on 2019-08-03 12:10 by qigangxu, last changed 2019-09-10 16:04 by steve.dower. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 15093 closed qigangxu, 2019-08-03 13:49
PR 15093 closed qigangxu, 2019-08-03 13:49
PR 15095 merged qigangxu, 2019-08-03 14:57
Messages (14)
msg348955 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-03 12:10
In normalizestring(),
            ch = Py_TOLOWER(Py_CHARMASK(ch));
Where Py_TOLOWER is defined as following,
            #define Py_TOLOWER(c) (_Py_ctype_tolower[Py_CHARMASK(c)])

Redundant Py_CHARMASK called here.
msg348957 - (view) Author: hai shi (shihai1991) * Date: 2019-08-03 12:58
Good catch ;), try to fix it, thanks.
msg348960 - (view) Author: hai shi (shihai1991) * Date: 2019-08-03 14:00
Looks the PR is confused. Make sure: one PR on one dev branch.
msg348969 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-03 23:57
Thanks Hai, for your careful check. There is a misoperation in PR 15093. Now I closed PR 15093. And resubmit code as PR 15095.
msg348970 - (view) Author: Ma Lin (malin) * Date: 2019-08-04 00:57
Search "Py_CHARMASK" in Python source code, there are more than a dozen Py_CHARMASK can be deleted:

https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Python/mystrtoul.c#L102

https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Python/dynload_aix.c#L143

https://github.com/python/cpython/blob/530f506ac91338b55cf2be71b1cdf50cb077512f/Modules/unicodedata.c#L963

...
msg348971 - (view) Author: Ma Lin (malin) * Date: 2019-08-04 01:18
Or remove Py_CHARMASK in Py_ISxxx/Py_TOLOWER/Py_TOUPPER macros?
Sometimes `c` is already a unsinged char, Py_CHARMASK is not necessary in these cases.
msg348973 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-04 03:13
Thanks Ma,I also think all these redundant Py_CHARMASK should be deleted.
msg349047 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-05 11:45
After the full search for the project code, there are more than a dozen similar issues that need to be modified. A new PR will be added later.
msg349102 - (view) Author: Ma Lin (malin) * Date: 2019-08-06 08:01
VC2017 optimizes multiple `unsigned char)((c) & 0xff` to a single `movzx` operation, maybe other compilers do it as well.
If so, there will be no performance changes.
msg349288 - (view) Author: Jordon.X (qigangxu) * Date: 2019-08-09 14:15
Hi Ma, 
    I think this is a very valuable suggestion.
    And if we want to delete the definition of Py_CHARMASK, maybe another issue needed.
    I am not very sure about the considerations of this macro definition. And whether there are any side effects after deletion needs further discussion.

Thanks,
Best Regards
msg349297 - (view) Author: hai shi (shihai1991) * Date: 2019-08-09 17:03
Due to this macro documented in https://docs.python.org/3.9/c-api/intro.html#c.Py_CHARMASK and we don't know how much user use this marco, I don't think we need delte this function(deleting operation should be careful).
msg349298 - (view) Author: hai shi (shihai1991) * Date: 2019-08-09 17:03
typo error: delte->delete
msg351717 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-09-10 16:04
New changeset 2ec70102066fe5534f1a62e8f496d2005e1697db by Steve Dower (Jordon Xu) in branch 'master':
bpo-37752: Delete redundant Py_CHARMASK in normalizestring() (GH-15095)
https://github.com/python/cpython/commit/2ec70102066fe5534f1a62e8f496d2005e1697db
msg351718 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-09-10 16:04
Thanks for the patch!
History
Date User Action Args
2019-09-10 16:04:27steve.dowersetstatus: open -> closed
messages: + msg351718

keywords: patch, patch
resolution: fixed
stage: patch review -> resolved
2019-09-10 16:04:11steve.dowersetnosy: + steve.dower
messages: + msg351717
2019-08-09 17:03:56shihai1991setmessages: + msg349298
2019-08-09 17:03:12shihai1991setmessages: + msg349297
2019-08-09 14:15:50qigangxusetmessages: + msg349288
2019-08-06 08:01:26malinsetmessages: + msg349102
2019-08-05 11:45:40qigangxusetmessages: + msg349047
title: Redundant Py_CHARMASK called in normalizestring(codecs.c) -> Redundant Py_CHARMASK called in some files
2019-08-04 03:13:49qigangxusetmessages: + msg348973
2019-08-04 01:18:51malinsetmessages: + msg348971
2019-08-04 00:57:37malinsetnosy: + malin
messages: + msg348970
2019-08-03 23:57:03qigangxusetmessages: + msg348969
2019-08-03 14:57:01qigangxusetpull_requests: + pull_request14842
2019-08-03 14:00:14shihai1991setmessages: + msg348960
2019-08-03 13:49:25qigangxusetkeywords: + patch
stage: patch review
pull_requests: + pull_request14840
2019-08-03 13:49:25qigangxusetkeywords: + patch
stage: (no value)
pull_requests: + pull_request14839
2019-08-03 12:58:48shihai1991setnosy: + shihai1991
messages: + msg348957
2019-08-03 12:10:17qigangxucreate