classification
Title: re._compile should check if the argument is a compiled pattern before checking cache and flags
Type: performance
Stage: resolved
Components: Library (Lib)
Versions: Python 3.9, Python 3.8
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Recursing, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2020-01-02 18:59 by Recursing, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL | Status | Linked
PR 17799 | closed | Recursing, 2020-01-02 19:01
Messages (4)
msg359210 - (view) Author: (Recursing) * Date: 2020-01-02 18:59
In the re module, re._compile gets called by most of the module-level re functions (re.search, re.match, and so on).
In my use case (which I think is not rare) I have a small number of compiled patterns that I have to match against a large number of short strings.

Profiling showed that half of the total runtime was still spent in re._compile, checking the type of the flags and trying to look the pattern up in a cache.

Example code that exhibits this behavior:

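import re

# Compile once, then reuse the compiled pattern for every string.
pattern = re.compile("spam")
string = "Monty pythons"
for _ in range(1000000):
    # The module-level function still goes through re._compile on each call.
    re.search(pattern, string)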
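
For readers following along, the ordering question raised here can be sketched with a toy model. This is not CPython's source: `_toy_cache`, `toy_compile_cache_first` and `toy_compile_pattern_first` are hypothetical names introduced only to illustrate the difference between consulting the flags and cache first and checking for an already compiled pattern first.

```python
import re

# Hypothetical stand-in for the module-level cache (an assumption, not re's real one).
_toy_cache = {}


def toy_compile_cache_first(pattern, flags=0):
    """Order described in the issue: flags and cache are checked first."""
    if isinstance(flags, re.RegexFlag):
        flags = flags.value
    try:
        # A compiled pattern is hashable, so this lookup runs (and misses) for it too.
        return _toy_cache[type(pattern), pattern, flags]
    except KeyError:
        pass
    if isinstance(pattern, re.Pattern):
        if flags:
            raise ValueError("cannot process flags argument with a compiled pattern")
        return pattern
    compiled = re.compile(pattern, flags)
    _toy_cache[type(pattern), pattern, flags] = compiled
    return compiled


def toy_compile_pattern_first(pattern, flags=0):
    """Order proposed in the issue: compiled patterns return before any cache work."""
    if isinstance(pattern, re.Pattern):
        if flags:
            raise ValueError("cannot process flags argument with a compiled pattern")
        return pattern
    # Strings now pay for one extra isinstance check before reaching the cache.
    return toy_compile_cache_first(pattern, flags)
```

In this sketch the second variant skips the failed cache lookup for compiled patterns, while string patterns pay for one additional type check before the cache is consulted.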
msg359213 - (view) Author: (Recursing) * Date: 2020-01-02 19:24
I now know that the correct and fastest way to do this is to use pattern.search, but I still think this change would be an improvement.
msg359215 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-01-02 19:51
PR 17799 improves the performance of an uncommon case at the cost of reducing the performance of a common case. I doubt this is a good change.

If you have a compiled pattern, it is better to use its methods.
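
The advice above can be checked locally with a rough timing comparison. The following is a minimal sketch that reuses the pattern and string from the first message; the iteration count is an arbitrary choice, and absolute numbers will vary by machine and Python version.

```python
import re
import timeit

pattern = re.compile("spam")
string = "Monty pythons"

# Module-level function: passes the compiled pattern back through the re module.
via_module = timeit.timeit(lambda: re.search(pattern, string), number=100_000)

# Method on the compiled pattern: no module-level lookup is involved.
via_method = timeit.timeit(lambda: pattern.search(string), number=100_000)

print(f"re.search(pattern, string): {via_module:.3f}s")
print(f"pattern.search(string):     {via_method:.3f}s")
```

Both calls return the same result for a given input, so switching to the method form should only change the per-call overhead.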
msg359218 - (view) Author: (Recursing) * Date: 2020-01-02 20:10
Rereading the documentation, I see this is actually well documented, and I fully agree.

Thanks for your time.
History
Date | User | Action | Args
2022-04-11 14:59:24 | admin | set | github: 83376
2020-01-02 20:10:00 | Recursing | set | status: open -> closed; messages: + msg359218; stage: patch review -> resolved
2020-01-02 19:51:58 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg359215
2020-01-02 19:24:45 | Recursing | set | messages: + msg359213
2020-01-02 19:01:14 | Recursing | set | keywords: + patch; stage: patch review; pull_requests: + pull_request17234
2020-01-02 18:59:30 | Recursing | create |