classification
Title: Speed up re.match with pre-compiled patterns
Type: performance
Stage: resolved
Components: Regular Expressions
Versions: Python 3.11

process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, jonash, mrabarnett, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2021-10-13 16:38 by jonash, last changed 2021-10-15 09:22 by serhiy.storchaka. This issue is now closed.

Pull Requests
PR 28936: closed, jonash, 2021-10-13 16:41
Messages (6)
msg403851 - Author: Jonas H. (jonash) Date: 2021-10-13 16:38
re.match(p, ...) with a pre-compiled pattern p = re.compile(...) can be much slower than calling p.match(...), probably mostly for "easy" patterns and/or short strings.

The culprit is that re.match -> re._compile can spend a lot of time looking up p in its internal _cache, where it will never find it:

def _compile(pattern, flags):
    ...
    try:
        # Cache lookup happens first -- always a miss for Pattern objects,
        # since compiled patterns are never stored in _cache.
        return _cache[type(pattern), pattern, flags]
    except KeyError:
        pass
    if isinstance(pattern, Pattern):
        ...
        return pattern
    ...
        _cache[type(pattern), pattern, flags] = p
    ...

_compile will always return before the _cache is set if given a Pattern object.

By simply reordering the isinstance(..., Pattern) check we can save a lot of time.

I've seen speedups in the range of 2x-5x on some of my data. As an example:

Raw speed of re.compile(p, ...).match():
    time ./python.exe -c 'import re'\n'pat = re.compile(".").match'\n'for _ in range(1_000_000): pat("asdf")'
    Executed in  190.59 millis

Speed with this optimization:
    time ./python.exe -c 'import re'\n'pat = re.compile(".")'\n'for _ in range(1_000_000): re.match(pat, "asdf")'
    Executed in  291.39 millis

Speed without this optimization:
    time ./python.exe -c 'import re'\n'pat = re.compile(".")'\n'for _ in range(1_000_000): re.match(pat, "asdf")'
    Executed in  554.42 millis
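For anyone reproducing this, an equivalent comparison with timeit (numbers vary by machine; the pattern and haystack are just the ones from the benchmark above):

```python
import re
import timeit

pat = re.compile(".")

# Module-level function with a pre-compiled pattern (goes through _compile
# on every call).
t_module = timeit.timeit('re.match(pat, "asdf")',
                         globals={"re": re, "pat": pat}, number=100_000)
# Bound method of the compiled pattern (bypasses _compile entirely).
t_method = timeit.timeit('m("asdf")',
                         globals={"m": pat.match}, number=100_000)

print(f"re.match(pat, ...): {t_module:.3f}s")
print(f"pat.match(...):     {t_method:.3f}s")
```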
msg403925 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-10-14 17:58
Calling re.match() with a pre-compiled pattern is an uncommon case. Common cases are calling re.match() with a string pattern and calling the match() method of a pre-compiled pattern.

Your change speeds up an uncommon case but slows down a common case.
msg403982 - Author: Jonas H. (jonash) Date: 2021-10-15 07:43
I agree with your statement in principle. Here are numbers for the slowdown that's introduced:

Without the change:
  ./python.exe -m timeit -s 'import re'\n'[re.compile(f"fill_cache{i}") for i in range(512)]'\n'pat = re.compile(".")' 're.match(pat, "asdf")'
  500000 loops, best of 5: 462 nsec per loop
  ./python.exe -m timeit -s 'import re'\n'[re.compile(f"fill_cache{i}") for i in range(512)]'\n'pat = re.compile(".")' 're.match(".", "asdf")'
  1000000 loops, best of 5: 316 nsec per loop

With the change:
  ./python.exe -m timeit -s 'import re'\n'[re.compile(f"fill_cache{i}") for i in range(512)]'\n'pat = re.compile(".")' 're.match(pat, "asdf")'
  1000000 loops, best of 5: 207 nsec per loop
  ./python.exe -m timeit -s 'import re'\n'[re.compile(f"fill_cache{i}") for i in range(512)]'\n'pat = re.compile(".")' 're.match(".", "asdf")'
  1000000 loops, best of 5: 351 nsec per loop

So we have a 2x speedup in the uncommon case and a 10% slowdown in the common case.
msg403987 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-10-15 08:09
And compare it with pat.match("asdf").

If the performance is critical, use methods of pre-compiled patterns.
msg403989 - Author: Jonas H. (jonash) Date: 2021-10-15 08:46
pat.match() takes 110 nsec.

Feel free to close the issue and PR if you think this isn't worth changing.
msg403991 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2021-10-15 09:22
For reference, caching was introduced in b1aa19515ffdb84c6633ee0344196fd8bd50ade0 21 years ago, and initially it checked for pre-compiled patterns before looking up the cache. But it was changed 2 months later in 7898c3e6852565046a9b8b063d35d66777bf5176, and since then the cache has been checked first. There was no explicit note about this in the commit message, but I think it was done to speed up the common case.

There have been many changes in the caching mechanism, but this part of the logic was left unchanged. Maybe once we implement fast dispatch on the type of the first argument, we will reconsider this code.
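Purely as an illustration of what "fast dispatch on the type of the first argument" could look like at the Python level (functools.singledispatch is an assumption here, not what CPython uses internally):

```python
import functools
import re

@functools.singledispatch
def match(pattern, string):
    # Generic fallback for unregistered pattern types.
    raise TypeError(f"unsupported pattern type: {type(pattern).__name__}")

@match.register
def _(pattern: str, string):
    # String pattern: go through compilation (and, in the real re, the cache).
    return re.compile(pattern).match(string)

@match.register
def _(pattern: re.Pattern, string):
    # Pre-compiled pattern: call its method directly, no cache lookup.
    return pattern.match(string)
```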
History
Date User Action Args
2021-10-15 09:22:46  serhiy.storchaka  set  status: open -> closed; resolution: rejected; messages: + msg403991; stage: patch review -> resolved
2021-10-15 08:46:57  jonash            set  messages: + msg403989
2021-10-15 08:09:55  serhiy.storchaka  set  messages: + msg403987
2021-10-15 07:43:39  jonash            set  messages: + msg403982
2021-10-14 17:58:16  serhiy.storchaka  set  nosy: + serhiy.storchaka; messages: + msg403925
2021-10-13 16:41:21  jonash            set  keywords: + patch; stage: patch review; pull_requests: + pull_request27224
2021-10-13 16:38:53  jonash            create