Author malin
Recipients abacabadabacaba, alexei.romanov, ezio.melotti, malin, mrabarnett, serhiy.storchaka, vstinner
Date 2019-03-04.13:15:33
Message-id <1551705334.03.0.403109792165.issue23689@roundup.psfhosted.org>
In-reply-to
Content
PR11926 (closed) tried to allocate SRE_REPEAT on the state's stack.
It's feasible, but it messes up the code in sre_lib.h and reduces performance a little (roughly 6% slower), so I gave up on this solution.

PR12160 uses a memory pool instead; this solution doesn't mess up the code.

🔸For infrequent alloc/free scenarios, it adds a small overhead:

import re
s = 'a'
p = re.compile(r'(a)?')
p.match(s)  # <- measure this statement

before patch: 316 ns  +- 19 ns
after patch:  324 ns  +- 11 ns, 2.5% slower.
(measured with the perf module)
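
A comparable measurement can be approximated with the standard library alone. The following is a minimal sketch that uses `timeit` rather than the perf module used for the figures above, so absolute numbers will differ and include the overhead of the lambda call. The helper name `time_match_ns` and the loop counts are assumptions chosen for illustration.

```python
import re
import timeit

def time_match_ns(pattern, text, number=100_000, repeat=5):
    """Return the best per-call time of pattern.match(text), in nanoseconds."""
    compiled = re.compile(pattern)
    # timeit.repeat returns one total time (in seconds) per repetition.
    totals = timeit.repeat(lambda: compiled.match(text),
                           number=number, repeat=repeat)
    # Taking the minimum reduces noise from other processes.
    return min(totals) / number * 1e9

if __name__ == "__main__":
    print(f"{time_match_ns(r'(a)?', 'a'):.0f} ns per match")
```

Running the same script against a build with and without the patch gives a rough before/after comparison.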

🔸For very frequent alloc/free scenarios, it brings a speedup:

import re
s = 200_000_000 * 'a'
p = re.compile(r'.*?(?:bb)+')
p.match(s)  # <- measure this statement

before patch: 7.16 sec
after patch:  5.82 sec, 18.7% faster.
(best of 10 tests)
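
To try this case without waiting several seconds per run, the input can be scaled down. A minimal sketch, assuming a hypothetical helper `best_of` and an input of 200,000 characters instead of 200,000,000; the timings it prints are not comparable to the figures above.

```python
import re
import time

def best_of(func, runs=10):
    """Call func() `runs` times and return the shortest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best

# Scaled-down input (assumption): 200_000 characters rather than 200_000_000.
s = 200_000 * 'a'
p = re.compile(r'.*?(?:bb)+')

# The input contains no 'bb', so the lazy '.*?' retries the repeated group
# at every position before the match finally fails and returns None.
print(f"best of 10: {best_of(lambda: p.match(s)):.4f} sec")
```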

🔸I also tested a real case that uses 17 patterns to process 100 MB of data:

before patch: 27.09 sec
after patch:  26.78 sec, 1.1% faster.
(best of 4 tests)
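
The percentages quoted in this message are all relative to the pre-patch time. A small sketch of that arithmetic, using a hypothetical helper `percent_change`:

```python
def percent_change(before, after):
    """Return the change from before to after as a percentage of before.

    Positive means the patched build took longer (slower),
    negative means it took less time (faster).
    """
    return (after - before) / before * 100

# Timings copied from the three measurements in this message.
print(f"{percent_change(316, 324):+.1f}%")      # ns, infrequent alloc/free
print(f"{percent_change(7.16, 5.82):+.1f}%")    # sec, very frequent alloc/free
print(f"{percent_change(27.09, 26.78):+.1f}%")  # sec, 17 patterns over 100 MB
```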

History
Date                 User   Action  Args
2019-03-04 13:15:34  malin  set     recipients: + malin, vstinner, ezio.melotti, mrabarnett, abacabadabacaba, serhiy.storchaka, alexei.romanov
2019-03-04 13:15:34  malin  set     messageid: <1551705334.03.0.403109792165.issue23689@roundup.psfhosted.org>
2019-03-04 13:15:34  malin  link    issue23689 messages
2019-03-04 13:15:33  malin  create