classification
Title: Computed-goto patch for RE engine
Type: performance Stage: patch review
Components: Regular Expressions Versions: Python 3.4
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: akuchling, cdleary, ezio.melotti, georg.brandl, loewis, pitrou
Priority: normal Keywords: needs review, patch

Created on 2009-12-29 02:11 by akuchling, last changed 2013-11-06 20:23 by akuchling. This issue is now closed.

Files
File name Uploaded Description Edit
re-computed-goto-v1.txt akuchling, 2009-12-29 02:12 re-computed-goto-v1.txt
Messages (6)
msg96984 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2009-12-29 02:11
Part of Unladen Swallow's roadmap is to use a threaded-interpreter
technique for the regular expression engine.  That sounded like an
interesting idea, so I went ahead and tried to implement it.

The current patch is attached.  To try it: run configure
--with-computed-gotos; apply the patch; and compile Python.

Still to do:

* Benchmarking, to determine if it's actually an improvement.

* Possibly integrate construction of the Modules/re_opcodes.h file into
the build process.  The current file is supplied; to rebuild it, run
"python Modules/makereopcodes.py > Modules/re_opcodes.h". (But it only
needs rebuilding if you add new regex opcodes, which is rarely done --
maybe it's sufficient to include a reminder to update it plus the script).
msg97038 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-12-30 16:36
On the principle nothing looks wrong. There are some tabs-vs-spaces
issues in _sre.c. Can some out-of-bounds crash be triggered if an opcode
is greater than 33?

It needs some benchmarks to know whether it's efficient. Also, I think
it would be nice to include regeneration of the opcode table in the
build process. Since it's very light-weight it can be done
unconditionally (I suppose it will be done from setup.py, right?).
msg97097 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-12-31 14:54
_sre is listed in Modules/Setup, so it will be a built-in module by default.
msg99471 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2010-02-17 14:36
I finally got around to benchmarking this change, and unfortunately the results are not good.

I used the regex tests in the Unladen Swallow test suite, regex_effbot and regex_v8.  The tests are written for Python 2.x, but the fixes for 3.x are straightforward (use print() in one function; replace xrange with range in the bm_regex_effbot.py and bm_regex_v8.py files; remove a few uses of u''; I'll provide a patch later.)

Hardware: MacBook, Intel Core 2 Duo, 1.83GHz, 2MB L2 cache, 667 MHz bus.

Tests invoked with ./perf.py -b regex_effbot -r -v ../py3k/python.exe ../threaded-3000/python.exe

regex_effbot is 1.1002 times slower with the computed-goto patch, and 
regex_v8 is 1.0081x slower.  perf.py indicates that both results are significant.

I'd like to see a few people replicate these results -- maybe the effect is very platform dependent or I ran the tests incorrectly -- but on current evidence, this patch is not worth pursuing.
msg99473 - (view) Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2010-02-17 15:13
Actually, I really want someone to verify that measurement.  As a control, I tried running the call_method benchmark (after a few more xrange fixes).  The Python 3.x trunk version with my patch is measured as 1.0227x slower, even though the patch only touches the re module and call_method doesn't use the module at all.  I recompiled both binaries; both builds are using the same compiler arguments; both have the same version from trunk.  I'm mystified about why the patched version is slower.
msg99478 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-02-17 18:50
You should disassemble the output (or produce assembler from gcc) and check that the various indirect jumps at the end of each case block don't get merged into a single shared indirect jump.

Or perhaps it's simply that regular expression matching isn't really sensitive to bytecode dispatch overhead.
History
Date User Action Args
2013-11-06 20:23:25akuchlingsetstatus: open -> closed
resolution: wont fix
2012-07-25 22:46:32ezio.melottisetversions: + Python 3.4, - Python 3.2
2010-05-28 23:30:55cdlearysetnosy: + cdleary
2010-02-17 18:50:13pitrousetmessages: + msg99478
2010-02-17 15:13:57akuchlingsetmessages: + msg99473
2010-02-17 14:36:51akuchlingsetmessages: + msg99471
2009-12-31 14:54:16georg.brandlsetnosy: + georg.brandl
messages: + msg97097
2009-12-30 16:36:48pitrousetmessages: + msg97038
2009-12-30 16:19:50pitrousetnosy: + pitrou
2009-12-30 15:48:22georg.brandlsettitle: Computed-goto patch -> Computed-goto patch for RE engine
2009-12-29 07:07:28loewissetnosy: + loewis
2009-12-29 02:15:57ezio.melottisetnosy: + ezio.melotti
priority: normal
keywords: + patch, needs review
type: performance
stage: patch review
2009-12-29 02:12:06akuchlingsetfiles: + re-computed-goto-v1.txt
2009-12-29 02:11:30akuchlingcreate