Author serhiy.storchaka
Recipients ezio.melotti, mrabarnett, serhiy.storchaka, vstinner
Date 2013-10-24.19:24:57
Message-id <1382642699.0.0.831034299627.issue19329@psf.upfronthosting.co.za>
Content
Here is a more complex patch which optimizes charset compiling. It affects small charsets too. Big charsets now support the same optimizations as small charsets. The optimized bitmap can now be used even if the charset contains category items or non-BMP characters.
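
For readers unfamiliar with the term, the following is a minimal sketch of what a charset bitmap is in general: one flag per code point, packed into fixed-size words so that a membership test is a shift and a mask. It is not the code from the patch. The helper names (`build_bitmap`, `pack_words`, `contains`), the 256-entry size and the 32-bit word width are all assumptions chosen for illustration.

```python
def build_bitmap(codepoints, size=256):
    """Return a bytearray holding one 0/1 flag per code point below `size`."""
    bitmap = bytearray(size)
    for cp in codepoints:
        if 0 <= cp < size:
            bitmap[cp] = 1
    return bitmap


def pack_words(bitmap, word_bits=32):
    """Pack the per-code-point flags into integers of `word_bits` bits each."""
    words = []
    for start in range(0, len(bitmap), word_bits):
        word = 0
        for offset, flag in enumerate(bitmap[start:start + word_bits]):
            if flag:
                word |= 1 << offset
        words.append(word)
    return words


def contains(words, cp, word_bits=32):
    """Test membership with one index, one shift and one mask."""
    if not 0 <= cp < len(words) * word_bits:
        return False
    return bool((words[cp // word_bits] >> (cp % word_bits)) & 1)


# Hypothetical example: the charset [0-9] as a packed bitmap.
digits = pack_words(build_bitmap(range(ord("0"), ord("9") + 1)))
```

With this layout `contains(digits, ord("5"))` is true and `contains(digits, ord("a"))` is false, and the cost of the test does not depend on how many items the charset had.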

$ ./python -m timeit "from sre_compile import compile; r = '[0-9]+'"  "compile(r, 0)"
Unpatched: 1000 loops, best of 3: 457 usec per loop
Patched: 1000 loops, best of 3: 368 usec per loop
$ ./python -m timeit "from sre_compile import compile; r = '[ \t\n\r\v\f]+'"  "compile(r, 0)"
Unpatched: 1000 loops, best of 3: 490 usec per loop
Patched: 1000 loops, best of 3: 413 usec per loop
$ ./python -m timeit "from sre_compile import compile; r = '[0-9A-Za-z_]+'"  "compile(r, 0)"
Unpatched: 1000 loops, best of 3: 760 usec per loop
Patched: 1000 loops, best of 3: 527 usec per loop
$ ./python -m timeit "from sre_compile import compile; r = r'[^\ud800-\udfff]*'"  "compile(r, 0)"
Unpatched: 100 loops, best of 3: 2.07 msec per loop
Patched: 1000 loops, best of 3: 1.44 msec per loop
$ ./python -m timeit "from sre_compile import compile; r = '[\u0410-\u042f\u0430-\u043f\u0404\u0406\u0407\u0454\u0456\u0457\u0490\u0491]+'"  "compile(r, 0)"
Unpatched: 100 loops, best of 3: 8.24 msec per loop
Patched: 100 loops, best of 3: 2.13 msec per loop
$ ./python -m timeit "from sre_compile import compile; r = '[%s]' % ''.join(map(chr, range(256, 2**16, 255)))"  "compile(r, 0)"
Unpatched: 10 loops, best of 3: 119 msec per loop
Patched: 10 loops, best of 3: 24.1 msec per loop
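
The same measurements can be approximated from a script using only the public `re` and `timeit` APIs. This is a sketch under assumptions, not the method used above: it calls `re.compile` after `re.purge()` instead of `sre_compile.compile(r, 0)`, so each timing also includes clearing the cache, and the absolute numbers will not match the ones quoted. The names `compile_uncached`, `best_time` and `PATTERNS` are hypothetical.

```python
import re
import timeit

# The six patterns from the timeit invocations above, with short labels.
PATTERNS = {
    "digits": "[0-9]+",
    "whitespace": "[ \t\n\r\v\f]+",
    "word chars": "[0-9A-Za-z_]+",
    "non-surrogates": r"[^\ud800-\udfff]*",
    "cyrillic": "[\u0410-\u042f\u0430-\u043f\u0404\u0406\u0407"
                "\u0454\u0456\u0457\u0490\u0491]+",
    "sparse BMP": "[%s]" % "".join(map(chr, range(256, 2**16, 255))),
}


def compile_uncached(pattern):
    """Compile `pattern` after emptying the re cache, so work is redone."""
    re.purge()
    return re.compile(pattern)


def best_time(pattern, number=10, repeat=3):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: compile_uncached(pattern))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for label, pattern in PATTERNS.items():
        print(f"{label:>15}: {best_time(pattern) * 1e6:10.1f} usec per compile")
```

Running the script once on an unpatched build and once on a patched build gives a side-by-side comparison in the same spirit as the figures above.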
History
Date                 User              Action  Args
2013-10-24 19:24:59  serhiy.storchaka  set     recipients: + serhiy.storchaka, vstinner, ezio.melotti, mrabarnett
2013-10-24 19:24:59  serhiy.storchaka  set     messageid: <1382642699.0.0.831034299627.issue19329@psf.upfronthosting.co.za>
2013-10-24 19:24:58  serhiy.storchaka  link    issue19329 messages
2013-10-24 19:24:58  serhiy.storchaka  create