Author mrabarnett
Recipients akitada, akuchling, amaury.forgeotdarc, collinwinter, doerwalter, ezio.melotti, georg.brandl, gregory.p.smith, jaylogan, jimjjewett, loewis, mark, moreati, mrabarnett, nneonneo, pitrou, rsc, timehorse
Date 2009-07-26.19:11:51
Message-id <1248635513.59.0.91169053599.issue2636@psf.upfronthosting.co.za>
Content
issue2636-20090726.zip is a new implementation of the re engine. On the
Python side it replaces re.py, sre.py, sre_constants.py, sre_parse.py and
sre_compile.py with a single new re.py. On the C side it replaces
sre_constants.h, sre.h and _sre.c with _re.h and _re.c.

The internal engine no longer interprets a form of bytecode. Instead it
follows a linked set of nodes, and it can search breadth-first as well as
depth-first, which makes it perform much better on 'pathological' regexes
(patterns that cause excessive backtracking in the existing engine).
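
To illustrate why breadth-wise matching helps, here is a minimal sketch of
matching over linked nodes while tracking every live node at once. It is not
the code in the zip file: Node, build_pattern, add_state and fullmatch are
hypothetical names chosen for this example, and the pattern (n optional 'a's
followed by n required 'a's) is a well-known case that can make a
backtracking matcher take exponential time.

```python
class Node:
    """One node in a linked pattern graph (hypothetical, for illustration)."""

    def __init__(self, kind, char=None):
        self.kind = kind    # "char", "split" or "match"
        self.char = char    # the literal to consume, for "char" nodes
        self.next = None    # main successor
        self.alt = None     # second successor, used by "split" nodes only


def build_pattern(n):
    """Link the nodes for n optional 'a's followed by n required 'a's."""
    head = Node("match")
    # Required 'a' nodes, linked back to front.
    for _ in range(n):
        node = Node("char", "a")
        node.next = head
        head = node
    # Optional 'a' nodes: a split either consumes an 'a' or skips it.
    for _ in range(n):
        char = Node("char", "a")
        char.next = head
        split = Node("split")
        split.next = char
        split.alt = head
        head = split
    return head


def add_state(node, states):
    """Add a node to the live set, following split nodes immediately."""
    if node in states:
        return
    states.add(node)
    if node.kind == "split":
        add_state(node.next, states)
        add_state(node.alt, states)


def fullmatch(start, text):
    """Advance all live nodes together, one input character at a time."""
    current = set()
    add_state(start, current)
    for ch in text:
        following = set()
        for node in current:
            if node.kind == "char" and node.char == ch:
                add_state(node.next, following)
        current = following
    return any(node.kind == "match" for node in current)


matched = fullmatch(build_pattern(25), "a" * 25)
```

Each input character is checked once per live node, so the work grows with
len(text) times the number of nodes rather than with the number of ways the
optional parts can be chosen.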

It supports scoped flags, variable-length lookbehind, Unicode
properties, named characters, atomic groups and possessive quantifiers,
and it handles zero-width splits correctly when the ZEROWIDTH flag is set.
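
As a rough illustration of two of these features, the snippet below uses the
re module that ships with recent Python versions, not the attached
implementation. The assumptions are that scoped flags use the (?i:...) syntax
(available in the stdlib since Python 3.6) and that re.split splits on
zero-width matches by default (Python 3.7 and later), so no ZEROWIDTH flag
appears here.

```python
import re

# Scoped flag: IGNORECASE applies only inside the (?i:...) group.
scoped = re.compile(r"(?i:hello) World")
inside_group = scoped.fullmatch("HELLO World") is not None
outside_group = scoped.fullmatch("HELLO WORLD") is not None

# Zero-width split: the lookahead matches an empty string before each
# uppercase letter, so the text is cut there without consuming anything.
parts = re.split(r"(?=[A-Z])", "oneTwoThree")

print(inside_group, outside_group)  # True False
print(parts)                        # ['one', 'Two', 'Three']
```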

A few things remain to be added, such as indexing for capture groups,
and further speed improvements might be possible (at worst it is
roughly as fast as the existing re module).
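
Assuming "indexing for capture groups" means subscripting a match object
instead of calling group(), the sketch below shows what that looks like with
the stdlib re module in Python 3.6 and later. It illustrates the idea only
and says nothing about how the attached implementation will expose it.

```python
import re

match = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2009-07-26")

# The method call and the subscript form return the same captured text.
by_method = (match.group(1), match.group("month"))
by_index = (match[1], match["month"])

print(by_method == by_index)  # True
```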

I'll add documentation later describing how it works and the slight
differences in behaviour.