Author tim.peters
Recipients
Date 2006-07-02.13:13:38
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to
Content
Logged In: YES 
user_id=31435

Yes, it's easy to provoke exponential-time behavior.  For a
simple example, the regexp

^((a+)*)*$

takes O(3**n) time to fail to match strings of the form

"a"*n + "b"

Python's matcher is a backtracking engine, much like Perl's,
and most other languages' non-POSIX re facilities.  There's
nary a DFA in sight ;-)  Jeffrey Friedl's thick O'Reilly
book "Mastering Regular Expressions" is mostly about the
pragmatics of using such engines efficiently:

http://regex.info/

Note that there's no current hope that this will change: 
because of gimmicks like backreferences, these aren't
CompSci's idea of regular expressions, and no "thoroughly
efficient" implementation technique is known.  For example,
this teensy regexp:

^(aa+)\\1+$

matches strings of a's whose length isn't prime, and finds a
non-trivial factor when the length is composite.  For
harder-to-solve but messier examples:

http://perl.plover.com/NPC/
History
Date User Action Args
2007-08-23 14:40:58adminlinkissue1515829 messages
2007-08-23 14:40:58admincreate