Author glchapman
Recipients
Date 2003-06-16.18:44:34
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to
Content
Logged In: YES 
user_id=86307

FWIW, one way to efficiently match this sort of pattern 
(where you have literal text at the end of something 
complex) would be to use the match once and don't 
backtrack operator ('(?>pattern)').  SRE doesn't have the 
(?> operator, but it can be emulated by '(?=(pattern))\1'.  
So one way to avoid the exponential time would be to use 
something like this:

rx = re.compile(
   r'''^UPDATE\s+\w+\s+SET\s+locked_until\s*=\s*
   (?=(
       ([^=,]+\s*)+
   ))\1
   ,         #trailing literal
   ''', 
   re.I | re.X
)

By the way, it would be fairly easy to add the (?> operator to 
SRE; it's almost identical to a postive lookahead assertion.  
The only difference is that, upon successfully matching the 
subexpression inside the parentheses, the position in the 
string being matched is advanced to the end of the text 
matched by the subexpression.

Also, in case anyone's interested in why Perl is able to fail so 
quickly here, it appears that the Perl engine keeps track of 
positions (in the target text) where something like 
'(pattern)+' has already been tried and has failed, so it can 
quickly fail if backtracking causes an attempt to match again 
at that position (at least that's my interpretation of the 
numerous "already tried at this position..." messages in the 
trace from running the supplied pattern and text with "use 
re 'debug';").
History
Date User Action Args
2007-08-23 14:13:52adminlinkissue753711 messages
2007-08-23 14:13:52admincreate