Author tim.peters
Recipients kleshni, tim.peters
Date 2020-05-04.00:00:38
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1588550438.45.0.0121375131274.issue40480@roundup.psfhosted.org>
In-reply-to
Content
"The trick" with this kind of pattern is that a `*` should consume as little as possible to allow the next fixed portion to match.  Apply that at every stage, and it succeeds or it doesn't.  Backtracking can't change that outcome.  For, e.g., the shell pattern:

a*bc*d*

a regexp to do this without backtracking would be like this, IF Python's re engine supported "atomic groups" (but it doesn't):

a(?>.*?bc)(?>.*?d).*\Z

The same effect can be gotten in a more convoluted way, via positive lookahead assertions and backreferencing (which Python's re engine does support):

a(?=(.*?bc))\1(?=(.*?d))\2.*\Z

Assertions are "one and done":  if the overall match fails, assertions don't try alternatives.

So that's _a_ way to proceed.  I'm unclear on that it's worth it, though.  Stuff like "*a*a*a*a*a*a*" is just hard to swallow as a shell pattern that would occur in real life.
History
Date User Action Args
2020-05-04 00:00:38tim.peterssetrecipients: + tim.peters, kleshni
2020-05-04 00:00:38tim.peterssetmessageid: <1588550438.45.0.0121375131274.issue40480@roundup.psfhosted.org>
2020-05-04 00:00:38tim.peterslinkissue40480 messages
2020-05-04 00:00:38tim.peterscreate