This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author mrabarnett
Recipients LewisGaul, abacabadabacaba, ezio.melotti, mrabarnett, serhiy.storchaka
Date 2019-10-27.18:35:59
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1572201360.12.0.222969488777.issue23692@roundup.psfhosted.org>
In-reply-to
Content
Suppose you had a pattern:

    .*

It would advance one character on each iteration of the * until the . failed to match. The text is finite, so it would stop matching eventually.

Now suppose you had a pattern:

    (?:)*

On each iteration of the * it wouldn't advance, so it would keep matching forever.

A way to avoid that is to stop the * if it hasn't advanced.

The example pattern shows that there's still a problem. It advances if a group has matched, but that group doens't match until the first iteration, after the test, and does not, itself, advance. The * stops because it hasn't advanced, but, in this instance, that doesn't mean it never will.

The solution is for the * to check not only whether it has advanced, but also whether a group has changed. (Strictly speaking, the latter check is needed only if the repeated part tests whether a group also in the repeated part has changed, but it's probably not worth "optimising" for that possibility.)

In the regex module, it increments a "capture changed" counter whenever any group is changed (a group's first match or a change to a group's span). That makes it easier for the * to check. The code needs to save that counter for backtracking and restore it when backtracking.

I've mentioned only the *, but the same remarks apply to + and {...}, except that the {...} should keep repeating until it has reached its prescribed minimum.
History
Date User Action Args
2019-10-27 18:36:00mrabarnettsetrecipients: + mrabarnett, ezio.melotti, abacabadabacaba, serhiy.storchaka, LewisGaul
2019-10-27 18:36:00mrabarnettsetmessageid: <1572201360.12.0.222969488777.issue23692@roundup.psfhosted.org>
2019-10-27 18:36:00mrabarnettlinkissue23692 messages
2019-10-27 18:35:59mrabarnettcreate