Author serhiy.storchaka
Recipients Alcolo Alcolo, ezio.melotti, martin.panter, mrabarnett, r.david.murray, serhiy.storchaka
Date 2017-12-02.17:37:24
Message-id <1512236244.88.0.213398074469.issue25054@psf.upfronthosting.co.za>
In-reply-to
Content
Good point. Neither the old nor the new behavior (the latter matches regex) conforms to the documentation: "Empty matches are included in the result unless they touch the beginning of another match." It is easy to exclude empty matches that touch the *ending* of another match instead. This would be consistent with the new behavior of split() and sub().

But this would break one existing test for issue817234. That test shouldn't rely on this detail, though; it should just check that iterating doesn't hang.

And this would break a regular expression in pprint.

PR 4678 implements this version. I don't know which version is better.

>>> list(re.finditer(r"\b|:+", "a::bc"))
[<re.Match object; span=(0, 0), match=''>, <re.Match object; span=(1, 1), match=''>, <re.Match object; span=(1, 3), match='::'>, <re.Match object; span=(5, 5), match=''>]
>>> re.sub(r"(\b|:+)", r"[\1]", "a::bc")
'[]a[][::]bc[]'

With PR 4471 the result of re.sub() is the same, but the result of re.finditer() is as in msg307424.
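
For illustration only, the rule "exclude empty matches that touch the ending of another match" can be approximated in pure Python as a filter over re.finditer(). This is a rough sketch, not the change made in PR 4678: the helper name finditer_skip_trailing_empty is hypothetical, and it assumes the re.finditer() behavior of Python 3.7 or later, where an empty match may directly follow a non-empty one.

```python
import re

def finditer_skip_trailing_empty(pattern, string):
    """Hypothetical helper: yield matches from re.finditer(), dropping
    empty matches that start exactly where the previous match ended."""
    prev_end = None
    for m in re.finditer(pattern, string):
        is_empty = m.start() == m.end()
        touches_prev_end = prev_end is not None and m.start() == prev_end
        prev_end = m.end()
        if is_empty and touches_prev_end:
            continue  # empty match touching the ending of another match
        yield m

spans = [m.span() for m in finditer_skip_trailing_empty(r"\b|:+", "a::bc")]
print(spans)  # on Python 3.7+: [(0, 0), (1, 1), (1, 3), (5, 5)]
```

Under those assumptions the spans are the same as in the finditer() output shown above: the empty match at position 3, right after '::', is dropped, while the empty match at position 1 is kept because no earlier match ends there.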