Author mrabarnett
Recipients Alcolo Alcolo, ezio.melotti, martin.panter, mrabarnett, r.david.murray, serhiy.storchaka
Date 2017-12-02.21:29:18
Message-id <1512250158.59.0.213398074469.issue25054@psf.upfronthosting.co.za>
In-reply-to
Content
The pattern:

    \b|:+

will match a word boundary (zero-width) before colons, so if there's a word followed by colons, finditer will find the boundary and then the colons. You _can_ get a zero-width match (ZWM) joined to the start of a nonzero-width match (NWM). That's not really surprising.
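To make that concrete, here is a small sketch that lists the spans finditer reports for this pattern. The sample string "a::bc" is my own choice for illustration, and the spans in the comment are what I'd expect from re on CPython 3.7 and later; older versions advance differently after a zero-width match and report a shorter colon match.

```python
import re

# Pattern from the message: a word boundary (zero-width) or a run of colons.
pattern = re.compile(r"\b|:+")

# Sample text chosen for illustration: a word followed by colons, then another word.
text = "a::bc"

spans = [m.span() for m in pattern.finditer(text)]
print(spans)
# Expected on CPython 3.7+: [(0, 0), (1, 1), (1, 3), (3, 3), (5, 5)]
# (1, 1) is the ZWM joined to the start of the NWM (1, 3);
# (3, 3) is a ZWM joined to the end of that same NWM.
```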

If you wanted to avoid a ZWM joined to either end of a NWM, you'd have to keep looking for another match at a position even after finding one there, whenever the match you'd found was zero-width. That would also affect re.search and re.match.
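As a rough illustration of what "no ZWM joined to either end of a NWM" would look like from the outside, the sketch below post-filters finditer results. This is not how a regex engine would do it (the engine would have to keep searching at the same position, as described above); the helper name drop_joined_zwms and the sample string are hypothetical, and the expected output assumes re on CPython 3.7 or later.

```python
import re

def drop_joined_zwms(pattern, text):
    """Hypothetical helper: return match spans, dropping zero-width
    matches that touch either end of a nonzero-width match."""
    spans = [m.span() for m in re.finditer(pattern, text)]

    # Collect every position where a nonzero-width match starts or ends.
    edges = set()
    for start, end in spans:
        if start != end:
            edges.update((start, end))

    # Keep nonzero-width matches, and zero-width ones not sitting on an edge.
    return [(s, e) for s, e in spans if s != e or s not in edges]

print(drop_joined_zwms(r"\b|:+", "a::bc"))
# Expected on CPython 3.7+: [(0, 0), (1, 3), (5, 5)]
```

Note that a filter like this only covers finditer-style iteration; it says nothing about re.search or re.match, which return a single match.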

For regex on Python 3.7, I'm going with avoiding a ZWM joined to the end of a NWM, unless re goes a different way, in which case I'll have more work to do to remain compatible! The change I made for Python 3.7+ was trivial.