Author tim.peters
Recipients Hendrik.Lemelson, ezio.melotti, mrabarnett, pitrou, tim.peters
Date 2013-02-20.18:29:08
Message-id <1361384948.44.0.940813275912.issue17257@psf.upfronthosting.co.za>
Content
This is how it's supposed to work:  Python's re matches at the leftmost position possible, and _then_ (since the quantifiers here are greedy) matches as much as it can at that position.  When a regexp _can_ match 0 characters, re.search() will match starting at index 0.  So, e.g.,

>>> re.search('(a*)', 'caaaat').span()
(0, 0)

shows that the regexp matches the empty slice 'caaaat'[0:0] (the leftmost position at which it _can_ match), and

>>> re.search('(a(b+)a){0,1}', 'caabbaat').span()
(0, 0)

shows the same.  The groups didn't match anything in this case, because the outer {0,1} said "it's OK if you can't match anything".  Put another group around it:

>>> re.search('((a(b+)a){0,1})', 'caabbaat').groups()
('', None, None)

to see that the regexp as a whole did match the empty string.
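
For contrast, here is a minimal sketch (assuming a current CPython `re` module; the variable names are only for illustration) that runs the same patterns with the "zero is OK" quantifier swapped for one that requires at least one character.  The search is then forced to move right to the first position where a non-empty match exists:

```python
import re

# '*' lets the group match nothing, so the search succeeds at index 0.
star = re.search(r'(a*)', 'caaaat')
# '+' demands at least one 'a', so the search has to advance past 'c'.
plus = re.search(r'(a+)', 'caaaat')
print(star.span(), plus.span())

# {0,1} makes the whole group optional: empty match at index 0, groups unset.
optional = re.search(r'(a(b+)a){0,1}', 'caabbaat')
# Without {0,1} the group is required, so the match is found further right.
required = re.search(r'(a(b+)a)', 'caabbaat')
print(optional.span(), optional.groups())
print(required.span(), required.groups())
```

So if the intent is "find the first place the group really matches", the quantifier has to rule out the empty match; otherwise index 0 always wins.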