Author Ma Lin
Recipients Ma Lin, davisjam, effbot, ezio.melotti, mrabarnett, rhettinger, serhiy.storchaka, tim.peters
Date 2019-02-09.11:39:34
Message-id <1549712374.8.0.845069487526.issue35859@roundup.psfhosted.org>
In-reply-to
Content
For a capture group, the state->mark[] array stores its begin and end positions:
begin: state->mark[(group_number-1)*2]
end:   state->mark[(group_number-1)*2+1]

So state->mark[0] is the begin of the first capture group.
state->mark[1] is the end of the first capture group.
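
The index arithmetic above can be written as a small helper. This is only an illustrative Python sketch; mark_indices is a hypothetical name and does not exist in the sre source.

```python
def mark_indices(group_number):
    """Return the (begin, end) indices into state->mark[] for a 1-based group number."""
    begin = (group_number - 1) * 2
    return begin, begin + 1


print(mark_indices(1))  # (0, 1)
print(mark_indices(2))  # (2, 3)
```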

Consider re.search(r'(ab|a)*?b', 'ab').
For this case, here is a simplified record of the matching actions:

01  MARK 0
02  "a":  first "a" in the pattern [SUCCESS]
03  BRANCH
04    "b": first "b" in the pattern [SUCCESS]
05    MARK 1
06    "b": second "b" in the pattern [FAIL]
07    try next (ab|a)*? [FAIL]
08      MARK 0
09      "a":  first "a" in the pattern [FAIL]
10  BRANCH: try next branch
11    "": the second branch [SUCCESS]
12    MARK 1
13    "b": second "b" in the pattern [SUCCESS]
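
To observe the case on a given interpreter, a minimal script like the following may help. The helper name inspect_case is hypothetical. The span reported for group 1 depends on whether the interpreter contains this bug, so no expected value is given for it.

```python
import re


def inspect_case():
    """Run the pattern from this message and return (whole match, span of group 1)."""
    m = re.search(r'(ab|a)*?b', 'ab')
    return m.group(0), m.span(1)


whole, group1_span = inspect_case()
print(whole)        # the overall match
print(group1_span)  # span of capture group 1; version dependent
```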

The MARK_PUSH(lastmark) macro does not protect MARK-0 when it is the only available mark, yet the BRANCH op relies on this macro to protect capture groups before trying a branch.

Because the MARK-0 written at line 08 is never rolled back when that attempt fails, capture group 1 ends up as [MARK-0 at line 08, MARK-1 at line 12), which is wrong.
The correct capture group 1 should be [MARK-0 at line 01, MARK-1 at line 12).
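
The two outcomes can be replayed in a toy model of the actions record. This is a Python sketch, not the C engine: replay and protect_single_mark are hypothetical names, the string positions are derived by hand from matching against 'ab', and the save condition (skip the save when only mark 0 is set) is an assumption based on the description in this message.

```python
def replay(protect_single_mark):
    """Replay the simplified actions record on 'ab'; return (begin, end) of group 1."""
    marks = [None, None]  # toy stand-in for state->mark[0..1]
    lastmark = -1         # highest mark index set so far

    # line 01: MARK 0 at string position 0
    marks[0] = 0
    lastmark = 0
    # line 02: "a" matches, position becomes 1

    # line 03: BRANCH saves the marks before trying the first branch.
    # Assumed behaviour: without protection, nothing is saved when
    # mark 0 is the only mark set (lastmark == 0).
    min_lastmark = 0 if protect_single_mark else 1
    saved = list(marks[:lastmark + 1]) if lastmark >= min_lastmark else None

    # lines 04-05: "b" matches (position 2), MARK 1 at position 2
    marks[1] = 2
    # line 06: trailing "b" fails at position 2
    # lines 07-08: try another repetition, MARK 0 at position 2
    marks[0] = 2
    # line 09: "a" fails at position 2

    # line 10: first branch failed; restore whatever was saved
    if saved is not None:
        marks[:len(saved)] = saved

    # lines 11-12: empty second branch, MARK 1 at position 1
    marks[1] = 1
    # line 13: trailing "b" matches
    return tuple(marks)


print(replay(protect_single_mark=False))  # (2, 1): begin comes from line 08
print(replay(protect_single_mark=True))   # (0, 1): begin comes from line 01
```

In this model the only difference between the two runs is whether the single mark is saved at line 03, which is enough to change the begin of group 1.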