Author Ma Lin
Recipients Ma Lin, davisjam, effbot, ezio.melotti, mrabarnett, rhettinger, serhiy.storchaka, tim.peters
Date 2019-03-04.09:59:50
Message-id <1551693591.03.0.481765778361.issue35859@roundup.psfhosted.org>
Content
Found another bug in re:

>>> re.match(r'(?:.*?\b(?=(\t)|(x))x)*', 'a\txa\tx').groups()
('\t', 'x')

Expected result: (None, 'x')
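A minimal sketch for re-running this case on whatever Python is at hand. The names `PATTERN`, `SUBJECT` and `check_groups` are made up for illustration, and no particular result is assumed: the script only reports whether group 1 was reset to None (expected) or still holds the '\t' captured on an attempt that was later backtracked.

```python
import re

# Pattern and subject string taken verbatim from the report above.
PATTERN = r'(?:.*?\b(?=(\t)|(x))x)*'
SUBJECT = 'a\txa\tx'


def check_groups(pattern=PATTERN, subject=SUBJECT):
    """Return (groups, status) for the test case in this message.

    status is 'expected' when group 1 is None, 'stale group 1' when
    group 1 still holds the tab from a backtracked lookahead attempt,
    and 'unexpected' for anything else.
    """
    match = re.match(pattern, subject)
    groups = match.groups()
    if groups == (None, 'x'):
        status = 'expected'
    elif groups == ('\t', 'x'):
        status = 'stale group 1'
    else:
        status = 'unexpected'
    return groups, status


if __name__ == '__main__':
    groups, status = check_groups()
    print(repr(groups), status)
```

The overall match is the whole string in both cases; only the content of group 1 differs.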

PHP 7.3.2           NULL, "x"
Java 11.0.2         "\t", "x"
Perl 5.28.1         "\t", "x"
Ruby 2.6.1          nil, "x"
Go 1.12             doesn't support lookaround
Rust 1.32.0         doesn't support lookaround
Node.js 10.15.1     undefined, "x"
regex 2019.2.21     None, "x"
re                  "\t", "x"

This is a very rare bug. It can be fixed by adding MARK_PUSH() before JUMP_MIN_REPEAT_ONE. Other JUMPs may need a MARK_PUSH() as well.

I'm impressed with the regex module; it never went wrong.
IMHO, I would like to see a pruned version of it adopted into the stdlib.

~~~~~~~~~~~~~~~~~~~~~~
> Interesting sidelights 1
> Found a Perl bug

I reported it to Perl. It is a bug in perl-5.26 and was already fixed in perl-5.28.0.