Author malin
Recipients davisjam, effbot, ezio.melotti, malin, mrabarnett, rhettinger, serhiy.storchaka, tim.peters
Date 2019-03-04.09:59:50
Message-id <1551693591.03.0.481765778361.issue35859@roundup.psfhosted.org>
In-reply-to
Content
Found another bug in re:

>>> re.match(r'(?:.*?\b(?=(\t)|(x))x)*', 'a\txa\tx').groups()
('\t', 'x')

Expected result: (None, 'x')

PHP 7.3.2           NULL, "x"
Java 11.0.2         "\t", "x"
Perl 5.28.1         "\t", "x"
Ruby 2.6.1          nil, "x"
Go 1.12             doesn't support lookaround
Rust 1.32.0         doesn't support lookaround
Node.js 10.15.1     undefined, "x"
regex 2019.2.21     None, "x"
re                  "\t", "x"
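The `re` row above can be re-checked with a short script. This is only a sketch: the variable names are illustrative, and the first group depends on the interpreter in use. The run reported in this message gave '\t', while an interpreter with corrected backtracking would give None.

```python
import re

PATTERN = r'(?:.*?\b(?=(\t)|(x))x)*'
SUBJECT = 'a\txa\tx'

m = re.match(PATTERN, SUBJECT)
groups = m.groups()

# Group 1 is set inside the lookahead when it sees '\t', but the
# literal x that follows then fails, so the engine backtracks.
# If the capture marks are not restored, the stale '\t' survives.
print(sys.version if (sys := __import__('sys')) else None)
print(groups)

if groups[0] is None:
    print("group 1 was reset on backtracking (expected result)")
else:
    print("group 1 kept a stale capture (the behaviour reported here)")
```

Group 2 is 'x' in either case; only group 1 distinguishes the two behaviours.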

This is a very rare bug. It can be fixed by adding MARK_PUSH() before JUMP_MIN_REPEAT_ONE. Other JUMPs may need a MARK_PUSH() as well.
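The idea behind saving marks before a jump can be shown with a toy model. This is not the C code of the re engine: `attempt`, `marks`, `mark_stack` and `failing_branch` are hypothetical names, and the push/restore steps are only an analogy for what a MARK_PUSH() before a jump is meant to achieve.

```python
def attempt(marks, mark_stack, branch):
    """Run branch(marks); restore the capture marks if it fails."""
    mark_stack.append(dict(marks))   # save marks (analogous to a mark push)
    ok = branch(marks)
    saved = mark_stack.pop()
    if not ok:
        marks.clear()
        marks.update(saved)          # restore marks on failure
    return ok


def failing_branch(marks):
    """Set group 1, then fail, like the (\\t) lookahead followed by x."""
    marks[1] = '\t'
    return False


# With the save/restore wrapper, the failed branch leaves no trace.
marks, mark_stack = {}, []
attempt(marks, mark_stack, failing_branch)
print(marks)

# Without it, the stale capture leaks into the final result.
leaky_marks = {}
failing_branch(leaky_marks)
print(leaky_marks)
```

The first dictionary stays empty, while the second keeps the capture from the abandoned path, which is the kind of leftover visible in group 1 above.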

I'm impressed with the regex module; it never went wrong in these tests.
IMHO, a pruned version of it should be adopted into the stdlib.

~~~~~~~~~~~~~~~~~~~~~~
> Interesting sidelights 1
> Found a Perl bug

I reported it to Perl. It is a bug in perl-5.26 and is already fixed in perl-5.28.0.