Author steven.daprano
Recipients steve.newcomb, steven.daprano
Date 2018-12-16.00:58:44
Message-id <1544921926.07.0.788709270274.issue35496@psf.upfronthosting.co.za>
In-reply-to
Content
> See attached script, which is self-explanatory.

I'm glad one of us thinks so, because I find it clear as mud.

I spent *way* longer on this than I should have, but I simplified your sample code to the best of my ability. (See attached.) As far as I can tell, your code and mine do roughly the same thing, but please check that you agree.

I agree that with the IPV6 portion of the regex removed, it matches "208.123.4.22", and with the IPV6 portion included, it matches "::ffff:208.123.4.22". But I'm not sure that's a bug. I think it is working as designed. For example:


py> import re
py> text = 'green pepper'
py> re.search('pepper|green pepper', text).group(0)
'green pepper'


seems to be analogous to your example, but simpler. Do you agree? If not, it would also help a lot if you could find a simpler regex that demonstrates the issue. See http://www.sscce.org/

In your case, I believe that the rightmost alternative (the IPV6 check) first matches at position 1 of the text, while the leftmost alternative (the IPV4 check) doesn't match until position 8. The search scans forward from position 0, so it reaches the IPV6 match first, and that match wins.
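
To make those positions concrete, here is a rough sketch. The two patterns are deliberately simplified stand-ins that I made up for illustration, not your actual regex, and the sample text with a single leading space is also an assumption, chosen only so that the matches land on positions 1 and 8.

```python
import re

# Hypothetical, simplified stand-ins for the two alternatives
# (NOT the regex from the attached script).
IPV4 = r'\d{1,3}(?:\.\d{1,3}){3}'
IPV6_MAPPED = r'::ffff:' + IPV4  # only the IPv4-mapped form, for brevity

# Assumed sample text: one leading space, then the address.
text = ' ::ffff:208.123.4.22'

ipv4_only = re.search(IPV4, text)
ipv6_only = re.search(IPV6_MAPPED, text)
# IPV4 is the leftmost alternative, IPV6_MAPPED the rightmost.
combined = re.search(IPV4 + '|' + IPV6_MAPPED, text)

# Report where each pattern first matches, and what it matched.
print(ipv4_only.start(), ipv4_only.group(0))
print(ipv6_only.start(), ipv6_only.group(0))
print(combined.start(), combined.group(0))
```

On its own, the IPV4 pattern first matches at position 8. The combined pattern never gets that far, because the IPV6 alternative already succeeds at position 1.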

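It is possible you were expecting that the IPV4 check would be tested against position 0, then position 1, then position 2, then ... and so on until the end of the string, and only then the IPV6 check tested against position 0, then 1 etc. That is not how alternation works: at each position, the engine tries the alternatives from left to right, and only moves on to the next position when none of them match.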
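
If that "one pattern at a time" behaviour is what you actually want, one way to get it is to run the searches separately instead of joining them with "|". This is only a sketch: the helper name search_in_priority_order is mine, and the patterns and sample text are simplified assumptions rather than the ones from your script.

```python
import re

def search_in_priority_order(patterns, text):
    """Scan the whole text with each pattern in turn.

    Returns the match for the first pattern that matches anywhere,
    or None if no pattern matches at all.
    """
    for pattern in patterns:
        match = re.search(pattern, text)
        if match is not None:
            return match
    return None

# Hypothetical, simplified patterns (not the original regex).
IPV4 = r'\d{1,3}(?:\.\d{1,3}){3}'
IPV6_MAPPED = r'::ffff:' + IPV4

text = ' ::ffff:208.123.4.22'  # assumed sample text

prioritised = search_in_priority_order([IPV4, IPV6_MAPPED], text)
alternation = re.search(IPV4 + '|' + IPV6_MAPPED, text)

print(prioritised.group(0))
print(alternation.group(0))
```

Here the separate searches give "208.123.4.22", because the IPV4 pattern gets to scan the whole string first, while the single alternation gives "::ffff:208.123.4.22".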