Message378233
Indeed, this is just a very unlucky case.
>>> n = len(longer)
>>> from collections import Counter
>>> Counter(s[:n])
Counter({0: 9056995, 255: 6346813})
>>> s[n-30:n+30].replace(b'\x00', b'.').replace(b'\xff', b'@')
b'..............................@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@'
>>> Counter(s[n:])
Counter({255: 18150624})
When checking "base", we're in this situation:

    pattern: @@@@@@@@
     string: .........@@@@@@@@
Algorithm says:     ^ these last characters don't match.
                     ^ this next character is not in the pattern.
Therefore, skip ahead a bunch:

    pattern:          @@@@@@@@
     string: .........@@@@@@@@

This is a match!

Whereas when checking "longer", we're in this situation:

    pattern: @@@@@@@@@
     string: .........@@@@@@@@
Algorithm says:      ^ these last characters don't match.
                      ^ this next character *is* in the pattern.
We can't jump forward.

    pattern:  @@@@@@@@@
     string: .........@@@@@@@@

Start comparing at every single alignment...
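
To make the two situations above concrete, here is a simplified sketch of that kind of skip rule. It is not CPython's actual search code: the function name find_with_skips is made up for illustration, a plain set stands in for whatever membership filter the real implementation uses, and the toy string mirrors the diagrams rather than the real data. It returns the match index together with the number of alignments it had to try.

```python
def find_with_skips(s, p):
    """Return (index, alignments_tried); index is -1 if p is not in s.

    Illustration only: a simplified skip rule, not CPython's implementation.
    """
    n, m = len(s), len(p)
    if m == 0:
        return 0, 0
    in_pattern = set(p)   # assumption: a set stands in for the real filter
    last = p[-1]
    tried = 0
    i = 0
    while i <= n - m:
        tried += 1
        # Check the last character first, then the whole alignment.
        if s[i + m - 1] == last and s[i:i + m] == p:
            return i, tried
        # Mismatch: look at the character just past this alignment.
        if i + m < n and s[i + m] not in in_pattern:
            # Every alignment covering that character must fail: jump past it.
            i += m + 1
        else:
            # The next character *is* in the pattern: we can't jump forward.
            i += 1
    return -1, tried


s = b"." * 9 + b"@" * 8
print(find_with_skips(s, b"@" * 8))  # (9, 2): one big jump, then a match
print(find_with_skips(s, b"@" * 9))  # (-1, 9): every alignment is tried
```

On the toy string the difference is only 2 versus 9 alignments, but each alignment whose last character matches can also cost a long comparison, which is how a multi-megabyte input can get so slow.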
I'm attaching reproducer.py, which replicates this from scratch without loading data from a file.

Date | User | Action | Args
2020-10-08 11:06:59 | Dennis Sweeney | set | recipients: + Dennis Sweeney, tim.peters, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, Zeturic
2020-10-08 11:06:59 | Dennis Sweeney | set | messageid: <1602155219.38.0.293310549875.issue41972@roundup.psfhosted.org>
2020-10-08 11:06:59 | Dennis Sweeney | link | issue41972 messages
2020-10-08 11:06:59 | Dennis Sweeney | create |