
Author Dennis Sweeney
Recipients Dennis Sweeney, Zeturic, ammar2, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-08.11:06:59
Indeed, this is just a very unlucky case.

    >>> n = len(longer)
    >>> from collections import Counter
    >>> Counter(s[:n])
    Counter({0: 9056995, 255: 6346813})
    >>> s[n-30:n+30].replace(b'\x00', b'.').replace(b'\xff', b'@')
    >>> Counter(s[n:])
    Counter({255: 18150624})

When checking "base", we're in this situation:

    pattern:     @@@@@@@@
     string:     .........@@@@@@@@
    Algorithm says:     ^ these last characters don't match.
                         ^ this next character is not in the pattern
                         Therefore, skip ahead a bunch:

     pattern:              @@@@@@@@
      string:     .........@@@@@@@@

     This is a match!

Whereas when checking "longer", we're in this situation:

    pattern:     @@@@@@@@@
     string:     .........@@@@@@@@
    Algorithm says:      ^ these last characters don't match.
                          ^ this next character *is* in the pattern.
                          We can't jump forward.

     pattern:       @@@@@@@@@
      string:     .........@@@@@@@@

     Start comparing at every single alignment...
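
The two diagrams can be replayed with a small Python model. This is only a sketch of the skip rule described above, not the actual C implementation: the function name `find_with_skip` is made up for illustration, a plain `set` stands in for whatever membership test the real code uses, and the inputs are the toy strings from the diagrams rather than the real data.

```python
def find_with_skip(string: bytes, pattern: bytes):
    """Simplified model of the skip rule; returns (index, alignments_tried).

    Hypothetical helper for illustration only, not CPython's real code.
    """
    m, n = len(pattern), len(string)
    chars = set(pattern)  # assumption: exact membership test
    tried = 0
    i = 0
    while i <= n - m:
        tried += 1
        # Check the last character of the window first, then the rest.
        if string[i + m - 1] == pattern[-1] and string[i:i + m] == pattern:
            return i, tried
        # Mismatch: look at the character just past the current window.
        if i + m < n and string[i + m] not in chars:
            # No alignment that covers that character can match,
            # so jump past it entirely.
            i += m + 1
        else:
            # Nothing can be ruled out; move on by a single position.
            i += 1
    return -1, tried


string = b"." * 9 + b"@" * 8
print(find_with_skip(string, b"@" * 8))  # "base":   (9, 2)
print(find_with_skip(string, b"@" * 9))  # "longer": (-1, 9)
```

With the 8-byte pattern the model jumps straight to the match after one failed alignment. With the 9-byte pattern the character after the window is always in the pattern, so every one of the 9 possible alignments is tried.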

I'm attaching a script that replicates this from scratch, without loading data from a file.

Date                 User            Action  Args
2020-10-08 11:06:59  Dennis Sweeney  set     recipients: + Dennis Sweeney, tim.peters, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, Zeturic
2020-10-08 11:06:59  Dennis Sweeney  set     messageid: <>
2020-10-08 11:06:59  Dennis Sweeney  link    issue41972 messages
2020-10-08 11:06:59  Dennis Sweeney  create