This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author Dennis Sweeney
Recipients Dennis Sweeney, Zeturic, ammar2, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-10.08:25:19
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <>
I agree that skip could could do 1 better.


> I don't know whether it "should be" applied

I don't think I'm convinced: the second check fixes only the very specific case when s[len(p):].startswith(p).
Perturbations of are still very slow with the patch:

    - pattern2 = b"B" * BIG
    + pattern2 = b"B" * (BIG + 10)

In fact, it's even slower than before due to the double-checking.


I'd be curious how something like Crochemore and Perrin's "two-way" algorithm would do (constant space, compares < 2n-m):

Apparently it's been used as glibc's memmem() since 2008. There, it additionally computes a table, but only for long needles where it really pays off:

Although there certainly seems to be a complexity rabbithole here that would be easy to over-do.
I might look more into this.
Date User Action Args
2020-10-10 08:25:19Dennis Sweeneysetrecipients: + Dennis Sweeney, tim.peters, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, Zeturic
2020-10-10 08:25:19Dennis Sweeneysetmessageid: <>
2020-10-10 08:25:19Dennis Sweeneylinkissue41972 messages
2020-10-10 08:25:19Dennis Sweeneycreate