Author tim.peters
Recipients Dennis Sweeney, Zeturic, ammar2, corona10, gregory.p.smith, gvanrossum, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-18.02:09:48
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1602986989.1.0.366063804071.issue41972@roundup.psfhosted.org>
In-reply-to
Content
Yup, they act essentially the same, but yours jumps into the quicksand earlier ;-)

I'm fond of this one:

"""
HUGE = 10**7
BIG = 10**6

bigxs = 'x' * BIG

haystack = 'x' * HUGE
needle = bigxs + 'y' + bigxs
"""

"The problem" then is in middle of the needle, not at either end, so really simple tricks all fail. For example, set(haystack) is a subset of set(needle), so our "Bloom filter" is useless, and needle[-1] == needle[-2], so our "skip" count is the useless 0.  Pure brute force is quadratic-time, and we're even slower because we're paying over & over to try heuristic speedups that happen never to pay off.

Two-way splits the needle as

u = bigxs
v = 'y' + bigxs

and again never gets out of the "try to match the right part" phase. Each time around it fails to match 'y' on the first try, and shifts right by 1.  Boring, but linear time overall.
History
Date User Action Args
2020-10-18 02:09:49tim.peterssetrecipients: + tim.peters, gvanrossum, gregory.p.smith, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, corona10, Dennis Sweeney, Zeturic
2020-10-18 02:09:49tim.peterssetmessageid: <1602986989.1.0.366063804071.issue41972@roundup.psfhosted.org>
2020-10-18 02:09:49tim.peterslinkissue41972 messages
2020-10-18 02:09:48tim.peterscreate