This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author tim.peters
Recipients Dennis Sweeney, Zeturic, ammar2, corona10, gregory.p.smith, gvanrossum, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-19.03:43:38
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1603079018.76.0.0483347254274.issue41972@roundup.psfhosted.org>
In-reply-to
Content
Dennis, I think that's expected, right? Two-way on its own can exploit nothing about individual characters - it only preprocesses the needle to break the possibility for quadratic-time behavior due to periods in the needle.

It sounds like you switched the current code's Sunday-ish approach to a more Boyer-Moore-ish approach. That probably hurt, and especially for shorter needles (where Sunday can yield a maximum shift of len(needle)+1, not len(needle) - the "+1" is more significant the smaller len(needle)).

Sunday gave 3 variants of his basic algorithm, and in his tests they were all faster than Boyer-Moore, and more so the shorter the needle.  Just FYI, here's a scanned PDF of his paper:

https://csclub.uwaterloo.ca/~pbarfuss/p132-sunday.pdf
History
Date User Action Args
2020-10-19 03:43:38tim.peterssetrecipients: + tim.peters, gvanrossum, gregory.p.smith, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, corona10, Dennis Sweeney, Zeturic
2020-10-19 03:43:38tim.peterssetmessageid: <1603079018.76.0.0483347254274.issue41972@roundup.psfhosted.org>
2020-10-19 03:43:38tim.peterslinkissue41972 messages
2020-10-19 03:43:38tim.peterscreate