Author Dennis Sweeney
Recipients Dennis Sweeney, Zeturic, ammar2, corona10, gregory.p.smith, gvanrossum, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-19.01:24:36
Message-id <1603070676.89.0.930095679528.issue41972@roundup.psfhosted.org>
Content
Below is one of the substring checks that ran when I happened to import something; I think it is a good illustration of the Boyer-Moore bad-character shift table.

Note in particular that the shift table is the main source of speed in some common cases: in this example the two-way logic is only consulted once. Pathological strings can defeat the shift table, and that is where the two-way algorithm begins to shine.
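As a rough illustration of what such a table holds, here is a minimal Python sketch. It is not the C code from the patch: the dict-based layout and the names `bad_character_table` and `shift_for` are assumptions made for this example. Each character maps to the distance from its last occurrence in the needle to the needle's final position, and characters missing from the needle allow a shift by the full needle length.

```python
def bad_character_table(needle):
    """Map each character to the distance from its last occurrence
    in the needle to the needle's final index."""
    last = len(needle) - 1
    # Later occurrences overwrite earlier ones, so the rightmost one wins.
    return {ch: last - i for i, ch in enumerate(needle)}


NEEDLE = " 32 bit (ARM)"
TABLE = bad_character_table(NEEDLE)


def shift_for(ch):
    # A character that never occurs in the needle lets the window
    # move forward by the whole needle length.
    return TABLE.get(ch, len(NEEDLE))


for ch in "3 2(a":
    print(repr(ch), shift_for(ch))
```

For this needle the sketch gives 11 for '3', 5 for a space, 10 for '2', 4 for '(' and 13 for any character not in the needle, which are the shift sizes that appear in the trace.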

Checking " 32 bit (ARM)" in "3.10.0a1+ (heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]".
========================
Two-way with needle=" 32 bit (ARM)" and haystack="(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
Split " 32 bit (ARM)" into " 32 bit" and " (ARM)".
needle is NOT completely periodic.
Using period 8.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
> " 32 bit (ARM)"
Last character not found in string.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>              " 32 bit (ARM)"
Table says shift by 11.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                         " 32 bit (ARM)" # Made the '3's line up
Table says shift by 5.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                              " 32 bit (ARM)" # Here made the spaces line up
Last character not found in string.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                                           " 32 bit (ARM)" # The final ')'s happen to line up
Checking the right half.
No match.
Jump forward without checking left half.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                                            " 32 bit (ARM)" # Shifted by 1
Table says shift by 5.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                                                 " 32 bit (ARM)" # Made the spaces line up
Table says shift by 5.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                                                      " 32 bit (ARM)" # Made the spaces line up
Table says shift by 10.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                                                                " 32 bit (ARM)" # Made the '2's line up
Table says shift by 4.
> "(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) [MSC v.1927 32 bit (Intel)]"
>                                                                    " 32 bit (ARM)" # Made the lparens line up
Last character not found in string.
Reached end. Returning -1.
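The sequence of alignments in this trace can be reproduced with a simplified Horspool-style loop in Python. This is only a sketch under stated assumptions, not the patch itself: the function name `trace_offsets` is made up for this example, and where the real code runs the two-way check after the last character matches, the sketch substitutes a plain slice comparison followed by a shift of 1 (which happens to equal the jump taken in this particular trace).

```python
def trace_offsets(needle, haystack):
    """Return (match index or -1, list of window offsets that were tried)."""
    m = len(needle)
    # Distance from each character's last occurrence to the end of the needle.
    table = {ch: m - 1 - i for i, ch in enumerate(needle)}
    offsets = []
    pos = 0
    while pos + m <= len(haystack):
        offsets.append(pos)
        # Look at the haystack character under the needle's last position.
        shift = table.get(haystack[pos + m - 1], m)
        if shift == 0:
            # Last characters agree. Stand-in for the two-way check:
            # compare the whole window, then fall back to a shift of 1.
            if haystack[pos:pos + m] == needle:
                return pos, offsets
            shift = 1
        pos += shift
    return -1, offsets


NEEDLE = " 32 bit (ARM)"
HAYSTACK = ("(heads/two-way-dirty:cf4e398e94, Oct 18 2020, 20:09:21) "
            "[MSC v.1927 32 bit (Intel)]")

result, offsets = trace_offsets(NEEDLE, HAYSTACK)
print(result, offsets)
```

The full window comparison happens only once, at offset 42, where the closing parentheses coincide. Every other step is decided by a single table lookup, so the 83-character haystack is covered in ten alignments.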