Author Dennis Sweeney
Recipients Dennis Sweeney, Zeturic, ammar2, corona10, gregory.p.smith, gvanrossum, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-23.23:14:12
Message-id <1603494853.14.0.842614805417.issue41972@roundup.psfhosted.org>
Content
Here are those zipf-distributed benchmarks for PR 22904: https://pastebin.com/raw/qBaMi2dm

Ignoring differences of <5%, there are 33 cases that get slower and 477 cases that get faster.
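
As a rough sketch of how such a tally could be made, assuming each benchmark case reduces to a (before, after) pair of timings: the function names and the sample numbers below are hypothetical and are not taken from the linked results.

```python
from collections import Counter

def classify(before, after, threshold=0.05):
    """Label one case, ignoring relative changes smaller than `threshold`."""
    change = (after - before) / before
    if abs(change) < threshold:
        return "same"
    return "faster" if change < 0 else "slower"

def tally(cases, threshold=0.05):
    """Count faster/slower/same labels over (before, after) timing pairs."""
    return Counter(classify(b, a, threshold) for b, a in cases)

# Made-up timings in seconds, for illustration only.
sample = [(1.00, 0.80), (1.00, 1.02), (1.00, 1.30), (2.00, 1.00)]
print(tally(sample))
```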

Here's a stringbench.py run for PR 22904: https://pastebin.com/raw/ABm32bA0

It looks like the stringbench times get a bit worse on a few cases, but I would attribute that to the benchmarks having many "difficult" cases with a unique character at the end of the needle, such as:

    s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E"),

which the status quo already handles as well as possible, whereas the PR does best when some middle "cut" character is unique. I don't know how common either kind of case is in practice.
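
For anyone who wants to time that difficult case on their own build, a minimal sketch using the standard timeit module might look like this. The helper names and the repeat count are placeholders of mine and are not part of stringbench.py.

```python
import timeit

def make_hard_case():
    """Build the haystack and needle from the example above."""
    s = "ABC" * 33
    haystack = (s + "D") * 500 + s + "E"
    needle = s + "E"  # the only distinguishing character is the last one
    return haystack, needle

def time_find(number=20):
    """Return the total seconds taken by `number` calls to str.find."""
    haystack, needle = make_hard_case()
    return timeit.timeit(lambda: haystack.find(needle), number=number)

if __name__ == "__main__":
    print(f"{time_find():.6f} s for 20 calls")
```

Running this under both the base interpreter and a build of the PR gives a direct comparison for this one case. It says nothing about how often such inputs occur.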