Message 379494 - Python tracker

Author	Dennis Sweeney
Recipients	Dennis Sweeney, Zeturic, ammar2, corona10, gregory.p.smith, gvanrossum, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date	2020-10-23.23:14:12
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1603494853.14.0.842614805417.issue41972@roundup.psfhosted.org>
In-reply-to

Content
Here are those zipf-distributed benchmarks for PR 22904: https://pastebin.com/raw/qBaMi2dm
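For readers unfamiliar with the term, here is a rough sketch of what a zipf-distributed search input could look like. It is not the benchmark script behind the link above; the helper name `zipf_string`, the lowercase alphabet, and the 1/rank weighting are all assumptions made for illustration.

```python
import random

def zipf_string(length, alphabet="abcdefghijklmnopqrstuvwxyz", seed=None):
    """Return a random string whose characters follow a Zipf-like law.

    Hypothetical helper: the character at rank r (1-based position in
    `alphabet`) is drawn with probability proportional to 1/r.
    """
    rng = random.Random(seed)
    weights = [1.0 / rank for rank in range(1, len(alphabet) + 1)]
    return "".join(rng.choices(alphabet, weights=weights, k=length))

# Build a haystack and a needle from the same distribution and search.
haystack = zipf_string(10_000, seed=1)
needle = zipf_string(10, seed=2)
print(haystack.find(needle))  # -1 if the needle does not occur
```

Skewed inputs like this have a few very common characters and many rare ones, which is closer to natural text than uniformly random strings are.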

Ignoring differences of <5%, there are 33 cases that get slower, but 477 cases that get faster.
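One way such a tally could be computed is sketched below. The `classify` helper, the definition of "difference" as relative change against the baseline, and the three sample timing pairs are all hypothetical placeholders, not the actual benchmark data.

```python
def classify(before, after, threshold=0.05):
    """Label a timing pair, ignoring relative changes within `threshold`.

    Assumption: "difference" means (after - before) / before.
    """
    change = (after - before) / before
    if change > threshold:
        return "slower"
    if change < -threshold:
        return "faster"
    return "same"

# Made-up (baseline, patched) timings in seconds, for illustration only.
timings = {
    "case_a": (1.00, 1.10),
    "case_b": (1.00, 0.80),
    "case_c": (1.00, 1.02),
}

counts = {"slower": 0, "faster": 0, "same": 0}
for before, after in timings.values():
    counts[classify(before, after)] += 1
print(counts)
```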

Here's a stringbench.py run for PR 22904: https://pastebin.com/raw/ABm32bA0

It looks like the stringbench times get a bit worse on a few cases, but I would attribute that to the benchmarks having many "difficult" cases with a unique character at the end of the needle, such as:

    s="ABC"*33; ((s+"D")*500+s+"E").find(s+"E"),

which the status quo already handles as well as possible, whereas the PR best handles the case where some middle "cut" character is unique. Who knows how common these cases are.
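To try the "difficult" case locally, something like the following should work on any Python 3 build. The `build_case` helper and the repetition count passed to `timeit` are my own choices for illustration; absolute timings will depend on the machine and on which search implementation the interpreter uses.

```python
import timeit

def build_case(unit="ABC", reps=33, blocks=500):
    """Rebuild the stringbench-style case quoted above (hypothetical helper)."""
    s = unit * reps
    haystack = (s + "D") * blocks + s + "E"
    needle = s + "E"
    return haystack, needle

haystack, needle = build_case()

# The only "E" in the haystack is its last character, so the needle can
# match only at the very end, after 500 near-misses that each end in "D".
index = haystack.find(needle)

elapsed = timeit.timeit(lambda: haystack.find(needle), number=100)
print(f"match at index {index}; 100 searches took {elapsed:.4f}s")
```

Running this on builds with and without the PR gives a quick, informal comparison for this one input, though not a substitute for the full benchmark runs linked above.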

History
Date	User	Action	Args
2020-10-23 23:14:13	Dennis Sweeney	set	recipients: + Dennis Sweeney, gvanrossum, tim.peters, gregory.p.smith, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, corona10, Zeturic
2020-10-23 23:14:13	Dennis Sweeney	set	messageid: <1603494853.14.0.842614805417.issue41972@roundup.psfhosted.org>
2020-10-23 23:14:13	Dennis Sweeney	link	issue41972 messages
2020-10-23 23:14:12	Dennis Sweeney	create