
Author tim.peters
Recipients Dennis Sweeney, Zeturic, ammar2, corona10, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date 2020-10-13.22:03:35
Message-id <1602626615.95.0.701668883293.issue41972@roundup.psfhosted.org>
Content
Dennis, would it be possible to isolate some of the cases with more extreme results and run them repeatedly under the same timing framework, as a test of how trustworthy the _framework_ is? In decades of bitter experience, I've found that most benchmarking efforts end up chasing ghosts ;-)

For example, this result:

length=3442, value=ASXABCDHAB...  | 289 us  | 2.36 ms: 8.19x slower (+719%) 

Is that real, or an illusion?
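
One rough way to probe that question is to time a single case several times in a row and look at how far the runs disagree with each other. The sketch below is only an illustration using the standard `timeit` module, not the timing framework used for the table in this issue. The helper names `repeat_timings` and `relative_spread` are hypothetical, and the needle and haystack are random stand-ins, since the real inputs are not shown in this message.

```python
import random
import statistics
import timeit


def repeat_timings(func, repeats=5, number=20):
    """Return the per-call time in seconds for each of `repeats` independent runs."""
    timer = timeit.Timer(func)
    return [total / number for total in timer.repeat(repeat=repeats, number=number)]


def relative_spread(timings):
    """Ratio of slowest to fastest run; values well above 1.0 suggest noise."""
    return max(timings) / min(timings)


# Hypothetical stand-in inputs (assumption): random text over a 26-letter
# alphabet, with a needle of the same length as the case quoted above.
rng = random.Random(41972)
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
haystack = "".join(rng.choices(alphabet, k=100_000))
needle = "".join(rng.choices(alphabet, k=3442))

timings = repeat_timings(lambda: haystack.find(needle))
print(
    f"min={min(timings) * 1e6:.1f} us  "
    f"median={statistics.median(timings) * 1e6:.1f} us  "
    f"max/min={relative_spread(timings):.2f}x"
)
```

If the max/min ratio for one build on one input is itself large, a reported slowdown of similar size between two builds deserves suspicion.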

Since the alphabet has only 26 letters, it's all but certain that a needle that long has more than one instance of every letter. So the status quo's "Bloom filter" will have every relevant bit set, rendering its _most_ effective speedup trick useless. That makes it hard (but not impossible) to imagine how it ends up being so much faster than a method with more powerful analysis to exploit.
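
The saturation argument can be checked with a small model. This is a minimal sketch that assumes the filter is a 64-bit mask with one bit per character, selected by the low 6 bits of the character code. That layout is an assumption made for illustration, not something stated in this message, and the names `BLOOM_WIDTH`, `bloom_mask`, and `bloom_might_contain` are hypothetical. The needle is a random stand-in of the same length as the quoted case.

```python
import random

BLOOM_WIDTH = 64  # assumed mask width in bits


def bloom_mask(needle):
    """Set one bit per distinct (code & 63) value occurring in the needle."""
    mask = 0
    for ch in needle:
        mask |= 1 << (ord(ch) & (BLOOM_WIDTH - 1))
    return mask


def bloom_might_contain(mask, ch):
    """False means ch is definitely absent from the needle; True means 'maybe'."""
    return bool(mask & (1 << (ord(ch) & (BLOOM_WIDTH - 1))))


alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
rng = random.Random(0)
long_needle = "".join(rng.choices(alphabet, k=3442))  # stand-in, not the real needle

full_mask = bloom_mask(alphabet)         # every bit a 26-letter haystack can hit
short_mask = bloom_mask("ABC")           # a short needle leaves most bits clear
long_mask = bloom_mask(long_needle)

print("short needle rules out 'Z':", not bloom_might_contain(short_mask, "Z"))
print("long needle mask saturated:", long_mask == full_mask)
```

Once the mask is saturated, every haystack character answers "maybe", so a filter of this kind can never justify skipping ahead.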
History
Date | User | Action | Args
2020-10-13 22:03:35 | tim.peters | set | recipients: + tim.peters, vstinner, pmpp, serhiy.storchaka, josh.r, ammar2, corona10, Dennis Sweeney, Zeturic
2020-10-13 22:03:35 | tim.peters | set | messageid: <1602626615.95.0.701668883293.issue41972@roundup.psfhosted.org>
2020-10-13 22:03:35 | tim.peters | link | issue41972 messages
2020-10-13 22:03:35 | tim.peters | create |