Message 378582 - Python tracker


Author	tim.peters
Recipients	Dennis Sweeney, Zeturic, ammar2, corona10, josh.r, pmpp, serhiy.storchaka, tim.peters, vstinner
Date	2020-10-13.22:03:35
Message-id	<1602626615.95.0.701668883293.issue41972@roundup.psfhosted.org>

Content
Dennis, would it be possible to isolate some of the cases with more extreme results and run them repeatedly under the same timing framework, as a test of how trustworthy the _framework_ is? From decades of bitter experience, most benchmarking efforts end up chasing ghosts ;-)

For example, this result:

length=3442, value=ASXABCDHAB...  | 289 us  | 2.36 ms: 8.19x slower (+719%) 

Is that real, or an illusion?
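One rough way to probe that question is to rerun a single case several times and look at the spread between runs. The sketch below is only an illustration: it uses the standard `timeit` module rather than whatever framework produced the table, and the haystack and needle are hypothetical random stand-ins, since the actual benchmark strings are not reproduced in this message. `repeated_timings` is a made-up helper name.

```python
import random
import statistics
import timeit


def repeated_timings(haystack, needle, repeats=7, number=50):
    """Time haystack.find(needle) `repeats` times; return seconds per call for each run."""
    timer = timeit.Timer(lambda: haystack.find(needle))
    return [total / number for total in timer.repeat(repeat=repeats, number=number)]


# Hypothetical stand-in data (assumption): uniform random uppercase letters.
rng = random.Random(0)
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
haystack = "".join(rng.choices(alphabet, k=100_000))
needle = "".join(rng.choices(alphabet, k=3442))

runs = repeated_timings(haystack, needle)
spread = max(runs) / min(runs)  # a ratio far above 1 suggests the timings are noisy
print(f"min={min(runs) * 1e6:.1f} us  "
      f"median={statistics.median(runs) * 1e6:.1f} us  "
      f"max/min={spread:.2f}")
```

If the max/min ratio for one build is itself large, a reported slowdown of similar size for that case deserves suspicion.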

Since the alphabet has only 26 letters, it's all but certain that a needle that long has more than one instance of every letter. So the status quo's "Bloom filter" will have every relevant bit set, rendering its _most_ effective speedup trick useless. That makes it hard (but not impossible) to imagine how it ends up being so much faster than a method with more powerful analysis to exploit.
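To make the saturation argument concrete, here is a simplified model of that kind of character mask, not the actual C code: each character sets one bit chosen by the low bits of its code point. The 64-bit width and the uniformly random needle are assumptions made for illustration, and `bloom_mask` is a hypothetical helper name.

```python
import random
import string

MASK_BITS = 64  # assumed mask width; the real width depends on the platform


def bloom_mask(s):
    """OR together one bit per character, selected by the low bits of its code point."""
    mask = 0
    for ch in s:
        mask |= 1 << (ord(ch) % MASK_BITS)
    return mask


alphabet = string.ascii_uppercase
full_mask = bloom_mask(alphabet)  # the bits for all 26 letters

# Hypothetical needle (assumption): 3442 uniformly random uppercase letters.
rng = random.Random(0)
needle = "".join(rng.choices(alphabet, k=3442))
needle_mask = bloom_mask(needle)

# True when every letter of the alphabet appears in the needle.
saturated = needle_mask == full_mask
print(saturated, bin(needle_mask).count("1"))
```

Once the mask is saturated, the question "could this haystack character occur in the needle?" is answered yes for every letter, so the filter never justifies skipping ahead.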
