This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author rhettinger
Recipients christian.heimes, rhettinger, serhiy.storchaka, vstinner
Date 2013-08-17.21:11:12
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1376773873.26.0.896329041842.issue18771@psf.upfronthosting.co.za>
In-reply-to
Content
Serhiy, I've posted a patch so you can examine the effects in detail.

For something like set(range(n)), I would expect no change because the dataset is collision free due to Tim's design where hash(someint)==someint.  That said, I'm trying to optimize the general case and don't really if rare cases are modestly slowed.

The new code only kicks in when there is a collision.  The logic for the initial probe is unchanged.   The idea is to pickup some of the cache locality benefits of collision resolution by separate chaining, but not incurring the 100% malloc typically associated with chaining.

When adjacent entries are in the same cache-line, the cost of the extra adjacent probe is much cheaper (nearly zero) than a probe somewhere else in memory.
History
Date User Action Args
2013-08-17 21:11:13rhettingersetrecipients: + rhettinger, vstinner, christian.heimes, serhiy.storchaka
2013-08-17 21:11:13rhettingersetmessageid: <1376773873.26.0.896329041842.issue18771@psf.upfronthosting.co.za>
2013-08-17 21:11:13rhettingerlinkissue18771 messages
2013-08-17 21:11:12rhettingercreate