Author Jim.Jewett
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, eric.snow, fx5, georg.brandl, grahamd, gvanrossum, gz, jcea, lemburg, mark.dickinson, neologix, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
Date 2012-01-17.19:59:50
SpamBayes Score 2.45831e-12
Marked as misclassified No
Message-id <1326830391.84.0.596881406431.issue13703@psf.upfronthosting.co.za>
In-reply-to
Content
To be more explicit about Martin A. Lemburg's msg151121 (which I agree with):

Count the collisions on a single lookup. 
If they exceed a threshhold, do something different.

Martin's strawman proposal was threshhold=1000, and raise.  It would be just as easy to say "whoa!  5 collisions -- time to use the alternative hash instead" (and, possibly, to issue a warning).  

Even that slight tuning removes the biggest objection, because it won't ever actually fail.

Note that the use of a (presumably stronger 2nd) hash wouldn't come into play until (and unless) there was a problem for that specific key in that specific dictionary.  For the normal case, nothing changes -- unless we take advantage of the existence of a 2nd hash to simplify the first few rounds of collision resolution.  (Linear probing is more cache-friendly, but also more vulnerable to worst-case behavior -- but if probing stops at 4 or 8, that may not matter much.)  For quick scripts, the 2nd hash will almost certainly never be needed, so startup won't pay the penalty.

The only down side I see is that the 2nd (presumably randomized) hash won't be cached without another slot, which takes more memory and shouldn't be done in a bugfix release.
History
Date User Action Args
2012-01-17 19:59:52Jim.Jewettsetrecipients: + Jim.Jewett, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, jcea, mark.dickinson, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, PaulMcMillan, fx5
2012-01-17 19:59:51Jim.Jewettsetmessageid: <1326830391.84.0.596881406431.issue13703@psf.upfronthosting.co.za>
2012-01-17 19:59:51Jim.Jewettlinkissue13703 messages
2012-01-17 19:59:50Jim.Jewettcreate