This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author Jim.Jewett
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, eric.snow, fx5, georg.brandl, grahamd, gvanrossum, gz, jcea, lemburg, mark.dickinson, neologix, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
Date 2012-01-17.19:59:50
SpamBayes Score 2.4583113e-12
Marked as misclassified No
Message-id <1326830391.84.0.596881406431.issue13703@psf.upfronthosting.co.za>
In-reply-to
Content
To be more explicit about Martin A. Lemburg's msg151121 (which I agree with):

Count the collisions on a single lookup. 
If they exceed a threshhold, do something different.

Martin's strawman proposal was threshhold=1000, and raise.  It would be just as easy to say "whoa!  5 collisions -- time to use the alternative hash instead" (and, possibly, to issue a warning).  

Even that slight tuning removes the biggest objection, because it won't ever actually fail.

Note that the use of a (presumably stronger 2nd) hash wouldn't come into play until (and unless) there was a problem for that specific key in that specific dictionary.  For the normal case, nothing changes -- unless we take advantage of the existence of a 2nd hash to simplify the first few rounds of collision resolution.  (Linear probing is more cache-friendly, but also more vulnerable to worst-case behavior -- but if probing stops at 4 or 8, that may not matter much.)  For quick scripts, the 2nd hash will almost certainly never be needed, so startup won't pay the penalty.

The only down side I see is that the 2nd (presumably randomized) hash won't be cached without another slot, which takes more memory and shouldn't be done in a bugfix release.
History
Date User Action Args
2012-01-17 19:59:52Jim.Jewettsetrecipients: + Jim.Jewett, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, jcea, mark.dickinson, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, PaulMcMillan, fx5
2012-01-17 19:59:51Jim.Jewettsetmessageid: <1326830391.84.0.596881406431.issue13703@psf.upfronthosting.co.za>
2012-01-17 19:59:51Jim.Jewettlinkissue13703 messages
2012-01-17 19:59:50Jim.Jewettcreate