Message151484
To be more explicit about Martin A. Lemburg's msg151121 (which I agree with):
Count the collisions on a single lookup.
If they exceed a threshhold, do something different.
Martin's strawman proposal was threshhold=1000, and raise. It would be just as easy to say "whoa! 5 collisions -- time to use the alternative hash instead" (and, possibly, to issue a warning).
Even that slight tuning removes the biggest objection, because it won't ever actually fail.
Note that the use of a (presumably stronger 2nd) hash wouldn't come into play until (and unless) there was a problem for that specific key in that specific dictionary. For the normal case, nothing changes -- unless we take advantage of the existence of a 2nd hash to simplify the first few rounds of collision resolution. (Linear probing is more cache-friendly, but also more vulnerable to worst-case behavior -- but if probing stops at 4 or 8, that may not matter much.) For quick scripts, the 2nd hash will almost certainly never be needed, so startup won't pay the penalty.
The only down side I see is that the 2nd (presumably randomized) hash won't be cached without another slot, which takes more memory and shouldn't be done in a bugfix release. |
|
Date |
User |
Action |
Args |
2012-01-17 19:59:52 | Jim.Jewett | set | recipients:
+ Jim.Jewett, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, jcea, mark.dickinson, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, PaulMcMillan, fx5 |
2012-01-17 19:59:51 | Jim.Jewett | set | messageid: <1326830391.84.0.596881406431.issue13703@psf.upfronthosting.co.za> |
2012-01-17 19:59:51 | Jim.Jewett | link | issue13703 messages |
2012-01-17 19:59:50 | Jim.Jewett | create | |
|