Author lemburg
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.snow, fx5, georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, haypo, jcea, lemburg, mark.dickinson, merwok, neologix, pitrou, skrah, terry.reedy, tim.peters, v+python, zbysz
Date 2012-01-23.21:39:31
SpamBayes Score 7.2568e-06
Marked as misclassified No
Message-id <4F1DD38E.7000605@egenix.com>
In-reply-to <4F1D8E27.3000707@egenix.com>
Content
> To see the collision counting, enable the DEBUG_DICT_COLLISIONS
> macro variable.

Running (part of (*)) the test suite with debugging enabled on a 64-bit
machine shows that slot collisions are much more frequent than
hash collisions, which only account for less than 0.01% of all
collisions.

It also shows that slot collisions in the low 1-10 range are
most frequent, with very few instances of a dict lookup
reaching 20 slot collisions (less than 0.0002% of all
collisions).

The great number of cases with 1 or 2 slot collisions surprised
me. It seems that there's potential for improvement of
the perturbation formula left.

Due to the large number of 1 or 2 slot collisions, the patch
is going to cause a minor hit to dict lookup performance.
It may make sense to unroll the slot search loop and only
start counting after the third round of misses.

(*) I stopped the run after several hours run-time, producing
some 148GB log data.
History
Date User Action Args
2012-01-23 21:39:32lemburgsetrecipients: + lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, pitrou, haypo, christian.heimes, benjamin.peterson, merwok, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, PaulMcMillan, fx5
2012-01-23 21:39:31lemburglinkissue13703 messages
2012-01-23 21:39:31lemburgcreate