Author tim.peters
Recipients Dmitry Rubanovich, methane, rhettinger, serhiy.storchaka, tim.peters, xiang.zhang
Date 2017-06-16.03:21:02
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1497583263.11.0.661237672035.issue30671@psf.upfronthosting.co.za>
In-reply-to
Content
Whatever the iteration, accept that it's necessary it reach every value in range(2**i) eventually.  The current scheme does that, for reasons already documented.  You seem to be overlooking the importance of this part:

"""
Note that because perturb is unsigned, if the recurrence is executed often enough perturb eventually becomes and remains 0.  At that point (very rarely reached) the recurrence is on (just) 5*j+1 again, and that's certain to find an empty slot eventually (since it generates every int in range(2**i), and we make sure there's always at least one empty slot).
"""

5 is the smallest multiplier (besides the degenerate 1) for which that's true for every power-of-2 modulus.  It doesn't matter how rare collisions are if you switch to a scheme for which it's _not_ always true:  then you risk an infinite loop failing to find an empty slot.

Also note that testing was not just done on "random" cases, but "on both normal and pathological cases".  For example, integer keys all of which have lots of trailing zero bits (there's a specific example of that too in the comments).  The smaller PERTURB_SHIFT, the longer it takes to get the high-order bits into play - and the longer it takes to fall back to a pure "5*j+1" iteration, which is the part that's necessary for _correctness_.
History
Date User Action Args
2017-06-16 03:21:03tim.peterssetrecipients: + tim.peters, rhettinger, methane, serhiy.storchaka, xiang.zhang, Dmitry Rubanovich
2017-06-16 03:21:03tim.peterssetmessageid: <1497583263.11.0.661237672035.issue30671@psf.upfronthosting.co.za>
2017-06-16 03:21:03tim.peterslinkissue30671 messages
2017-06-16 03:21:02tim.peterscreate