Message 296156 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	tim.peters
Recipients	Dmitry Rubanovich, methane, rhettinger, serhiy.storchaka, tim.peters, xiang.zhang
Date	2017-06-16.03:21:02
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1497583263.11.0.661237672035.issue30671@psf.upfronthosting.co.za>
In-reply-to

Content
Whatever the iteration, accept that it's necessary it reach every value in range(2*i) eventually. The current scheme does that, for reasons already documented. You seem to be overlooking the importance of this part: """ Note that because perturb is unsigned, if the recurrence is executed often enough perturb eventually becomes and remains 0. At that point (very rarely reached) the recurrence is on (just) 5j+1 again, and that's certain to find an empty slot eventually (since it generates every int in range(2*i), and we make sure there's always at least one empty slot). """ 5 is the smallest multiplier (besides the degenerate 1) for which that's true for every power-of-2 modulus. It doesn't matter how rare collisions are if you switch to a scheme for which it's _not_ always true: then you risk an infinite loop failing to find an empty slot. Also note that testing was not just done on "random" cases, but "on both normal and pathological cases". For example, integer keys all of which have lots of trailing zero bits (there's a specific example of that too in the comments). The smaller PERTURB_SHIFT, the longer it takes to get the high-order bits into play - and the longer it takes to fall back to a pure "5j+1" iteration, which is the part that's necessary for _correctness_.

Whatever the iteration, accept that it's necessary it reach every value in range(2**i) eventually.  The current scheme does that, for reasons already documented.  You seem to be overlooking the importance of this part:

"""
Note that because perturb is unsigned, if the recurrence is executed often enough perturb eventually becomes and remains 0.  At that point (very rarely reached) the recurrence is on (just) 5*j+1 again, and that's certain to find an empty slot eventually (since it generates every int in range(2**i), and we make sure there's always at least one empty slot).
"""

5 is the smallest multiplier (besides the degenerate 1) for which that's true for every power-of-2 modulus.  It doesn't matter how rare collisions are if you switch to a scheme for which it's _not_ always true:  then you risk an infinite loop failing to find an empty slot.

Also note that testing was not just done on "random" cases, but "on both normal and pathological cases".  For example, integer keys all of which have lots of trailing zero bits (there's a specific example of that too in the comments).  The smaller PERTURB_SHIFT, the longer it takes to get the high-order bits into play - and the longer it takes to fall back to a pure "5*j+1" iteration, which is the part that's necessary for _correctness_.

History
Date	User	Action	Args
2017-06-16 03:21:03	tim.peters	set	recipients: + tim.peters, rhettinger, methane, serhiy.storchaka, xiang.zhang, Dmitry Rubanovich
2017-06-16 03:21:03	tim.peters	set	messageid: <1497583263.11.0.661237672035.issue30671@psf.upfronthosting.co.za>
2017-06-16 03:21:03	tim.peters	link	issue30671 messages
2017-06-16 03:21:02	tim.peters	create