Yes, I do understand it.
But the statement in lines 166, 167: "For any initial j in range(2**i), repeating that 2**i times generates each int in range(2**i) exactly once" does not hold when the perturb is added. 2**i times will not be enough to generate all elements of the ring when some of multipliers are zero-divisors.
Specifically, if you use j=((5*j) + P + 1) mod 2**i, you are effectively multiplying by a zero divisors every time P = j mod 2. And I don't mean "mod 2**i." I do mean "mod 2." Which means anytime P (which changes a few times and eventually becomes 0), has the same parity as j, you are multiplying by a zero-divisor. Because P is eventually 0, you will go through all the values **eventually**. But for all values of P for which there is a P' such that P=/=P' and ((5*j) + P + 1) = ((5*j) + P' + 1) mod 2**i, the number of times you'll need to apply the function will be higher than 2**i. And that's in the worst case scenario.
In the best case scenario, the collision probability can be gathered from looking at the values of my script printed by print_collision_counts(py3_6_1_lookdict_perturb). It can be as low as ~1/20 and as high as ~1/3 depending on the value of i.
The main speed up we are seeking is to avoid a collision early on. And we are less concerned about collisions later on. Anecdotally, if your dict has 256 buckets, then the chance of collision is 1 in ~3.5. Which is an improvement over 1 in 2, but still pretty high.
Ok, how about this: to avoid the edge cases, unroll the 1st secondary hash key to use j = 2*j + P + 1. So try to test for it before the loop. But leave 5*j + P + 1 in the loop as is.
Although, to be honest if PERTURB_SHIFT is changed to 1 and we have a dict with a key that causes 64 collisions, this may be a good time to resize even if we are still sparse. On the other hand, this might create an attack vector with some magic value which causes resizes of near empty dict's. So maybe not... Certainly not without further analysis.
BTW, and it's mostly stylistic, but except for the "*hashpos = i & mask;" line in "find_empty_slot()", applying of the mask can be moved to "dk_get_index()". Again, I am looking at the released 3.6.1 code, so if this is already done, then never mind.
As another note, changing PERTURB_SHIFT to 1 causes a near-universal reduction in collisions (for all strategies). So if nothing else, that's probably an improvement. |