This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author tim.peters
Recipients crusaderky, hongweipeng, serhiy.storchaka, steven.daprano, tim.peters
Date 2019-09-12.04:54:07
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1568264048.18.0.834573276733.issue38105@roundup.psfhosted.org>
In-reply-to
Content
Note that you can contrive similar cases with positive hash codes too.  For example, force both hash codes to 2**60 - 2.
 
The salient points are that 0 * 5 is congruent to 0 mod any power of 2, while -2 * 5 = -10 is congruent to -2 mod 8, so they're fixed points of the `i = 5*i + 1 + perturb` iteration _given that_ perturb is congruent to -1 (so cancels out the +1) until enough shifts have been done to get 0 bits into the lowermost bits retained by the mask (in which case perturb is no longer congruent to -1, and so no longer cancels out the +1).  Since PERTURB_SHIFT is 5, on a 64-bit box it can take 13 shifts before `perturb` is entirely cleared.  As internal comments note, it's only _then_ that the recurrence is guaranteed to visit every slot exactly once.

I don't care.  It would be nice if it could guarantee to visit every slot exactly once from the start - which it _would_ do if we didn't use `perturb` at all.  But using `perturb` is extremely valuable for avoiding quadratic-time behavior in large tables in bad real cases.  The drawback is that it can stick in a fixed point for 13 shifts on a 64-bit box (7 shifts on a 32-bit box).  But that's the worst it can do, and it's rare.

I don't believe it's worth a single cycle, overall, to avoid it.
History
Date User Action Args
2019-09-12 04:54:08tim.peterssetrecipients: + tim.peters, steven.daprano, serhiy.storchaka, hongweipeng, crusaderky
2019-09-12 04:54:08tim.peterssetmessageid: <1568264048.18.0.834573276733.issue38105@roundup.psfhosted.org>
2019-09-12 04:54:08tim.peterslinkissue38105 messages
2019-09-12 04:54:07tim.peterscreate