Author tim.peters
Recipients Dmitry Rubanovich, methane, rhettinger, serhiy.storchaka, tim.peters, xiang.zhang
Date 2017-06-16.23:47:08
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1497656828.37.0.331875222459.issue30671@psf.upfronthosting.co.za>
In-reply-to
Content
Yes, any scheme whatsoever that guarantees to visit every int in range(2**i) meets the "correctness" part.

But I concur with Inada:  "head arguments" aren't compelling here, not even if I understood what they were saying ;-)  If you implement it and demonstrate significant speedups in plausibly realistic settings, then it may be worth pursuing.  Anything short of that is at best informed speculation, and so better hashed out on, e.g., the python-ideas mailing list than on the bug tracker.

BTW, I realize you're not running tests against random data.  But you're not running tests against _any_ real data, which is worrisome.  Your code appears to be utterly uniform, thus approximating a uniform random distribution.  I have scant idea of what you believe your code shows.  For example, it only looks at "hash codes" in `range(1<<p)`, which largely misses the point of the "perturb" logic.  In real life, on most boxes these days hash codes are 64 bits (regardless of dict size), and the real perturb code eventually folds in every one of those 64 bits (if the loop goes around often enough to need all of them).  Pretending hash codes contain only `p` bits may or may not be theoretically interesting for some purpose(s), but has nothing obvious to do with how the dict code actually works.
History
Date User Action Args
2017-06-16 23:47:08tim.peterssetrecipients: + tim.peters, rhettinger, methane, serhiy.storchaka, xiang.zhang, Dmitry Rubanovich
2017-06-16 23:47:08tim.peterssetmessageid: <1497656828.37.0.331875222459.issue30671@psf.upfronthosting.co.za>
2017-06-16 23:47:08tim.peterslinkissue30671 messages
2017-06-16 23:47:08tim.peterscreate