Author jdemeyer
Recipients jdemeyer, ned.deily, rhettinger, tim.peters
Date 2018-09-21.06:10:27
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1537510227.26.0.956365154283.issue34751@psf.upfronthosting.co.za>
In-reply-to
Content
To come back to the topic of hashing, with "Bernstein hash" for tuples I did indeed mean the simple (in Python pseudocode):

# t is a tuple
h = INITIAL_VALUE
for x in t:
    z = hash(x)
    h = (h * MULTIPLIER) + z
return h

I actually started working on the code and there is indeed an issue with the hashes of (X, (Y, Z)) and (Y, (X, Z)) being equal that way. But that can be solved by replacing "hash(x)" in the loop by something slightly more complicated. After thinking about it, I decided to go for shifting higher to lower bits, so replacing z by z^(z >> 32) or something like that.

This also has the benefit of mixing the high-order bits into the low-order bits, which is an additional improvement over the current code.

Since I already wrote the code anyway, I'll create a PR for it.
History
Date User Action Args
2018-09-21 06:10:27jdemeyersetrecipients: + jdemeyer, tim.peters, rhettinger, ned.deily
2018-09-21 06:10:27jdemeyersetmessageid: <1537510227.26.0.956365154283.issue34751@psf.upfronthosting.co.za>
2018-09-21 06:10:27jdemeyerlinkissue34751 messages
2018-09-21 06:10:27jdemeyercreate