Author jdemeyer
Recipients eric.smith, jdemeyer, mark.dickinson, rhettinger, sir-sigurd, tim.peters
Date 2018-10-04.06:06:09
Message-id <1538633169.89.0.545547206417.issue34751@psf.upfronthosting.co.za>
In-reply-to
Content
> I've posted several SeaHash cores that suffer no collisions at all in any of our tests (including across every "bad example" in these 100+ messages), except for "the new" tuple test.  Which it also passed, most recently with 7 collisions.  That was under 64-bit builds, though, and from what follows I figure you're only looking at 32-bit builds for now.

Note that I'm always testing parametrized versions of the hash functions. I replace the fixed multiplier (all algorithms mentioned here have one) with a random multiplier that is 3 mod 8, and I keep the other constants. This lets me look at the *probability* of passing the test suite, which says a lot more than a simple yes/no answer for a single test. You don't want to pass the test suite by random chance; you want to pass it because you have a high probability of passing it.
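As a rough illustration of this methodology (not the actual script or hash functions under discussion), the sketch below estimates a pass probability by drawing random multipliers that are 3 mod 8. The names `toy_tuple_hash`, `passes` and `pass_probability`, the initial constant, the toy input set and the zero-collision threshold are all assumptions made up for the example.

```python
import itertools
import random

MASK32 = 0xFFFFFFFF


def random_multiplier(rng):
    """Return a random 32-bit multiplier congruent to 3 mod 8."""
    return (rng.getrandbits(32) & ~7) | 3


def toy_tuple_hash(items, mult):
    """Hypothetical 32-bit xor-multiply hash; a stand-in, not CPython's tuple hash."""
    acc = 0x345678  # arbitrary fixed starting value (assumption)
    for x in items:
        acc = ((acc ^ (x & MASK32)) * mult) & MASK32
    return acc


def passes(mult, max_collisions=0):
    """Toy stand-in for one test: hash all 3-tuples over range(-8, 8)."""
    inputs = list(itertools.product(range(-8, 8), repeat=3))
    hashes = {toy_tuple_hash(t, mult) for t in inputs}
    collisions = len(inputs) - len(hashes)
    return collisions <= max_collisions


def pass_probability(trials=100, seed=0):
    """Fraction of random multipliers for which the toy test passes."""
    rng = random.Random(seed)
    passed = sum(passes(random_multiplier(rng)) for _ in range(trials))
    return passed / trials


if __name__ == "__main__":
    print("estimated pass probability:", pass_probability())
```

The point of the design is that the result is a fraction over many multipliers rather than a single pass/fail, so two candidate algorithms can be compared even when both happen to pass with their default constants.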

And yes, I'm testing 32-bit hashes, since we need to support those anyway and the probability of collisions is high enough to yield interesting statistical data.
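To see why 32-bit hashes give more statistical signal, here is a small back-of-the-envelope sketch using the standard birthday estimate for an idealized, uniformly random hash. The input count `n` is a hypothetical figure chosen for illustration, not the size of the actual test suite.

```python
def expected_colliding_pairs(n, bits):
    """Expected number of colliding pairs among n uniformly random hashes of `bits` bits."""
    return n * (n - 1) / 2 / 2**bits


if __name__ == "__main__":
    n = 500_000  # hypothetical number of hashed inputs
    for bits in (32, 64):
        print(bits, "bits:", expected_colliding_pairs(n, bits))
```

With an input set of that size, an ideal 32-bit hash is expected to produce a handful of collisions, so deviations are measurable, while an ideal 64-bit hash is expected to produce essentially none.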

For SeaHash, the probability of passing my new tuple test was only around 55%. For xxHash, this was about 85%. Adding some input mangling improved both scores, but the xxHash variant was still better than SeaHash.