
Author jdemeyer
Recipients eric.smith, jdemeyer, mark.dickinson, rhettinger, sir-sigurd, tim.peters
Date 2018-10-02.14:42:41
Message-id <1538491361.95.0.545547206417.issue34751@psf.upfronthosting.co.za>
In-reply-to
Content
> 100% pure SeaHash does x ^= t at the start first, instead of `t ^ (t << 1)` on the RHS.

Indeed. Some initial testing shows that this kind of "input mangling" (applying such a permutation to the inputs) actually plays a much more important role in avoiding collisions than the SeaHash operation x ^= ((x >> 16) >> (x >> 29)).

So my suggestion remains

for y in INPUT:
    t = hash(y)
    t ^= t * SOME_LARGE_EVEN_NUMBER
    h ^= t
    h *= MULTIPLIER
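
The loop above can be sketched in runnable Python as follows. This is only an illustration under assumptions not stated in the message: 64-bit unsigned wraparound is emulated with a mask, and the values of MULTIPLIER, SOME_LARGE_EVEN_NUMBER and the starting value INITIAL are placeholders picked here, not the constants under discussion. The optional bits parameter is also hypothetical and exists only so that the permutation property can be checked exhaustively on small words.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound

# Placeholder constants, chosen only for illustration (assumptions).
MULTIPLIER = 0x100000001B3                   # must be odd
SOME_LARGE_EVEN_NUMBER = 0x9E3779B97F4A7C14  # must be even
INITIAL = 0x345678                           # arbitrary starting value


def mangle(t, bits=64):
    """Input mangling: t ^= t * SOME_LARGE_EVEN_NUMBER on a bits-wide word."""
    mask = (1 << bits) - 1
    t &= mask  # also maps negative Python hashes to unsigned words
    return t ^ ((t * SOME_LARGE_EVEN_NUMBER) & mask)


def tuple_hash_sketch(items):
    """Combine the hashes of items as in the suggested loop."""
    h = INITIAL
    for y in items:
        t = mangle(hash(y))
        h ^= t
        h = (h * MULTIPLIER) & MASK64
    return h


def is_permutation(bits):
    """Exhaustively check that mangle() is a bijection on bits-wide words."""
    size = 1 << bits
    return len({mangle(t, bits) for t in range(size)}) == size
```

Because SOME_LARGE_EVEN_NUMBER is even, bit i of the product depends only on bits of t below position i, so the mangling can be undone bit by bit starting from the low end. That is why it is a permutation, and is_permutation(12) confirms it on 12-bit words.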

Adding in the additional SeaHash operations

    x ^= ((x >> 16) >> (x >> 29))
    x *= MULTIPLIER

does not increase the probability of the tests passing.
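
For reference, the extra step quoted above can be written out as a small Python sketch. It assumes 32-bit words, which the shift amounts 16 and 29 seem to suggest but the message does not state, and MULTIPLIER32 is a placeholder odd constant chosen here. The sketch only shows what the operation computes; it does not reproduce the collision tests.

```python
MASK32 = 0xFFFFFFFF        # assumption: 32-bit unsigned words
MULTIPLIER32 = 0x9E3779B1  # placeholder odd constant (assumption)


def xorshift_part(x):
    """x ^= ((x >> 16) >> (x >> 29)): a shift whose amount depends on x."""
    x &= MASK32
    # The top 3 bits of x select an extra shift of 0..7 on the upper half.
    return x ^ ((x >> 16) >> (x >> 29))


def seahash_like_step(x):
    """The variable xorshift followed by a multiplication, modulo 2**32."""
    return (xorshift_part(x) * MULTIPLIER32) & MASK32
```

For example, xorshift_part(0x80000000) gives 0x80000800: the top three bits are 0b100, so the upper half 0x8000 is shifted right by 4 more bits before being XORed into the low half.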
History
Date User Action Args
2018-10-02 14:42:41 jdemeyer set recipients: + jdemeyer, tim.peters, rhettinger, mark.dickinson, eric.smith, sir-sigurd
2018-10-02 14:42:41 jdemeyer set messageid: <1538491361.95.0.545547206417.issue34751@psf.upfronthosting.co.za>
2018-10-02 14:42:41 jdemeyer link issue34751 messages
2018-10-02 14:42:41 jdemeyer create