Author tim.peters
Recipients eric.smith, jdemeyer, mark.dickinson, rhettinger, sir-sigurd, tim.peters
Date 2018-10-06.18:18:07
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1538849887.43.0.545547206417.issue34751@psf.upfronthosting.co.za>
In-reply-to
Content
As a sanity check, I tried the 32-bit version of MurmurHash2 (version 3 doesn't have a 32-bit version).  All of my Python tests had collisions, and while the new tuple test dropped to 15, product([0.5, 0.25], repeat=20) skyrocketed from 141 (32-bit xxHash) to 3848.

I could live with that too, but xxHash does better overall on our tiny tests, and requires less code and fewer cycles.

Here's what I used:

static Py_hash_t
tuplehash(PyTupleObject *v)
{
    const Py_uhash_t m = 0x5bd1e995;
    const int r = 24;
    Py_uhash_t x, t;  /* Unsigned for defined overflow behavior. */
    Py_hash_t y;
    Py_ssize_t len = Py_SIZE(v);
    PyObject **p;
    x = 0x345678UL ^ (Py_uhash_t)len;
    p = v->ob_item;
    while (--len >= 0) {
        y = PyObject_Hash(*p++);
        if (y == -1)
            return -1;
        Py_uhash_t t = (Py_uhash_t)y;
        t *= m;
        t ^= t >> r;
        t *= m;
        x *= m;
        x ^= t;
    }
    x ^= x >> 13;
    x *= m;
    x ^= x >> 15;
    if (x == (Py_uhash_t)-1)
        x = -2;
    return x;
}
History
Date User Action Args
2018-10-06 18:18:07tim.peterssetrecipients: + tim.peters, rhettinger, mark.dickinson, eric.smith, jdemeyer, sir-sigurd
2018-10-06 18:18:07tim.peterssetmessageid: <1538849887.43.0.545547206417.issue34751@psf.upfronthosting.co.za>
2018-10-06 18:18:07tim.peterslinkissue34751 messages
2018-10-06 18:18:07tim.peterscreate