Author christian.heimes
Recipients Arfrever, PaulMcMillan, Vlado.Boza, benjamin.peterson, christian.heimes, dmalcolm, koniiiik, vstinner
Date 2012-10-17.16:53:31
Message-id <1350492811.58.0.0601543791813.issue14621@psf.upfronthosting.co.za>
In-reply-to
Content
I've modified unicodeobject's unicode_hash() function. V8's algorithm is about 55% slower for an 800 MB ASCII string on my box.

Python's current hash algorithm for bytes and unicode:

    while (--len >= 0)
        x = (_PyHASH_MULTIPLIER * x) ^ (Py_uhash_t) *P++;

$ ./python -m timeit -s "t = 'abcdefgh' * int(1E8)" "hash(t)"
10 loops, best of 3: 94.1 msec per loop
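
For readers who want to experiment with the loop above outside of CPython, here is a minimal standalone sketch. The function name sketch_mult_hash, the seed parameter, and the use of uint64_t in place of Py_uhash_t are assumptions made for illustration, and SKETCH_MULTIPLIER is a stand-in for _PyHASH_MULTIPLIER (assumed here to be 1000003). Any setup or finalization that unicode_hash() performs around the loop is not shown in this message and is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for _PyHASH_MULTIPLIER; the value 1000003 is an assumption. */
#define SKETCH_MULTIPLIER UINT64_C(1000003)

/* Hypothetical helper: runs only the multiply-and-XOR loop quoted above.
 * `x` is the starting state; `len` is the number of bytes at `p`. */
uint64_t sketch_mult_hash(const unsigned char *p, ptrdiff_t len, uint64_t x)
{
    assert(p != NULL || len == 0);
    while (--len >= 0)
        x = (SKETCH_MULTIPLIER * x) ^ (uint64_t) *p++;
    return x;
}
```

Each byte costs one multiplication and one XOR, and an empty input returns the starting state unchanged.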


V8's algorithm:

    while (--len >= 0) {
        x += (Py_uhash_t) *P++;
        x += ((x + (Py_uhash_t)len) << 10);
        x ^= (x >> 6);
    }

$ ./python -m timeit -s "t = 'abcdefgh' * int(1E8)" "hash(t)"
10 loops, best of 3: 164 msec per loop
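
The V8-style loop above can be isolated the same way. This is only a sketch under stated assumptions: the function name sketch_v8_style_hash and the seed parameter are hypothetical, and uint64_t stands in for Py_uhash_t. It reproduces the loop body exactly as quoted, so `len` inside the body is the number of bytes still remaining after the current one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: runs only the add/shift/XOR loop quoted above.
 * `x` is the starting state; `len` is the number of bytes at `p`. */
uint64_t sketch_v8_style_hash(const unsigned char *p, ptrdiff_t len, uint64_t x)
{
    assert(p != NULL || len == 0);
    while (--len >= 0) {
        x += (uint64_t) *p++;               /* mix in the next byte */
        x += ((x + (uint64_t) len) << 10);  /* len = bytes left after this one */
        x ^= (x >> 6);
    }
    return x;
}
```

Each byte here goes through three dependent update steps instead of one multiply and one XOR, which is one plausible reason for the longer running time reported above.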
History
Date                 User              Action  Args
2012-10-17 16:53:31  christian.heimes  set     recipients: + christian.heimes, vstinner, benjamin.peterson, Arfrever, dmalcolm, PaulMcMillan, Vlado.Boza, koniiiik
2012-10-17 16:53:31  christian.heimes  set     messageid: <1350492811.58.0.0601543791813.issue14621@psf.upfronthosting.co.za>
2012-10-17 16:53:31  christian.heimes  link    issue14621 messages
2012-10-17 16:53:31  christian.heimes  create