Author christian.heimes
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, dmalcolm, haypo, koniiiik, lemburg, serhiy.storchaka
Date 2012-11-06.17:10:42
Message-id <1352221845.63.0.359157467628.issue14621@psf.upfronthosting.co.za>
Content
I modified crypto_auth() a bit:

Py_uhash_t crypto_auth(const unsigned char *in, unsigned long long inlen)
  ...
  u64 k0 = _Py_HashSecret.prefix;
  u64 k1 = _Py_HashSecret.suffix;
  ...
  return (Py_uhash_t)b;

and replaced the loop in _Py_HashBytes() with a call to crypto_auth(). For large strings, SipHash is as fast as our current algorithm on my 64-bit box. That was to be expected, since SipHash works on blocks of 8 bytes, while the default algorithm consumes one byte at a time and can't be optimized with SIMD instructions.
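To illustrate the 8-byte block processing mentioned above, here is a minimal, self-contained sketch of SipHash-2-4. It is not the patched crypto_auth() from this message: the names siphash24_sketch, load_le64 and siphash24_selfcheck are hypothetical, and the two key halves are passed as plain arguments instead of being read from _Py_HashSecret.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipRound over the four 64-bit state words. */
#define SIPROUND(v0, v1, v2, v3) do {                                 \
    v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32);     \
    v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2;                          \
    v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0;                          \
    v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32);     \
} while (0)

/* Read 8 bytes as a little-endian 64-bit word (hypothetical helper). */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* SipHash-2-4 sketch; k0 and k1 are the two halves of the 128-bit key. */
uint64_t siphash24_sketch(const unsigned char *in, size_t inlen,
                          uint64_t k0, uint64_t k1)
{
    uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
    uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
    uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
    uint64_t v3 = k1 ^ 0x7465646279746573ULL;
    uint64_t b = (uint64_t)inlen << 56;   /* length goes in the top byte */
    const unsigned char *end = in + (inlen & ~(size_t)7);

    /* Main loop: one full 8-byte block per iteration, 2 rounds each. */
    for (; in != end; in += 8) {
        uint64_t m = load_le64(in);
        v3 ^= m;
        SIPROUND(v0, v1, v2, v3);
        SIPROUND(v0, v1, v2, v3);
        v0 ^= m;
    }

    /* Final block: the 0..7 leftover bytes plus the length byte. */
    for (size_t i = 0; i < (inlen & 7); i++)
        b |= (uint64_t)in[i] << (8 * i);
    v3 ^= b;
    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);
    v0 ^= b;

    /* Finalization: 4 rounds. */
    v2 ^= 0xff;
    for (int i = 0; i < 4; i++)
        SIPROUND(v0, v1, v2, v3);
    return v0 ^ v1 ^ v2 ^ v3;
}

/* Check against the published reference vector:
   key bytes 00..0f, message bytes 00..0e. */
void siphash24_selfcheck(void)
{
    unsigned char msg[15];
    for (int i = 0; i < 15; i++)
        msg[i] = (unsigned char)i;
    assert(siphash24_sketch(msg, sizeof msg,
                            0x0706050403020100ULL,
                            0x0f0e0d0c0b0a0908ULL) == 0xa129ca6149be45e5ULL);
}
```

In the modification described above, the roles of k0 and k1 are played by _Py_HashSecret.prefix and _Py_HashSecret.suffix, and the 64-bit result is cast to Py_uhash_t.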

Current hashing algorithm:
$ ./python -m timeit -s "x = b'a' * int(1E7)" "hash(x)"
1000000 loops, best of 3: 0.39 usec per loop

SipHash:
$ ./python -m timeit -s "x = b'a' * int(1E7)" "hash(x)"
1000000 loops, best of 3: 0.381 usec per loop