
Author christian.heimes
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, dmalcolm, koniiiik, lemburg, serhiy.storchaka, vstinner
Date 2012-11-06.17:10:42
Message-id <1352221845.63.0.359157467628.issue14621@psf.upfronthosting.co.za>
In-reply-to
Content
I modified crypto_auth() a bit:

Py_uhash_t crypto_auth(const unsigned char *in, unsigned long long inlen)
  ...
  u64 k0 = _Py_HashSecret.prefix;
  u64 k1 = _Py_HashSecret.suffix;
  ...
  return (Py_uhash_t)b;

and replaced the loop in _Py_HashBytes() with a call to crypto_auth(). For large strings, SipHash is as fast as our current algorithm on my 64-bit box. That was to be expected, since SipHash works on blocks of 8 bytes, while the default algorithm processes one byte at a time and can't be optimized with SIMD instructions.
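
For readers who want to see the block structure concretely, here is a minimal pure-Python sketch of SipHash-2-4 as published by its designers. It is not the patched crypto_auth() (the elided parts of that function are not shown in this message); the names siphash24, _sipround and _rotl64 are made up for illustration, and k0/k1 stand in for _Py_HashSecret.prefix and _Py_HashSecret.suffix under the assumption that both are 64-bit values.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF  # keep all arithmetic within 64 bits


def _rotl64(x, b):
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK64


def _sipround(v0, v1, v2, v3):
    """One SipRound: add, rotate and xor over the four state words."""
    v0 = (v0 + v1) & MASK64
    v1 = _rotl64(v1, 13) ^ v0
    v0 = _rotl64(v0, 32)
    v2 = (v2 + v3) & MASK64
    v3 = _rotl64(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK64
    v3 = _rotl64(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK64
    v1 = _rotl64(v1, 17) ^ v2
    v2 = _rotl64(v2, 32)
    return v0, v1, v2, v3


def siphash24(k0, k1, data):
    """Hypothetical helper: SipHash-2-4 of `data` under the 128-bit key (k0, k1)."""
    # Initial state: key words xored with the algorithm's fixed constants.
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    n = len(data)
    full = n - (n % 8)

    # Compression: consume the input in little-endian blocks of 8 bytes,
    # with two SipRounds per block.
    for off in range(0, full, 8):
        (m,) = struct.unpack_from("<Q", data, off)
        v3 ^= m
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0 ^= m

    # Last block: the remaining 0..7 bytes, with (length mod 256) in the top byte.
    b = ((n & 0xFF) << 56) | int.from_bytes(data[full:], "little")
    v3 ^= b
    v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    v0 ^= b

    # Finalization: four more SipRounds, then fold the state into 64 bits.
    v2 ^= 0xFF
    for _ in range(4):
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3
```

In terms of the fragment above, siphash24(prefix, suffix, data) would correspond to the value b before the cast to Py_uhash_t. On a 32-bit build that cast would presumably keep only the low 32 bits.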

Current hashing algorithm:
$ ./python -m timeit -s "x = b'a' * int(1E7)" "hash(x)"
1000000 loops, best of 3: 0.39 usec per loop

SipHash:
$ ./python -m timeit -s "x = b'a' * int(1E7)" "hash(x)"
1000000 loops, best of 3: 0.381 usec per loop
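
One caveat when reading these numbers: as far as I know, CPython stores the hash of a bytes object on the object after the first call, so repeated hash(x) on the same x mostly measures a cached lookup rather than the hashing loop. A rough sketch of a measurement that forces a fresh computation each time could look like the following. The variable names and the smaller 10**6 size are choices made for this example, the copy cost is timed separately so that it can be subtracted, and the globals argument of timeit needs Python 3.5 or later.

```python
import timeit

size = 10**6  # smaller than the 1E7 used above, to keep the run short
x = b"a" * size
buf = bytearray(x)  # mutable source, so bytes(buf) always makes a new object

# Same object every time: after the first call this is expected to be
# a cached lookup, not a pass over the data.
cached = min(timeit.repeat("hash(x)", globals={"x": x},
                           number=1000, repeat=3)) / 1000

# New bytes object on every iteration, so hash() has to process the data.
fresh = min(timeit.repeat("hash(bytes(buf))", globals={"buf": buf},
                          number=100, repeat=3)) / 100

# Cost of the copy alone, to subtract from the figure above.
copy_only = min(timeit.repeat("bytes(buf)", globals={"buf": buf},
                              number=100, repeat=3)) / 100

print("cached hash(x):        %.3f usec" % (cached * 1e6))
print("copy + hash:           %.3f usec" % (fresh * 1e6))
print("copy only:             %.3f usec" % (copy_only * 1e6))
print("hash only (estimated): %.3f usec" % ((fresh - copy_only) * 1e6))
```

Running this once on a build with the current algorithm and once on a build with SipHash should give a comparison that reflects the hashing loops themselves.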
History
Date User Action Args
2012-11-06 17:10:47  christian.heimes  set  recipients: + christian.heimes, lemburg, arigo, vstinner, benjamin.peterson, Arfrever, alex, dmalcolm, Giovanni.Bajo, PaulMcMillan, serhiy.storchaka, Vlado.Boza, koniiiik, camara
2012-11-06 17:10:45  christian.heimes  set  messageid: <1352221845.63.0.359157467628.issue14621@psf.upfronthosting.co.za>
2012-11-06 17:10:45  christian.heimes  link  issue14621 messages
2012-11-06 17:10:42  christian.heimes  create