Message 175053 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	Giovanni.Bajo
Recipients	Arfrever, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, dmalcolm, koniiiik, lemburg, serhiy.storchaka, vstinner
Date	2012-11-07.08:44:40
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1352277881.07.0.639370104349.issue14621@psf.upfronthosting.co.za>
In-reply-to

Content
Until it's broken with a yet-unknown attack, SipHash is a pseudo-random function and as such it does uniformly distribute values across the output space, and never leak any information on the key (the randomized seed). Being designed by cryptographers, it is likely that it doesn't turn out to be a "fail" like the solution that was just released (no offense intended, but it's been a large-scale PR failure). As long as we don't introduce bias while reducing SipHash's output to fit the hash table size (so for instance, usage of modulus is not appropriate), then the hash function should behave very well. Any data type can be supplied to SipHash, including numbers; you just need to take their (platform-dependent) memory representation and feed it to SipHash. Obviously it will be much much slower than the current function which used to be hash(x) = x (before randomization), but that's the price to pay to avoid security issues.

Until it's broken with a yet-unknown attack, SipHash is a pseudo-random function and as such it does uniformly distribute values across the output space, and never leak any information on the key (the randomized seed). Being designed by cryptographers, it is likely that it doesn't turn out to be a "fail" like the solution that was just released (no offense intended, but it's been a large-scale PR failure).

As long as we don't introduce bias while reducing SipHash's output to fit the hash table size (so for instance, usage of modulus is not appropriate), then the hash function should behave very well.

Any data type can be supplied to SipHash, including numbers; you just need to take their (platform-dependent) memory representation and feed it to SipHash. Obviously it will be much much slower than the current function which used to be hash(x) = x (before randomization), but that's the price to pay to avoid security issues.

History
Date	User	Action	Args
2012-11-07 08:44:41	Giovanni.Bajo	set	recipients: + Giovanni.Bajo, lemburg, arigo, vstinner, christian.heimes, benjamin.peterson, Arfrever, alex, dmalcolm, PaulMcMillan, serhiy.storchaka, Vlado.Boza, koniiiik, camara
2012-11-07 08:44:41	Giovanni.Bajo	set	messageid: <1352277881.07.0.639370104349.issue14621@psf.upfronthosting.co.za>
2012-11-07 08:44:41	Giovanni.Bajo	link	issue14621 messages
2012-11-07 08:44:40	Giovanni.Bajo	create