Author lemburg
Recipients christian.heimes, inada.naoki, lemburg, rhettinger, serhiy.storchaka
Date 2017-02-01.12:28:15
Message-id <5cfb9665-4218-1302-feac-7d10720e4a8c@egenix.com>
In-reply-to <1485950624.11.0.457171347122.issue29410@psf.upfronthosting.co.za>
Content
On 01.02.2017 13:03, INADA Naoki wrote:
> Maybe, we should remove Py_HASH_CUTOFF completely?

I think we ought to look for a better hash algorithm
for short strings, e.g. a CRC-based one.

Some interesting investigations on this:
http://www.orthogonal.com.au/computers/hashstrings/
http://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
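
As a rough illustration of the CRC idea, here is a minimal sketch that
hashes short byte strings with CRC-32 from the standard zlib module and
tallies how the results spread over a small table. The names
crc_hash and bucket_counts are hypothetical helpers made up for this
example; nothing here is CPython's actual string hash.

```python
import zlib


def crc_hash(data: bytes) -> int:
    """Return the CRC-32 of data as an unsigned 32-bit integer."""
    # zlib.crc32 already returns an unsigned value on Python 3;
    # the mask only makes the 32-bit range explicit.
    return zlib.crc32(data) & 0xFFFFFFFF


def bucket_counts(keys, n_buckets):
    """Count how many keys land in each of n_buckets slots (hypothetical helper)."""
    counts = [0] * n_buckets
    for key in keys:
        counts[crc_hash(key) % n_buckets] += 1
    return counts


if __name__ == "__main__":
    # All one- and two-letter lowercase ASCII keys as a toy "short string" set.
    letters = [bytes([c]) for c in range(ord("a"), ord("z") + 1)]
    keys = letters + [a + b for a in letters for b in letters]
    print(bucket_counts(keys, 16))
```

The per-bucket counts give a quick feel for the distribution on a given
key set; they say nothing about speed, which would need separate timing.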

PS: It may also be wise not to use hash randomization
with these, since the secret would leak. Again, collision
counting comes to mind: for short strings, a good bucket
distribution matters much more than crypto security to
protect hash secrets :-)
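
To make the collision counting idea concrete, here is a toy sketch, not
how CPython's dict works: an open-addressing table with linear probing
that counts occupied slots it has to skip during an insert and raises
once a limit is exceeded. The class name, the exception, and the
max_collisions parameter are all assumptions made up for this example.

```python
class TooManyCollisions(Exception):
    """Raised when one insert skips more slots than the table allows."""


class CountingTable:
    """Toy fixed-size open-addressing table with a per-insert collision limit."""

    def __init__(self, size=8, max_collisions=3):
        self.slots = [None] * size
        self.max_collisions = max_collisions

    def insert(self, key, hash_fn=hash):
        """Store key and return the number of collisions seen on the way."""
        size = len(self.slots)
        index = hash_fn(key) % size
        collisions = 0
        # Linear probing: step past slots held by other keys.
        while self.slots[index] is not None and self.slots[index] != key:
            collisions += 1
            if collisions > self.max_collisions:
                raise TooManyCollisions(f"{collisions} collisions for {key!r}")
            index = (index + 1) % size
        self.slots[index] = key
        return collisions


if __name__ == "__main__":
    table = CountingTable(size=8, max_collisions=3)
    try:
        # A deliberately bad hash function sends every key to slot 0.
        for key in ["a", "b", "c", "d", "e"]:
            table.insert(key, hash_fn=lambda k: 0)
    except TooManyCollisions as exc:
        print("rejected:", exc)
```

With a limit like this, a pathological key set is caught by its effect
on the table rather than by keeping the hash function secret.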
History
Date User Action Args
2017-02-01 12:28:15  lemburg  set  recipients: + lemburg, rhettinger, christian.heimes, inada.naoki, serhiy.storchaka
2017-02-01 12:28:15  lemburg  link  issue29410 messages
2017-02-01 12:28:15  lemburg  create