This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author ReneSac
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, ReneSac, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, cvrebert, dmalcolm, gregory.p.smith, koniiiik, lemburg, mark.dickinson, sbermeister, serhiy.storchaka, vstinner, Łukasz.Rekucki
Date 2012-11-30.18:12:09
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1354299130.95.0.556177271791.issue14621@psf.upfronthosting.co.za>
In-reply-to
Content
Christian Heimes: It has always been trivial to artificially generate collisions for fast hashes designed for hash tables, like MurmurHash. I wouldn't call Murmurhash3 "busted" because of that, as this was never a design goal. It is a known propriety of this type of hash: you do that basically running them backwards. You can't even call that cryptanalysis. 

Of course, you need the seed to search those collisions, but from this thread it seems very difficult, if not impossible, not to leak the random seed to the attacker. 

I see the various collision counting alternatives proposed as the less intrusive and definitive solution for this problem. It also has the benefit that it can work for any type of key. Some pseudo code:

hash as always, with a fast hash.
if reprobes > initial_threshold:
    if the table has only one key type AND it has a robust comparison method:
        store the collisions in a O(log n) worst case structure (tree).
    elif the object has a secondary slow secure hash:
        try searching/inserting the key again with the new secure hash.
        It works like a double hashing reprobing hash table.
    elif collisions > max_threshold:
        raise Exception("Under attack or the object is using a crappy hash, with no fallback.")

The first option, the O(log n) structure, can be ignored as unnecessary complication (even though there is already a path implementing that), but I suspect it may be faster than a secure hash. If not, then there is really no point in this option, except if the secure hash proves to be not so secure.
History
Date User Action Args
2012-11-30 18:12:11ReneSacsetrecipients: + ReneSac, lemburg, arigo, gregory.p.smith, mark.dickinson, vstinner, christian.heimes, benjamin.peterson, Arfrever, alex, cvrebert, dmalcolm, Giovanni.Bajo, PaulMcMillan, serhiy.storchaka, Vlado.Boza, koniiiik, sbermeister, camara, Łukasz.Rekucki
2012-11-30 18:12:10ReneSacsetmessageid: <1354299130.95.0.556177271791.issue14621@psf.upfronthosting.co.za>
2012-11-30 18:12:10ReneSaclinkissue14621 messages
2012-11-30 18:12:09ReneSaccreate