This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author lemburg
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, dmalcolm, koniiiik, lemburg, serhiy.storchaka, vstinner
Date 2012-11-07.08:34:22
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1352277266.41.0.660849293567.issue14621@psf.upfronthosting.co.za>
In-reply-to
Content
Here's a demo patch (against Python 2.7) which counts hash value collisions and slot collisions. I had posted that in the original ticket where we discussed the hash problem (http://bugs.python.org/issue14621).

This avoids issues like attack 1 mentioned in http://mail.python.org/pipermail/python-dev/2012-January/115726.html

Attack 2 in that email can easily be worked around by reducing the collision limit to a smaller number.

Even better: An application could even dynamically adjust the maximum collision counts by catching the exception and setting a new upper limit depending on its knowledge of the field of application - warning the sysadmin of a potential problem and allowing her to take action. That way the application could start with a low safe maximum collision number of say 100 and then raise the limit in a controlled way.

BTW: When trying out new hash functions, you need to look not only at the performance of the hash function, but also (and more importantly) at the effect on dictionaries.

Just as reminder: The integer key problem is still open. Using the demo script http://bugs.python.org/file24300/integercollision.py, it's easy to keep Python going for minutes without any major effort.

I don't understand why we are only trying to fix the string problem and completely ignore other key types. Strings are easy to send to a web server, yes, but there are other applications out there which take input data from other sources/formats as well (e.g. csv files). And it's not unusual to convert input strings to integers to use them as dictionary keys, say item IDs or counts. So while the string keys may not cause a problem, the integer keys still might.
History
Date User Action Args
2012-11-07 08:34:26lemburgsetrecipients: + lemburg, arigo, vstinner, christian.heimes, benjamin.peterson, Arfrever, alex, dmalcolm, Giovanni.Bajo, PaulMcMillan, serhiy.storchaka, Vlado.Boza, koniiiik, camara
2012-11-07 08:34:26lemburgsetmessageid: <1352277266.41.0.660849293567.issue14621@psf.upfronthosting.co.za>
2012-11-07 08:34:26lemburglinkissue14621 messages
2012-11-07 08:34:24lemburgcreate