This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author lemburg
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, eric.snow, fx5, georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, jcea, lemburg, mark.dickinson, neologix, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
Date 2012-01-23.13:07:24
SpamBayes Score 1.0272339e-12
Marked as misclassified No
Message-id <4F1D5B87.8050907@egenix.com>
In-reply-to <1327100080.4992.149.camel@surprise>
Content
Dave Malcolm wrote:
> 
> Dave Malcolm <dmalcolm@redhat.com> added the comment:
> 
> On Fri, 2012-01-06 at 12:52 +0000, Marc-Andre Lemburg wrote:
>> Marc-Andre Lemburg <mal@egenix.com> added the comment:
>>
>> Demo patch implementing the collision limit idea for Python 2.7.
>>
>> ----------
>> Added file: http://bugs.python.org/file24151/hash-attack.patch
>>
> 
> Marc: is this the latest version of your patch?

Yes. As mentioned in the above message, it's just a demo of how
the collision limit idea can be implemented.

> Whether or not we go with collision counting and/or adding a random salt
> to hashes and/or something else, I've had a go at updating your patch
> 
> Although debate on python-dev seems to have turned against the
> collision-counting idea, based on flaws reported by Frank Sievertsen
> http://mail.python.org/pipermail/python-dev/2012-January/115726.html
> it seemed to me to be worth at least adding some test cases to flesh out
> the approach.  Note that the test cases deliberately avoid containing
> "hostile" data.

Martin's example is really just a red herring: it doesn't matter
where the hostile data originates or how it gets into the application.
There are many ways an attacker can get the O(n^2) worst case
timing triggered.

Frank's example is an attack on the second possible way to
trigger the O(n^2) behavior. See msg150724 further above where I
listed the two possibilities:

"""
An attack can be based on trying to find many objects with the same
hash value, or trying to find many objects that, as they get inserted
into a dictionary, very often cause collisions due to the collision
resolution algorithm not finding a free slot.
"""

My demo patch only addresses the first variant. In order to cover
the second variant as well, you'd have to count and limit the
number of iterations in the perturb for-loop of the lookdict()
functions where the hash value of the slot does not match the
key's hash value.

Note that the second variant is both a lot less likely to trigger
(due to the dict getting resized on a regular basis) and the
code involved a lot faster than the code for the first
variant (which requires a costly object comparison), so the
limit for the second variant would have to be somewhat higher
than for the first.

BTW: The collision counting patch chunk for the string dicts in my
demo patch is wrong. I've attached a corrected version. In the
original patch it was counting both collision variants with the
same counter and limit.
Files
File name Uploaded
hash-attack-2.patch lemburg, 2012-01-23.13:07:23
History
Date User Action Args
2012-01-23 13:07:26lemburgsetrecipients: + lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, pitrou, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, PaulMcMillan, fx5
2012-01-23 13:07:25lemburglinkissue13703 messages
2012-01-23 13:07:24lemburgcreate