
Author lemburg
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.araujo, georg.brandl, gvanrossum, gz, jcea, lemburg, pitrou, skrah, terry.reedy, tim.peters, v+python, vstinner, zbysz
Date 2012-01-11.16:03:18
Message-id <4F0DB2C1.6010308@egenix.com>
In-reply-to <1326293048.3531.6.camel@localhost.localdomain>
Content
Antoine Pitrou wrote:
> 
> Antoine Pitrou <pitrou@free.fr> added the comment:
> 
>> OTOH, the collision counting patch is very simple, doesn't have
>> the performance issues and provides real protection against the
>> attack.
> 
> I don't know about real protection: you can still slow down dict
> construction by 1000x (the number of allowed collisions per lookup),
> which can be enough combined with a brute-force DOS.

On my slow dev machine 1000 collisions run in around 22ms:

python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 1000))"
100 loops, best of 3: 22.4 msec per loop

Using this for a DOS attack would be rather noisy, much unlike
sending a single POST.
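
For reference, the keys in the timeit call above only work as a worst case
because, on a 64-bit CPython 2.x build, long hashes are effectively reduced
modulo 2**64 - 1, so every multiple of that value hashes alike. A quick check
of that assumption:

M = 2**64 - 1
print len(set(hash(x * M) for x in xrange(1, 1000)))   # prints 1: all keys collide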

Note that the choice of 1000 as the limit is rather arbitrary. I chose
it because it is high enough that it is very unlikely to be hit by an
application that is not written to trigger it, and low enough to still
provide good run-time behavior. Perhaps an even lower figure would be
better.
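
The counting itself is cheap, since the probing loop already visits each
colliding slot. A rough sketch of the idea in pure Python (this is not the
actual C patch; the probing is simplified to linear probing and the names
are made up):

class CollisionLimitError(Exception):
    pass

def insert_slot(table, key, value, mask, limit=1000):
    # Probe until we find the key or an empty slot; give up once we
    # have walked past 'limit' occupied slots for this single insert.
    i = hash(key) & mask
    collisions = 0
    while table[i] is not None and table[i][0] != key:
        collisions += 1
        if collisions > limit:
            raise CollisionLimitError("too many collisions for key %r" % (key,))
        i = (i + 1) & mask    # linear probing for illustration only
    table[i] = (key, value)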

> Also, how about false positives? Having legitimate programs break
> because of legitimate data would be a disaster.

Yes, which is why the patch should be disabled by default (using
an env var) in dot-releases. It's probably also a good idea to
make the limit configurable so it can be adjusted to one's needs.
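
Something along these lines could be read at interpreter startup (the
variable names here are made up for illustration; the patch would define
its own):

import os

# Hypothetical switches, names not taken from the actual patch:
collision_check = os.environ.get("PYTHONCOLLISIONCHECK", "0") != "0"
collision_limit = int(os.environ.get("PYTHONCOLLISIONLIMIT", "1000"))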

Still, it is *very* unlikely that you run into real data causing
more than 1000 collisions for a single insert.

For full protection, the universal hash method idea would have
to be implemented (adding a parameter to the hash methods so
that they can be parametrized). This would then allow switching
the dict to an alternative hash implementation that resolves the
collision problem, in case the implementation detects a high number
of collisions.
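
A rough sketch of what a parametrized string hash could look like (the
multiply/xor scheme and the names are only illustrative, not a proposed
API; the real change would happen at the C level):

def parametrized_hash(s, seed):
    # Same structure as the classic string hash, but keyed with a
    # per-dict seed: changing the seed changes which inputs collide,
    # so precomputed collision sets stop working.
    h = seed & 0xFFFFFFFFFFFFFFFF
    for c in s:
        h = ((h * 1000003) ^ ord(c)) & 0xFFFFFFFFFFFFFFFF
    return (h ^ len(s)) & 0xFFFFFFFFFFFFFFFF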