This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author lemburg
Recipients Arfrever, Giovanni.Bajo, PaulMcMillan, ReneSac, Vlado.Boza, alex, arigo, benjamin.peterson, camara, christian.heimes, cvrebert, dmalcolm, gregory.p.smith, koniiiik, lemburg, mark.dickinson, sbermeister, serhiy.storchaka, vstinner, Łukasz.Rekucki
Date 2012-11-30.20:34:36
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <>
In-reply-to <>
On 30.11.2012 21:06, Serhiy Storchaka wrote:
> Serhiy Storchaka added the comment:
> René, a balanced tree requires rebalancing on every (or almost every) item for some special (sorted) data sequences.

Sure, but that's still O(N*log N) for an attack scenario, not O(N^2).

I think the main point here is that using hash tables or dictionaries
for these things without any size limit is simply a wrong development

Developers need to be made aware of the problem and given data
structures that they can use more safely to store the data and
instead of trying to hide away the problem using some crypto
approach, it's better to offer methods that allow developers to
gain control over the problem (e.g. via an exception) rather than
hoping for few hash collisions.

If a developer has to build a lookup table from untrusted data,
she should be able to say:

    mapping = insert_untrusted_data(source)
except UnderAttackError:
    return 'no thank you'

instead of:

# Hmm, let's hope this doesn't bomb...
mapping = insert_untrusted_data(source)

At the moment, the best thing you can do is insert the data
in chunks and measure the time it takes for each chunk. If it
takes too long, you raise the UnderAttackError.

If we make the collision counting limit a per-dictionary adjustable
limit with some global default limit, the above could be written

    mapping = {}
    mapping.max_collisions = 100
except CollisionLimitError:
    return 'no thank you'

Aside: It's better to know you worst case and program for it, rather
than to ignore the problem and hope an attacker won't find your secret
key. In the above case, if you know what the worst-case timing is for
a 100-item dictionary, you can safely deal with it, possibly adjusting
the limit to more suitable value for your application.

Marc-Andre Lemburg

Professional Python Services directly from the Source  (#1, Nov 30 2012)
>>> Python Projects, Consulting and Support ...
>>> mxODBC.Zope/Plone.Database.Adapter ...
>>> mxODBC, mxDateTime, mxTextTools ...
2012-11-28: Released eGenix mx Base 3.2.5 ...
2013-01-22: Python Meeting Duesseldorf ...                 53 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! :::: Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
Date User Action Args
2012-11-30 20:34:37lemburgsetrecipients: + lemburg, arigo, gregory.p.smith, mark.dickinson, vstinner, christian.heimes, benjamin.peterson, Arfrever, alex, cvrebert, dmalcolm, Giovanni.Bajo, PaulMcMillan, serhiy.storchaka, Vlado.Boza, koniiiik, sbermeister, camara, Łukasz.Rekucki, ReneSac
2012-11-30 20:34:37lemburglinkissue14621 messages
2012-11-30 20:34:36lemburgcreate