Author dmalcolm
Recipients Arach, Arfrever, Huzaifa.Sidhpurwala, Jim.Jewett, Mark.Shannon, PaulMcMillan, Zhiping.Deng, alex, barry, benjamin.peterson, christian.heimes, dmalcolm, eric.snow, fx5, georg.brandl, grahamd, gregory.p.smith, gvanrossum, gz, haypo, jcea, lemburg, mark.dickinson, merwok, neologix, pitrou, skrah, terry.reedy, tim.peters, v+python, zbysz
Date 2012-01-21.21:07:40
SpamBayes Score 1.41266e-09
Marked as misclassified No
Message-id <1327176822.4992.257.camel@surprise>
In-reply-to <1327172141.3382.12.camel@localhost.localdomain>
Content
Well, the old attempt was hardly robust :)

Can anyone see any vulnerabilities in this approach?

Yeah; I was mostly trying to add raw data (to help me debug the
implementation).

I wonder if the dict statistics should be exposed with extra attributes
or a method on the dict; e.g. a __stats__ attribute, something like
this:

LargeDictStats(keys=58, mask=127, insertions=53, iterations=1697)

SmallDictStats(keys=3, mask=7)

or somesuch. Though that's a detail, I think.

> > Caveats:
> > * no overflow handling (what happens after 2**32 modifications to a
> > long-lived dict on a 32-bit build?) - though that's fixable.
> 
> How do you suggest to fix it?

If the dict is heading towards overflow of these counters, it's either
long-lived, or *huge*.

Possible approaches:
(a) use 64-bit counters rather than 32-bit, though that's simply
delaying the inevitable
(b) when one of the counters gets large, divide both of them by a
constant (e.g. 2).  We're interested in their ratio, so dividing both by
a constant preserves this.

By "a constant" do you mean from the perspective of big-O notation, or
do you mean that it should be hardcoded (I was wondering if it should be
a sys variable/environment variable etc?).

> We're interested in the actual slowdown factor, which a constant factor
> models adequately. It's the slowdown factor which makes a DOS attack
> using this technique efficient. Whether or not dict construction truely
> degenerates into a O(n**2) operation is less relevant.

OK.

> There needs to be a way to disable it: an environment variable would be
> the minimum IMO.

e.g. set it to 0 to enable it, set it to nonzero to set the scale
factor.
Any idea what to call it? 

PYTHONALGORITHMICCOMPLEXITYTHRESHOLD=0 would be quite a mouthful.

OK

BTW, presumably if we do it, we should do it for sets as well?
History
Date User Action Args
2012-01-21 21:07:41dmalcolmsetrecipients: + dmalcolm, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, pitrou, haypo, christian.heimes, benjamin.peterson, merwok, grahamd, Arfrever, v+python, alex, zbysz, skrah, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, PaulMcMillan, fx5
2012-01-21 21:07:41dmalcolmlinkissue13703 messages
2012-01-21 21:07:40dmalcolmcreate