Message151745
> I wonder if the dict statistics should be exposed with extra attributes
> or a method on the dict; e.g. a __stats__ attribute, something like
> this:
>
> LargeDictStats(keys=58, mask=127, insertions=53, iterations=1697)
>
> SmallDictStats(keys=3, mask=7)
Sounds a bit overkill, and it shouldn't be a public API (which
__methods__ are). Even a private API on dicts would quickly become
visible, since dicts are so pervasive.
> > > Caveats:
> > > * no overflow handling (what happens after 2**32 modifications to a
> > > long-lived dict on a 32-bit build?) - though that's fixable.
> >
> > How do you suggest to fix it?
>
> If the dict is heading towards overflow of these counters, it's either
> long-lived, or *huge*.
>
> Possible approaches:
> (a) use 64-bit counters rather than 32-bit, though that's simply
> delaying the inevitable
Well, even assuming one billion lookup probes per second on a single
dictionary, the inevitable will happen in 584 years with a 64-bit
counter (but only 4 seconds with a 32-bit counter).
A real issue, though, may be the cost of 64-bit arithmetic on 32-bit
CPUs.
> (b) when one of the counters gets large, divide both of them by a
> constant (e.g. 2). We're interested in their ratio, so dividing both by
> a constant preserves this.
Sounds good, although we may want to pull this outside of the critical
loop.
> By "a constant" do you mean from the perspective of big-O notation, or
> do you mean that it should be hardcoded (I was wondering if it should be
> a sys variable/environment variable etc?).
Hardcoded, as in your patch.
> > There needs to be a way to disable it: an environment variable would be
> > the minimum IMO.
>
> e.g. set it to 0 to enable it, set it to nonzero to set the scale
> factor.
0 to enable it sounds misleading. I'd say:
- 0 to disable it
- 1 to enable it and use the default scaling factor
- >= 2 to enable it and set the scaling factor
> Any idea what to call it?
PYTHONDICTPROTECTION ?
Most people should either enable or disable it, not change the scaling
factor.
> BTW, presumably if we do it, we should do it for sets as well?
Yeah, and use the same env var / sys function. |
|
Date |
User |
Action |
Args |
2012-01-21 22:20:48 | pitrou | set | recipients:
+ pitrou, lemburg, gvanrossum, tim.peters, barry, georg.brandl, terry.reedy, gregory.p.smith, jcea, mark.dickinson, vstinner, christian.heimes, benjamin.peterson, eric.araujo, grahamd, Arfrever, v+python, alex, zbysz, skrah, dmalcolm, gz, neologix, Arach, Mark.Shannon, eric.snow, Zhiping.Deng, Huzaifa.Sidhpurwala, Jim.Jewett, PaulMcMillan, fx5 |
2012-01-21 22:20:47 | pitrou | link | issue13703 messages |
2012-01-21 22:20:47 | pitrou | create | |
|