Message291020
I also feel set created from iterator is too large sometimes.
Especially, frozenset is used for "unordered tuple" dict key sometimes.
$ python -c 'N = 6000; from sys import getsizeof; s = [getsizeof(frozenset(range(n))) for n in range(N)]; print( [(n, s[n]) for n in range(N) if not n or s[n] != s[n-1]] )'
[(0, 224), (5, 736), (21, 2272), (85, 8416), (341, 32992), (1365, 131296), (5461, 524512)]
$ python -c 'N = 6000; from sys import getsizeof; s = [getsizeof(dict.fromkeys(range(n))) for n in range(N)]; print( [(n, s[n]) for n in range(N) if not n or s[n] != s[n-1]] )'
[(0, 240), (6, 368), (11, 648), (22, 1184), (43, 2280), (86, 4704), (171, 9320), (342, 18528), (683, 36968), (1366, 73824), (2731, 147560), (5462, 295008)]
set size increases by *4 each resize.
This is code of set_add_entry():
if ((size_t)so->fill*5 < mask*3)
return 0;
return set_table_resize(so, so->used>50000 ? so->used*2 : so->used*4);
If we simply replace `so->used>50000 ? so->used*2 : so->used*4` with `so->used*2`, there may be not enough room when there are some dummy entries. (used + dummy = fill).
But how about replacing `*4` with `*3`?
In general, load factor after resize will be changed from 12.5% ~ 25% to 16.6% ~ 33.3%. It seems OK to me.
In case of no removal (e.g. constructor called with iterator), set size increases by *2 each time:
>>> for n in range(3, 20):
... print(math.ceil(2 ** n * 3/5) * 3, 2 ** (n+1))
15 16
30 32
60 64
117 128
231 256
462 512
924 1024
1845 2048
3687 4096
7374 8192
14748 16384
29493 32768
58983 65536
117966 131072
235932 262144
471861 524288
943719 1048576 |
|
Date |
User |
Action |
Args |
2017-04-02 04:05:58 | methane | set | recipients:
+ methane, rhettinger, serhiy.storchaka |
2017-04-02 04:05:57 | methane | set | messageid: <1491105957.4.0.947348008783.issue29961@psf.upfronthosting.co.za> |
2017-04-02 04:05:57 | methane | link | issue29961 messages |
2017-04-02 04:05:55 | methane | create | |
|