Message312250
Luís, what you really need for the problem you outline is an immutable set with one operation, 'in', aside from create and delete. If specialized to strings only, it could internally stores only character sequences, rather than Python objects. I suspect someone has written such a class (in C), but did not find anything on pypi.python.org.
Lacking that, the following experiment suggests that you might be able to restore near O(n) time by partitioning the set of keys into a collection of sets. My 16M set took 6.48 and 1.80 seconds. Times 4 is 25.9 and 7.2. My 64M set took 28 and 19 seconds, but 4 16M sets take 26.3 and 7.5 seconds, only about 5% more than the x4 target.
print(timeit.Timer(
's = [{str(n) for n in range(i*1000000, (i+16)*1000000)}'
' for i in (0, 16, 32, 48)]')
.timeit(number=1))
print(timeit.Timer(
'del s',
's = [{str(n) for n in range(i*1000000, (i+16)*1000000)}'
' for i in (0, 16, 32, 48)]')
.timeit(number=1)) |
|
Date |
User |
Action |
Args |
2018-02-16 19:57:30 | terry.reedy | set | recipients:
+ terry.reedy, lemburg, tim.peters, twouters, rhettinger, vstinner, luis@luispedro.org, methane, serhiy.storchaka |
2018-02-16 19:57:30 | terry.reedy | set | messageid: <1518811050.87.0.467229070634.issue32846@psf.upfronthosting.co.za> |
2018-02-16 19:57:30 | terry.reedy | link | issue32846 messages |
2018-02-16 19:57:30 | terry.reedy | create | |
|