Message105489
set.difference(other), when other is also a set, basically does::

    res = set()
    for elem in self:
        if elem not in other:
            res.add(elem)

This is wasteful when len(self) is much greater than len(other):
$ python -m timeit -s "s = set(range(100000)); sd = s.difference; empty = set()" "sd(empty)"
100 loops, best of 3: 12.8 msec per loop
$ python -m timeit -s "s = set(range(10)); sd = s.difference; empty = set()" "sd(empty)"
1000000 loops, best of 3: 1.18 usec per loop
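
To make the cost visible without timing noise, here is a minimal pure-Python model of that loop which also counts membership tests. The function name naive_difference and the counter are made up for illustration; the real code is C and is not shown in this message.

```python
def naive_difference(self_set, other):
    """Model of the loop above; returns (result, number of membership tests)."""
    res = set()
    tests = 0
    for elem in self_set:
        tests += 1  # one "elem not in other" check per element of self_set
        if elem not in other:
            res.add(elem)
    return res, tests


big = set(range(100000))
small = set(range(10))
empty = set()

res_big, tests_big = naive_difference(big, empty)
res_small, tests_small = naive_difference(small, empty)

# The work done tracks len(self), even though other is empty.
print(tests_big, tests_small)
```

The number of tests equals len(self) in both cases, which matches the roughly proportional gap between the two timings above.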
Here's a patch that compares the lengths of self and other before that loop, and if len(self) is greater, swaps them. The new timeit results are:
$ python -m timeit -s "s = set(range(100000)); sd = s.difference; empty = set()" "sd(empty)"
1000000 loops, best of 3: 0.289 usec per loop
$ python -m timeit -s "s = set(range(10)); sd = s.difference; empty = set()" "sd(empty)"
1000000 loops, best of 3: 0.294 usec per loop
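
For illustration only, here is one way a length check can let the loop run over the smaller set. This is a pure-Python sketch, not the attached patch: the name size_aware_difference is hypothetical, and the copy-then-discard strategy is my assumption about how to keep the result correct, since difference is not symmetric and the operands cannot simply trade places.

```python
def size_aware_difference(self_set, other):
    """Return self_set - other, looping over whichever set is smaller."""
    if len(self_set) > len(other):
        # Assumed strategy: copy self_set, then drop the few elements of other.
        res = set(self_set)
        for elem in other:
            res.discard(elem)  # discard() ignores elements that are absent
        return res
    # Otherwise keep the original loop over self_set.
    res = set()
    for elem in self_set:
        if elem not in other:
            res.add(elem)
    return res


big = set(range(100000))
few = {1, 2, 3, 500000}
result = size_aware_difference(big, few)
```

In this sketch the copy still costs time proportional to len(self), so it only removes the per-element membership tests; it is not expected to reproduce the timings quoted above.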

Date | User | Action | Args
2010-05-11 09:04:19 | spiv | set | recipients: + spiv
2010-05-11 09:04:19 | spiv | set | messageid: <1273568659.49.0.865922101958.issue8685@psf.upfronthosting.co.za>
2010-05-11 09:04:17 | spiv | link | issue8685 messages
2010-05-11 09:04:16 | spiv | create |