Message105450
The problem is apparently due to the fact that small_set -= large_set
iterates over the large set removing entries from small_set while more
efficient small_set - large_set builds a new set iterating over a
small_set and adding items that are not in the large set. Since
lookups are O(1), it is more efficient to iterate over a small set
while looking up a large set than the other way around. I am
attaching a minimalist patch which gives up in-place update for s1 -=
s2 if len(s1) < len(s2) and effectively replaces it with s1 = s1 - s2.
The code can be further optimized by replicating set_difference logic
in set_isub instead of relying on fall back behavior, but the big
question here with whether it is acceptable to give up preserving set
identity in in-place subtract to gain performance. |
|
Date |
User |
Action |
Args |
2010-05-10 17:44:37 | belopolsky | set | recipients:
+ belopolsky, rhettinger, eric.smith, ezio.melotti, abacabadabacaba |
2010-05-10 17:44:35 | belopolsky | link | issue8425 messages |
2010-05-10 17:44:35 | belopolsky | create | |
|