Author belopolsky
Recipients abacabadabacaba, belopolsky, eric.smith, ezio.melotti, rhettinger
Date 2010-05-10.17:44:35
SpamBayes Score 0.00177281
Marked as misclassified No
Message-id <AANLkTilSTXfXChxQwwd9Ko5mQUwApt5_p_SB3rvVHg3P@mail.gmail.com>
In-reply-to <1273432523.79.0.0658499297918.issue8425@psf.upfronthosting.co.za>
Content
The problem is apparently due to the fact that small_set -= large_set
iterates over the large set removing entries from small_set while more
efficient small_set - large_set builds a new set iterating over a
small_set and adding items that are not in the large set.  Since
lookups are O(1), it is more efficient to iterate over a small set
while looking up a large set than the other way around.  I am
attaching a minimalist patch which gives up in-place update for s1 -=
s2 if len(s1) < len(s2) and effectively replaces it with s1 = s1 - s2.
 The code can be further optimized by replicating set_difference logic
in set_isub instead of relying on fall back behavior, but the big
question here  with whether it is acceptable to give up preserving set
identity in in-place subtract to gain performance.
Files
File name Uploaded
issue8425.diff belopolsky, 2010-05-10.17:44:34
History
Date User Action Args
2010-05-10 17:44:37belopolskysetrecipients: + belopolsky, rhettinger, eric.smith, ezio.melotti, abacabadabacaba
2010-05-10 17:44:35belopolskylinkissue8425 messages
2010-05-10 17:44:35belopolskycreate