This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author rhettinger
Recipients Dennis Sweeney, eamartin, rhettinger, zach.ware
Date 2021-02-12.02:45:53
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1613097954.42.0.141957374193.issue43198@roundup.psfhosted.org>
In-reply-to
Content
Addenda.  Top use cases for sets (roughly in order of commonality):

1. Eliminate duplicates from an iterable input and loop over the result.
2. Store elements in a set just once but do many membership tests.
3. Perform data analysis on multiple sets using union, intersection, difference, and isdisjoint.
4. Remove elements from a set, one at a time, until it is empty.
5. Algorithms and that alternately add and remove different elements over time (toposort, NFA, DFA, etc).  

The first three are all penalized by an extra inner loop test and the loss of the register to track the freeslot.  Those use cases get no offsetting benefit.

Case 4 doesn't exercise the dummy reuse at all.  So, it is unaffected.

The last is approximately a net wash.  It pays the inner loop price, suffers the loss of a register, and incurs an extra test outside the loop, but it occasionally gets lucky and reuses a freeslot. The benefit for reuse is that it is delays the inevitable resize which would have reclaimed the dummy entries earlier. (The comparable use case for dicts is LRU/LFU caches where new entries are added at the same rate as old entries are removed.  That use case also showed a net wash when freeslot tracking was removed from dicts).

Not on the list at all is the case of a large set, where exactly the same element is added and removed in a tight loop.  The payoff for this case is that the O(n) resize step never occurs; however, this case is so rare that there is no reason to preference it over the common cases.  It now has the same performance as case 5.
History
Date User Action Args
2021-02-12 02:52:36rhettingerunlinkissue43198 messages
2021-02-12 02:45:54rhettingersetrecipients: + rhettinger, zach.ware, eamartin, Dennis Sweeney
2021-02-12 02:45:54rhettingersetmessageid: <1613097954.42.0.141957374193.issue43198@roundup.psfhosted.org>
2021-02-12 02:45:54rhettingerlinkissue43198 messages
2021-02-12 02:45:53rhettingercreate