Message224071
Hi, I asked about the inconsistency of the "RuntimeError" being raised when mutating a container while iterating over it here [1], "set and dict iteration" on Aug 16, 2012.
[1] http://www.gossamer-threads.com/lists/python/python/1004659
I posted a patch on the ML but never submitted it. People's reaction seemed ambivalent. Now I have an idea for a different implementation. I'd like to take another shot at it.
It's one of the worst silent errors, since there's an error in the *iterator* when we call a *set* method. We're going to add something to make it safer, at least in the sense of getting a clear failure, if the programmer does something that's always been ill-advised.
We have a number of options for the implementation. We still have the option to introduce "IterationError", possibly a subclass of "RuntimeError". These options are still applicable:
1) Collection of iterators
. Invalidate all "open" iterators on mutating
. a) Linked list
.. i) Uncounted references
.. ii) Counted references
.. iii) Weak references
. b) Weak set
2) Version index / timestamp / "memo"
. Iterators check whether the container has been mutated
since they were created
. a) No overflow - Python longs
.. i) Reset index if no iterators left
. b) Overflow - C ints / shorts (silent error)
3) Iterator count
. Raise exception on mutation, not iteration
The new option is:
2d) Use a dedicated empty *object* for a timestamp or "memo". A new memo is created on every mutation. Before advancing, the iterator checks whether the current memo is a different object than it was when it was created.
Costs: The existing silent error is fairly rare. The container gains a pointer to its current memo. The iterator loses the cached length but gains a pointer to a memo. The memos are blank objects: a "Py ssize t" and a pointer with certain flags at time of writing. Speed is the same: comparing the lengths is replaced with comparing the memos.
Some caveats: The memory manager is used to obtain perpetually unique IDs. A unique algorithm could be used instead of the memory manager, though the memo needs to contain a reference count more or less regardless. There can at most be one memo per iterator. The approach is outlined in pseudocode here [2]. Implementation could be optimized slightly by only creating new memos if iterators have been opened, shown here [3].
[2] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%20markup.html
[3] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%202%20markup.html |
|
Date |
User |
Action |
Args |
2014-07-26 15:52:21 | castironpi | set | recipients:
+ castironpi |
2014-07-26 15:52:21 | castironpi | set | messageid: <1406389941.41.0.168391014696.issue22084@psf.upfronthosting.co.za> |
2014-07-26 15:52:21 | castironpi | link | issue22084 messages |
2014-07-26 15:52:20 | castironpi | create | |
|