This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author castironpi
Recipients castironpi
Date 2014-07-26.15:52:20
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1406389941.41.0.168391014696.issue22084@psf.upfronthosting.co.za>
In-reply-to
Content
Hi, I asked about the inconsistency of the "RuntimeError" being raised when mutating a container while iterating over it here [1], "set and dict iteration" on Aug 16, 2012.

[1] http://www.gossamer-threads.com/lists/python/python/1004659

I posted a patch on the ML but never submitted it.  People's reaction seemed ambivalent.  Now I have an idea for a different implementation.  I'd like to take another shot at it.

It's one of the worst silent errors, since there's an error in the *iterator* when we call a *set* method.  We're going to add something to make it safer, at least in the sense of getting a clear failure, if the programmer does something that's always been ill-advised.

We have a number of options for the implementation.  We still have the option to introduce "IterationError", possibly a subclass of "RuntimeError".  These options are still applicable:

1) Collection of iterators
. Invalidate all "open" iterators on mutating
. a) Linked list
.. i) Uncounted references
.. ii) Counted references
.. iii) Weak references
. b) Weak set
2) Version index / timestamp / "memo"
. Iterators check whether the container has been mutated
    since they were created
. a) No overflow - Python longs
.. i) Reset index if no iterators left
. b) Overflow - C ints / shorts (silent error)
3) Iterator count
. Raise exception on mutation, not iteration

The new option is:

2d) Use a dedicated empty *object* for a timestamp or "memo".  A new memo is created on every mutation.  Before advancing, the iterator checks whether the current memo is a different object than it was when it was created.

Costs: The existing silent error is fairly rare.  The container gains a pointer to its current memo.  The iterator loses the cached length but gains a pointer to a memo.  The memos are blank objects: a "Py ssize t" and a pointer with certain flags at time of writing.  Speed is the same: comparing the lengths is replaced with comparing the memos.

Some caveats: The memory manager is used to obtain perpetually unique IDs.  A unique algorithm could be used instead of the memory manager, though the memo needs to contain a reference count more or less regardless.  There can at most be one memo per iterator.  The approach is outlined in pseudocode here [2]. Implementation could be optimized slightly by only creating new memos if iterators have been opened, shown here [3].

[2] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%20markup.html

[3] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%202%20markup.html
History
Date User Action Args
2014-07-26 15:52:21castironpisetrecipients: + castironpi
2014-07-26 15:52:21castironpisetmessageid: <1406389941.41.0.168391014696.issue22084@psf.upfronthosting.co.za>
2014-07-26 15:52:21castironpilinkissue22084 messages
2014-07-26 15:52:20castironpicreate