Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guard against changing dict during iteration #63531

Closed
serhiy-storchaka opened this issue Oct 21, 2013 · 9 comments
Closed

Guard against changing dict during iteration #63531

serhiy-storchaka opened this issue Oct 21, 2013 · 9 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Copy link
Member

BPO 19332
Nosy @tim-one, @rhettinger, @pitrou, @ethanfurman, @serhiy-storchaka
Files
  • dict_mutating_iteration.patch
  • dict_mutating_iteration_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/rhettinger'
    closed_at = <Date 2013-10-28.06:15:54.523>
    created_at = <Date 2013-10-21.14:02:54.864>
    labels = ['interpreter-core', 'type-feature']
    title = 'Guard against changing dict during iteration'
    updated_at = <Date 2016-01-10.23:43:55.005>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-01-10.23:43:55.005>
    actor = 'python-dev'
    assignee = 'rhettinger'
    closed = True
    closed_date = <Date 2013-10-28.06:15:54.523>
    closer = 'rhettinger'
    components = ['Interpreter Core']
    creation = <Date 2013-10-21.14:02:54.864>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['32279', '32319']
    hgrepos = []
    issue_num = 19332
    keywords = ['patch']
    message_count = 9.0
    messages = ['200784', '200995', '201062', '201065', '201156', '201780', '202262', '202287', '257946']
    nosy_count = 6.0
    nosy_names = ['tim.peters', 'rhettinger', 'pitrou', 'ethan.furman', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'patch review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue19332'
    versions = ['Python 3.4']

    @serhiy-storchaka
    Copy link
    Member Author

    Currently dict iterating is guarded against changing dict's size. However when dict changed during iteration so that it's size left unchanged, this modification left unnoticed.

    >>> d = dict.fromkeys('abcd')
    >>> for i in d:
    ...     print(i)
    ...     d[i + 'x'] = None
    ...     del d[i]
    ... 
    d
    a
    dx
    dxx
    ax
    c
    b

    In general iterating over mutating dict considered logical error. It is good detect it as early as possible.

    The proposed patch introduces a counter which changed every time when added or removed key. If an iterator detects that this counter is changed, it raises runtime error.

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Oct 21, 2013
    @rhettinger
    Copy link
    Contributor

    The decision to not monitor adding or removing keys was intentional. It is just not worth the cost in either time or space.

    @rhettinger rhettinger self-assigned this Oct 23, 2013
    @serhiy-storchaka
    Copy link
    Member Author

    In the first patch the counter was placed in the _dictkeysobject structure. In the second place it is placed in the PyDictObject so it now has no memory cost. Access time to new counter for non-modifying operations is same as in current code. The only additional cost is time cost for modifying operations. But modifying operations is usually much rare than non-modifying operations, and the incrementing one field takes only small part of the time needed for all operation. I don't think this will affect total performance of real programs.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 23, 2013

    If there's no performance regression, then this sounds like a reasonable idea. The remaining question would be whether it can break existing code. Perhaps you should ask python-dev?

    @rhettinger
    Copy link
    Contributor

    I disagree with adding such unimportant code to the critical path.

    @ethanfurman
    Copy link
    Member

    Raymond, please don't be so concise.

    Is the code unimportant because the scenario is so rare, or something else?

    @stevendaprano
    Copy link
    Member

    Duplicate of this: http://bugs.python.org/issue6017

    @rhettinger
    Copy link
    Contributor

    A few thoughts:

    • No existing, working code will benefit from this patch; however, almost all code will pay a price for it -- bigger size for an empty dict and a runtime cost (possibly very small) on the critical path (every time a value is stored in a dict).

    • The sole benefit of the patch is provide an earlier warning that someone is doing something weird. For most people, this will never come up (we have 23 years of Python history indicating that there isn't a real problem to that needs to be solved).

    • The normal rule (not just for Python) is that a data structures have undefined behavior for mutating while iterating, unless there is a specific guarantee (for example, we guarantee that the dicts are allowed to mutate values but not keys during iteration and we guarantee the behavior of list iteration while iterating).

    • It is not clear that other implementations such as IronPython and Jython would be able to implement this behavior (Jython wraps the Java ConcurrentHashMap).

    • The current patch second guesses a decision that was made long ago to only detect size changes (because it is cheap, doesn't take extra memory, isn't on the critical path, and handles the common case).

    • The only case whether we truly need a stronger protection is when it is needed to defend against a segfault. That is why collections.deque() implement a change counter. It has a measureable cost that slows down deque operations (increasing the number of memory accesses per append, pop, or next) but it is needed to prevent the iterator from spilling into freed memory.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 10, 2016

    New changeset a576199a5350 by Victor Stinner in branch 'default':
    PEP-509
    https://hg.python.org/peps/rev/a576199a5350

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants