
classification
Title: Missed a key when iterating over dictionary
Type: resource usage
Stage: resolved
Components:
Versions: Python 3.7
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: Saps, ammar2, tim.peters
Priority: normal
Keywords:

Created on 2018-10-19 01:58 by Saps, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name: Dictissue.py
Uploaded: Saps, 2018-10-19 01:58
Description: Contains dictionary and function
Messages (11)
msg328016 - Author: Sapan (Saps) Date: 2018-10-19 01:58
The issue occurs at the second level of a nested dictionary. I am iterating over the nested dictionary and editing each key by popping the old key and inserting the new one. The next iteration, at the second level of the nested dictionary, then skips the second key and continues from the third key. In debug mode I found that, on editing the first key, the newly allocated memory lies at an address between the addresses of the second and third keys.

Let me know if some other information is required. I am attaching the python file where I successfully reproduced the issue.
msg328017 - Author: Ammar Askar (ammar2) (Python committer) Date: 2018-10-19 02:53
Modifying containers while iterating over them is generally not safe. In this case, the iterator at the point you start the loop will contain all the items to iterate over; adding items mid-loop will not cause them to be iterated over.

Take a look at the last section here for suggestions: https://docs.python.org/3/tutorial/datastructures.html#looping-techniques
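
A minimal sketch of the copy-first approach from that tutorial section, assuming the goal is to replace periods in the keys of a nested dict. The name rename_keys_in_place and the underscore replacement are illustrative assumptions, not taken from the attached Dictissue.py. The loop runs over list(d), a copy of the keys made before any mutation, so popping and inserting inside the loop does not touch the sequence being iterated:

```python
def rename_keys_in_place(d, old=".", new="_"):
    """Rename keys containing `old` in a nested dict, mutating it in place."""
    # list(d) copies the keys up front, so the loop below never iterates
    # over the dict itself while the dict is being changed.
    for key in list(d):
        value = d[key]
        if isinstance(value, dict):
            # Handle the nested level first, with the same snapshot technique.
            rename_keys_in_place(value, old, new)
        if isinstance(key, str) and old in key:
            d[key.replace(old, new)] = d.pop(key)
    return d


if __name__ == "__main__":
    data = {"a.b": {"c.d": 1, "e": 2}, "f": 3}
    print(rename_keys_in_place(data))
```

Note that renamed keys move to the end of the dict, a renamed key silently overwrites an existing key with the same name, and dicts nested inside lists are not visited by this sketch.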
msg328019 - Author: Sapan (Saps) Date: 2018-10-19 03:03
It makes sense that it won't re-iterate, but this scenario is totally legit and should be handled in a better way. Iterating through items does not really help here because that gives me tuples, and they are immutable.
msg328021 - Author: Sapan (Saps) Date: 2018-10-19 03:13
As for creating a new list altogether, that will require quite a lot of work, as this particular JSON is huge and many keys contain periods. Plus there are multiple JSONs.
I still believe there should be a better way to handle this.
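
One possible way to avoid a separate rewrite pass, assuming the data arrives as JSON text: the standard json module accepts an object_pairs_hook callable, which it calls with the list of (key, value) pairs of every decoded object. The helper names replace_periods and load_without_periods below, and the choice of underscore as the replacement, are hypothetical. Because each dict is built fresh from its pairs, nothing is mutated while being iterated:

```python
import json


def replace_periods(pairs):
    """Build a dict from decoded (key, value) pairs, renaming keys on the way."""
    return {key.replace(".", "_"): value for key, value in pairs}


def load_without_periods(text):
    # object_pairs_hook is invoked for every JSON object, including nested
    # ones and objects inside arrays, so all levels are covered.
    return json.loads(text, object_pairs_hook=replace_periods)


if __name__ == "__main__":
    print(load_without_periods('{"a.b": {"c.d": [{"e.f": 1}]}}'))
```

As with any renaming scheme, two keys that become identical after replacement collapse into one entry.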
msg328022 - Author: Tim Peters (tim.peters) (Python committer) Date: 2018-10-19 03:14
This won't be changed.  The effect on the iteration order of adding and/or removing keys from a dict while iterating over it has never been defined in Python, and never will be.  The Python 3 docs for dict views say this clearly:

"""
Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
"""

You're experiencing the second symptom ("fail to iterate over all entries").  It's expected.  If you can't do the time, don't do the crime ;-)
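
For contrast, a small sketch of the first symptom from the quoted docs (the RuntimeError). The function name grow_while_iterating is made up for illustration. It only adds keys during the loop, so the dict's size changes and CPython reports the problem on the next iteration step:

```python
def grow_while_iterating(d):
    """Add a key on every step of the loop; return the error message, if any."""
    try:
        for key in d:
            # Only adding keys: the dict grows, so its size changes.
            d[key + "_copy"] = d[key]
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(grow_while_iterating({"a": 1, "b": 2}))
```

The second symptom (silently missing entries) is deliberately not demonstrated here, since its exact behavior is undefined and may differ between Python versions.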
msg328023 - Author: Sapan (Saps) Date: 2018-10-19 03:25
Fair enough. I would like to know the reason, though: why is this runtime error acceptable?
msg328024 - Author: Sapan (Saps) Date: 2018-10-19 03:25
Thanks for the response btw
msg328025 - Author: Ammar Askar (ammar2) (Python committer) Date: 2018-10-19 03:31
Think about what it means to iterate over a hashmap. Let's say your pop() causes the dictionary to become smaller than the resizing threshold, and now the indexes need to be rebuilt. How would the iterator handle this gracefully?

This situation is not unique to Python dictionaries. Google around for "delete while iterating" and you'll find that most languages prohibit you from mutating containers while iterating over them.
msg328026 - Author: Tim Peters (tim.peters) (Python committer) Date: 2018-10-19 03:36
Questions about Python should be asked, e.g., on the Python mailing list.

The short course is that it's desired that iterating over dicts be as fast as possible, and nobody knows a way to make iteration robust in the face of mutations that wouldn't be significantly slower.  The dict implementation is quite involved, and a single mutation _can_ cause major restructuring of the internal layout.

As is, the dict implementation checks to see whether the dict _size_ has changed across iterations, and raises a RuntimeError if it has.  That's cheap.  When you both delete and add a key between iterations, the dict size doesn't change overall, so that runtime check doesn't trigger.
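
A toy model of that size check, written in Python purely for illustration. This is not CPython's actual implementation: the class name SizeCheckedKeys is invented, and the real iterator walks the dict's internal table rather than a copied key list. The point it shows is only that a check based on len() cannot notice a pop followed by an insert:

```python
class SizeCheckedKeys:
    """Toy iterator that compares the dict's size on every step."""

    def __init__(self, d):
        self._d = d
        self._expected_len = len(d)
        # Stand-in for the internal table walk; a real dict iterator
        # does not copy the keys like this.
        self._keys = iter(list(d))

    def __iter__(self):
        return self

    def __next__(self):
        # The cheap check: has the number of entries changed?
        if len(self._d) != self._expected_len:
            raise RuntimeError("dictionary changed size during iteration")
        return next(self._keys)


def demo():
    d = {"a": 1, "b": 2, "c": 3}
    it = SizeCheckedKeys(d)
    first = next(it)
    d["x"] = d.pop(first)   # size goes 3 -> 2 -> 3: invisible to the check
    second = next(it)       # no error raised
    d["y"] = 0              # size is now 4: the check will fire
    try:
        next(it)
    except RuntimeError:
        caught = True
    else:
        caught = False
    return first, second, caught


if __name__ == "__main__":
    print(demo())
```

Because the toy iterates over a copied key list, it cannot reproduce the skipped key from the report; it only illustrates why the size comparison stays silent in that case.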
msg328027 - Author: Sapan (Saps) Date: 2018-10-19 03:50
Aha, I did have that in mind while writing the code. I guess a warning could have helped. Can that be introduced?
msg328028 - Author: Tim Peters (tim.peters) (Python committer) Date: 2018-10-19 03:53
Not without more expense, which is why it hasn't been done. But this is the wrong place to talk about it. If you want Python to change, go to the python-ideas mailing list:

https://mail.python.org/mailman/listinfo/python-ideas
History
Date                 User        Action  Args
2022-04-11 14:59:07  admin       set     github: 79204
2018-10-19 03:53:02  tim.peters  set     messages: + msg328028
2018-10-19 03:50:47  Saps        set     messages: + msg328027
2018-10-19 03:36:49  tim.peters  set     messages: + msg328026
2018-10-19 03:31:26  ammar2      set     messages: + msg328025
2018-10-19 03:25:44  Saps        set     status: pending -> closed; messages: + msg328024
2018-10-19 03:25:10  Saps        set     status: closed -> pending; messages: + msg328023
2018-10-19 03:14:07  tim.peters  set     status: pending -> closed; messages: + msg328022; nosy: + tim.peters
2018-10-19 03:13:02  Saps        set     status: closed -> pending; messages: + msg328021
2018-10-19 03:03:02  Saps        set     messages: + msg328019
2018-10-19 02:53:01  ammar2      set     status: open -> closed; nosy: + ammar2; messages: + msg328017; resolution: not a bug; stage: resolved
2018-10-19 01:58:12  Saps        create