OrderedDict has strange behaviour when dict.__setitem__ is used. #68914
Comments
Setting an item in an ordered dict via dict.__setitem__, or by using it as an object dictionary and setting an attribute on that object, creates a dictionary whose repr is: OrderedDict([<NULL>]) Test case attached. |
Linking related issues http://bugs.python.org/issue24721 http://bugs.python.org/issue24667 and http://bugs.python.org/issue24685 |
Attached revised file that runs to completion on 2.7 and 3.x. |
There is a bug in _PyObject_GenericSetAttrWithDict() in Objects/object.c, where calls are made to PyDict_SetItem() and PyDict_DelItem() without first checking PyDict_CheckExact().
With pure Python code for the subclass, we say, "don't do that". I'll add a note to that effect in the docs for the OD (that said, it is a general rule that applies to all subclasses that have to stay synchronized with state in the parent). In the C version of the OD subclass, we still can't avoid being bypassed (see http://bugs.python.org/issue10977) and having our subclass invariants violated. Though the C code can't prevent the invariants from being scrambled, it does have an obligation to not segfault and to not leak something like "OrderedDict([<NULL>])". Ideally, if it is possible to detect an invalid state (i.e. the linked list being out of sync with the inherited dict), then a RuntimeError or somesuch should be raised. |
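The general rule about subclasses being bypassed can be illustrated with a minimal sketch. TrackingDict below is a hypothetical dict subclass, not the real OrderedDict implementation; it keeps a side list of keys, and a direct call to dict.__setitem__ skips the override so the side list falls out of sync with the inherited dict.

```python
class TrackingDict(dict):
    """Hypothetical dict subclass that records insertion order in a side list."""

    def __init__(self):
        super().__init__()
        self.order = []  # must stay in sync with the inherited dict

    def __setitem__(self, key, value):
        # Record the key the first time it is seen, then store it in the dict.
        if key not in self:
            self.order.append(key)
        super().__setitem__(key, value)


d = TrackingDict()
d["a"] = 1                    # goes through the override
dict.__setitem__(d, "b", 2)   # bypasses the override entirely

print(len(d))     # 2
print(d.order)    # ['a']
```

The dict itself holds both keys, but the subclass never learned about "b". Any method that trusts the side list (iteration, repr) now sees a different container than the one len() reports.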
FTR, this will likely involve more than just fixing odict_repr(). |
__repr__() allocates a list of size len(od) and fills it by iterating the linked list. If the linked list is shorter than the dict, the rest of the list is left uninitialized. Even worse things happen when the linked list is longer than the dict. The following example causes a crash:
from collections import OrderedDict
od = OrderedDict()
class K(str):
    def __hash__(self):
        return 1
od[K('a')] = 1
Proposed patch fixes both issues. |
Ping. |
Review posted. Aside from a couple minor comments, LGTM. Thanks for doing this. Incidentally, it should be possible to auto-detect independent changes to the underlying dict and sync the odict with those changes. However, doing so likely isn't worth it. |
New changeset 88d97cd99d16 by Serhiy Storchaka in branch '3.5': New changeset 965109e81ffa by Serhiy Storchaka in branch 'default': |
Thanks for your review, Eric. test_delitem_2 was not added because it fails in the just-added TestCase for the COrderedDict subclass. Added tests for direct calls of other dict methods, as Eric suggested. While writing the new tests for direct calls of other dict methods I found one more bug. The following code makes Python hang and eat memory:
from collections import OrderedDict
od = OrderedDict()
for i in range(10):
    od[str(i)] = i
for i in range(9):
    dict.__delitem__(od, str(i))
list(od) |
New changeset 1594c23d8c2f by Serhiy Storchaka in branch '3.5': New changeset b391e97ccfe5 by Serhiy Storchaka in branch 'default': |
Wrong issue. The correct one is bpo-25410. |
Here is a patch that fixes the infinite loop reported in msg254071. Maybe this is not the best solution. It makes the behavior of the Python and C implementations differ (the former just iterates the linked list, the latter raises an error). But to reproduce the Python implementation's behavior we would need to add refcounters to the linked list nodes. |
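As a rough illustration of the "raise an error" approach, here is a minimal pure-Python sketch. GuardedDict and its _order list are hypothetical names, and the size comparison is an assumed stand-in for whatever check the actual C patch performs; it only shows the idea of refusing to iterate once the ordering state and the inherited dict disagree.

```python
class GuardedDict(dict):
    """Hypothetical dict subclass whose iterator checks its own invariants."""

    def __init__(self):
        super().__init__()
        self._order = []  # keys in insertion order

    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        super().__setitem__(key, value)

    def __iter__(self):
        # Refuse to iterate if the side list and the dict disagree in size.
        if len(self._order) != len(self):
            raise RuntimeError("ordering state out of sync with dict")
        return iter(list(self._order))


od = GuardedDict()
for i in range(10):
    od[str(i)] = i
for i in range(9):
    dict.__delitem__(od, str(i))  # bypasses the subclass, _order is untouched

try:
    list(od)
    outcome = "iterated"
except RuntimeError as exc:
    outcome = "error: " + str(exc)

print(outcome)  # error: ordering state out of sync with dict
```

A size comparison is cheap but only catches mismatches that change the element count; a direct delete followed by a direct insert of a different key would slip past it.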
There may still be some holes remaining in OrderedDict, but they don't seem to have been relevant in practice and will become even less so now that regular dicts are ordered and compact. If an issue does arise with someone setting OrderedDict values via dict.__setitem__, we should probably just document "don't do that" rather than performing brain surgery on the current implementation, which was known in advance to be vulnerable to exactly this sort of trickery. If there are no objections, I recommend closing this as out-of-date. IMO this would be better than risking introducing new problems, getting the C version further out of sync with the Python version, or altering how existing code works. |
See also: https://bugs.python.org/msg131551 |