This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Make it clear in the collections Python source code that OrderedDict may be overridden
Type: behavior Stage: needs patch
Components: Documentation Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: abarry, eric.snow, lig, r.david.murray, rhettinger, zach.ware
Priority: low Keywords:

Created on 2015-10-05 02:03 by lig, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (16)
msg252296 - (view) Author: Serge Matveenko (lig) * Date: 2015-10-05 02:03
Consider this code in Python 3.5.0:

Python 3.5.0 (default, Sep 26 2015, 14:59:25) 
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import OrderedDict
>>> class MyDict(OrderedDict):
...     def __setitem__(self, *args, **kwargs):
...         print(self._OrderedDict__root)
...         OrderedDict.__setitem__(self, *args, **kwargs)
... 
>>> md = MyDict({'a': 1})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __setitem__
AttributeError: 'MyDict' object has no attribute '_OrderedDict__root'


However in Python 3.4:

Python 3.4.2 (default, Jul  9 2015, 17:24:30) 
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import OrderedDict
>>> class MyDict(OrderedDict):
...     def __setitem__(self, *args, **kwargs):
...         print(self._OrderedDict__root)
...         OrderedDict.__setitem__(self, *args, **kwargs)
... 
>>> md = MyDict({'a': 1})
<collections._Link object at 0x7f0398a79b88>
msg252298 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2015-10-05 02:28
This is a side-effect of 3.5 having a C implementation of OrderedDict, and will not be fixed.

In the standard library, even just having a single trailing underscore is a clear indication of "if you use this, expect it to break without warning at any time."  Mangled names in the standard library are more of an indication of "you *really* shouldn't use this".
msg252310 - (view) Author: Serge Matveenko (lig) * Date: 2015-10-05 08:19
Zach, ok I got your point. But there is Python implementation in the library still which does not conform the one done in C. Such a behavior is tremendously confusing.

If there is both Python and C implementation in the library they must conform each other. And there is nothing with "you *really* shouldn't use this" when you need to especially when it is there. If it is there it must work as it written. Otherwise why Python does have this peace of code while it is completely useless as reference?
msg252311 - (view) Author: Serge Matveenko (lig) * Date: 2015-10-05 08:31
If Python source does conform the one in C, it will be completely fine and understandable to have such a change and to rely on it in using version checks or whatever.
msg252318 - (view) Author: Anilyka Barry (abarry) * (Python triager) Date: 2015-10-05 11:36
The same is true of the decimal and datetime modules. Names starting with an underscore are an implementation detail, and you shouldn't rely on these in production code.
msg252328 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-05 14:19
A method that starts with an '_' is not part of the API unless documented to be so (as with namedtuple), and a non-special method that starts with two is purposefully mangled so that you cannot accidentally rely on it.  The C implementation is an implementation of the API and the behavior when that API is used (as vetted by the tests that are run against both implementations); it is not otherwise required to "conform" to the python implementation or vice versa.
msg252352 - (view) Author: Serge Matveenko (lig) * Date: 2015-10-05 17:45
Ok, then the OrderedDict is useless for any advanced hacking like altering the order in which __setitem__ stores items.

It is just useless Python code and so much appliances missed for this class:(

Thank you all for these completely helpful answers.
msg252356 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-05 18:31
Yes, that would be correct.  OrderedDict is designed to keep things in insertion order, so supporting changing that order is not part of its mission.  Sorry about that, you'll have to write your own class.

Now, Raymond might find the idea of supporting other orders interesting.  If so I'm sure he'll comment on the issue eventually.
msg252466 - (view) Author: Serge Matveenko (lig) * Date: 2015-10-07 11:49
Sorry for reopening. I completely agree with the point that is it not necessary for Python and C implementations to duplicate each other.

But then the Python OrderedDict implementation should be dropped from the library source. The code that is there now is nothing more than confusing. For some one not aware of C implementation it is very hard to get from documentation that the code she sees via "Go to definition" feature of her IDE is not meant to be executing at all.
msg252473 - (view) Author: Anilyka Barry (abarry) * (Python triager) Date: 2015-10-07 13:31
The Python implementation will stay for two main reasons, one to provide a working implementation for all those who wish to use a modified version (you, for example, if you want to use a version that lets you alter order), and two for alternate implementations which don't have OrderedDict in the _collections module (a comment even hints at that in the source).

I fully agree on the confusing part, however. Maybe a comment could be added regarding this.
msg252474 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-07 15:05
This is a general policy: we have python modules for as many stdlib modules as we can (which is most of them), and for some stdlib modules we also have C accelerators.

I will agree that the fact that the documentation has a link to the source code of the collections module, which includes the OrderedDict python source, can be misleading if you do not understand the implication of the import at the end.  So adding a comment at the top of the OrderedDict source would, I think, be reasonable.
msg252493 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-10-08 00:27
In general, the __attributes and _attributes are already considered to be private, so I don't see the point.  Aside from this one user that is never been necessary and he already knows that it doesn't work. 

Re-reading the thread, the issue isn't that he doesn't like the design choice, not that he didn't understand it.

The collections module isn't the only place the implementation has been made private (see fractions.py) for example.   Maybe a general note in the docs about private APIs or a FAQ entry would be appropriate, but there isn't anything specific to OrderedDicts (especially now that there is a C implementation, it should go without saying).
msg252494 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-10-08 00:28
FWIW, the decision to use __ to make the implementation more private was done at Guido's recommendation.
msg252495 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-10-08 00:37
One other thought:  The purpose of the source links is to help people learn.  It in no way every implies that someone should hack private implementation details.   We have a number of modules that have both a pure python implementation and a replacement with a C accelerator (heapq, counter, bisect, decimal, etc) they are all done in the same way as the OrderedDict.

I believe this report should be re-closed as the OP has consistently adopted unreasonable positions as has been unwilling to accept the responses from the devs volunteering their time to explain the basics of private, non-guaranteed implementation details.
msg252496 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-10-08 01:06
What I'm suggesting it would be worth adding to the source code is a simple comment before the class definition for OrderedDict that says "This Python code may be overridden by an accelerated version of this class."  The idea being to prevent confusion if someone is exploring the source code for learning purposes but discovers it doesn't match what python actually does.

If you read all the way through the class and realize that the import after it does the override, you don't need the comment...but a learner exploring the source code is exactly the kind of person who might miss that, either by doing experiments before reading through the whole class or missing the implication of the import.
msg252497 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-10-08 01:18
> What I'm suggesting it would be worth adding to the source code is
> a simple comment before the class definition for OrderedDict that
> says  "This Python code may be overridden by an accelerated 
> version of this class." 

Sorry, I disagree.  We've never had to do that for any of the C accelerator code in Python's history.  Also, we already have a comment at the point the import is done (and that didn't seem to matter to the OP who is just mad about the design choice itself).  The comment is in the same place as it is in heapq.py for example.  If you want to add a general FAQ entry, that would be fine.  Also, keep in mind that that the code is tested in our test suite, and it is active in other versions of Python (it is not useless as the OP suggests).
History
Date User Action Args
2022-04-11 14:58:22adminsetgithub: 69502
2015-10-09 04:04:46rhettingersetstatus: open -> closed
resolution: not a bug
2015-10-08 01:18:25rhettingersetmessages: + msg252497
2015-10-08 01:06:11r.david.murraysetmessages: + msg252496
2015-10-08 00:37:51rhettingersetmessages: + msg252495
2015-10-08 00:28:58rhettingersetmessages: + msg252494
2015-10-08 00:27:50rhettingersetmessages: + msg252493
2015-10-07 15:07:42r.david.murraysettitle: Make it clear in the OrderedDict source that it may be overridden -> Make it clear in the collections Python source code that OrderedDict may be overridden
2015-10-07 15:06:40r.david.murraysettitle: OrderedDict mangled private attribute is inaccessible -> Make it clear in the OrderedDict source that it may be overridden
2015-10-07 15:05:48r.david.murraysetstage: resolved -> needs patch
messages: + msg252474
components: + Documentation, - Library (Lib)
versions: + Python 3.6
2015-10-07 13:31:56abarrysetmessages: + msg252473
2015-10-07 11:49:20ligsetstatus: closed -> open
resolution: not a bug -> (no value)
messages: + msg252466
2015-10-05 18:31:37r.david.murraysetmessages: + msg252356
2015-10-05 17:45:08ligsetmessages: + msg252352
2015-10-05 14:19:16r.david.murraysetstatus: open -> closed

nosy: + r.david.murray
messages: + msg252328

resolution: not a bug
2015-10-05 11:36:16abarrysetnosy: + abarry
messages: + msg252318
2015-10-05 09:40:15serhiy.storchakasetpriority: normal -> low
assignee: rhettinger

nosy: + rhettinger
2015-10-05 08:31:28ligsetmessages: + msg252311
2015-10-05 08:19:34ligsetstatus: closed -> open
resolution: wont fix -> (no value)
messages: + msg252310
2015-10-05 02:28:33zach.waresetstatus: open -> closed

nosy: + eric.snow, zach.ware
messages: + msg252298

resolution: wont fix
stage: resolved
2015-10-05 02:03:07ligcreate