classification
Title: Remove usage of UserDict from os.py
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 2.6
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: arigo, belopolsky, christian.heimes, dalke, loewis, rhettinger, tds333
Priority: low Keywords: easy, patch

Created on 2005-11-27 20:18 by tds333, last changed 2008-02-24 04:22 by loewis. This issue is now closed.

Files
File name Uploaded Description Edit
os.diff tds333, 2005-11-27 20:18 patch for os.py uses dict instead of UserDict
Messages (15)
msg49130 - (view) Author: Wolfgang Langner (tds333) * Date: 2005-11-27 20:18
This patch removes UserDict usage from os.py.
It uses the new dict base class instead of UserDict.

Patch was generated against Revision 41544
of python subersion repository.
msg49131 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2005-12-29 18:02
Logged In: YES 
user_id=4771

Subclassing 'dict' to modify some behavior only works more
or less in CPython.  There are a lot of (admitedly
convoluted) ways to get unexpected effects, where the
original methods are called instead of the overridden ones.
And it's not future-proof: if new methods are added to 'dict'
in the future, say a merge() similar to update() but not
replacing existing keys, then they will need to be added to
that subclass as well, or the method will misbehave.  The
advantage of UserDict is to guard against all these problems.

For example, with the patch:

        exec "global a; a=5" in os.environ

stores the key 'a' directly in os.environ, bypassing
the custom __setitem__().  With UserDict instead, we
get an explicit error.  This is more a joke, but the
new-methods-appearing-later problem is more serious
IMHO.
msg49132 - (view) Author: Wolfgang Langner (tds333) * Date: 2006-01-16 20:13
Logged In: YES 
user_id=600792

The intend was to remove dependency on UserDict. Less 
imports for a minimal python. If there are general problems 
with the inheritence from dict than they should be 
documented. I thought in future UserDict will gone.
I investigate this problems.
msg49133 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2006-11-19 23:01
I believe the patch is wrong: it does not support setdefault() (which the current code does).

It would be good if there were test cases for os.environ that made sure all dictionary methods had the correct effect on the environment.
msg59804 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-01-12 08:02
I don't think the point of this patch is worth doing.  The existing 
code is not buggy and runs fine.  Subclassing from dict increases the 
likelihood that an error will be introduced either now or in the future 
as the dict object evolves.  Recommend this be closed.
msg59814 - (view) Author: Andrew Dalke (dalke) * (Python committer) Date: 2008-01-12 11:30
I was optimization tuning and wondered why UserDict was imported by os.  Replacing 
UserDict with dict passes all existing regression tests.

I see the concerns that doing that replacement is not future proof.  Strange then 
that Cookie.py is acceptable.  There are three places in Lib which derive from dict, 
and two are in Cookie.py and in both cases it's broken because set_default does not 
go through the same checks that __setitem__ goes through.

(The other place is an internal class in _strptime.)

In looking over existing third-party code, I see this nuance of when to use UserDict 
vs. dict isn't that well known.  The documentation says "The need for this class has 
been largely supplanted by the ability to subclass directly from dict", but that 
isn't true if anyone is worried about future-proofing and where the subclass changes 
one of the standard methods.
msg59815 - (view) Author: Andrew Dalke (dalke) * (Python committer) Date: 2008-01-12 12:27
I should have added my preference.  I would like to see UserDict replaced with 
dict.  I didn't like seeing the extra import when I was doing my performance 
testing, through truthfully it's not a bit overhead.

As for future-proofing, of course when there's a change in a base class then 
there can be problems with derived classes.  When that happens, change all of 
the affected classes in the code base, and make sure to publish the change so 
third parties know about it.

Yes, there's a subtlety here that most people don't know about.  But it's not 
going to go away.

As for the evil that is 'exec':

  exec "locals().data['MACHTYPE']=1; print MACHTYPE" in {}, os.environ

gives me another way to mess things up.

A point of unit tests is to allow changes like this without worry about code 
breakage.  And it's not like other non-buggy code wasn't updated over time to 
reflect changing style and best practices.

If it's not compatible with Jython or IronPython or PyPy then ignore what I 
said, but fix Cookie and update the docs to make that clear as people do think 
that it's better to derived from dict for things like this than to derive from 
UserDict or UserDictMixin.

I can give a lightning talk about this at PyCon.  :)
msg59822 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-01-12 16:26
> it's better to derived from dict for things like
> this than to derive from UserDict or UserDictMixin

That's true for UserDict but not DictMixin.  IIRC, 
the latter will continue to exist in Py3.0 and it
has a purpose that is much different than UserDict
or dict, namely to build-out objects with a mapping
interface but are *not* dicts under the hood.
msg59853 - (view) Author: Andrew Dalke (dalke) * (Python committer) Date: 2008-01-13 13:08
Ahh, so the bug here that the environ dict should use neither UserDict nor 
dict, it should implement the core {get,set,del}item and keys and use 
DictMixin.

Martin mentioned that the patch doesn't support setdefault.  He didn't note 
though that the current code also doesn't support the dictionary interface 
consistently.  This shows a problem with popitem.

>>> import os
>>> os.environ["USER"]
'dalke'
>>> os.environ["USER"] = "nobody"
>>> os.system("echo $USER")
nobody
0
>>> del os.environ["USER"]
>>> os.system("echo $USER")

0
>>> os.environ["USER"] = "dalke"
>>> while os.environ: print os.environ.popitem()
... 
('GROUP', 'staff')
('XDG_DATA_HOME', '/Users/dalke/.local/share')
('TERM_PROGRAM_VERSION', '133')
('CVS_RSH', 'ssh')
('LOGNAME', 'dalke')
('USER', 'dalke')
    ... removed for conciseness ...
('QTDIR', '/usr/local/qt')
>>> os.system("echo $USER")
dalke
0
>>> 

Not enough people know about DictMixin.
msg62841 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-02-23 23:22
I would say dict's failure to call overloaded __setitem__ from 
setdefault is a dict implementation problem which should be either 
fixed, or clearly documented as warning to anyone who wants to derive 
from dict.

A fix would be trivial:

===================================================================
--- Objects/dictobject.c        (revision 61014)
+++ Objects/dictobject.c        (working copy)
@@ -1861,7 +1861,7 @@
        val = ep->me_value;
        if (val == NULL) {
                val = failobj;
-               if (PyDict_SetItem((PyObject*)mp, key, failobj))
+               if (PyObject_SetItem(mp, key, failobj))
                        val = NULL;
        }
        Py_XINCREF(val);

but I have no clue what performance or other implications would be.

Maybe something like this could be considered for Py3k - reviewing dict 
implementation to make it usable as DictMixin.  I'll write a complete 
patch if there is positive reaction.
msg62850 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-02-24 00:07
It's intentional that dict doesn't dynamically bind any calls; to
properly replace the semantics of dict, you have to implement *all* API.
There might be lack of documentation of this aspect, however, that is a
separate issue from the one being discussed here.
msg62862 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-02-24 01:38
Recommend closing this one.  There is no point in going on the warpath 
against UserDict.  There are downsides to changing it and almost no 
advantage (saving an import, that's it).  The primary motivation for 
this bug report is the OP's distaste for UserDict.  There doesn't seem 
to be any actual bug or measured performance issue.
msg62867 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2008-02-24 02:14
See comments at issue2144.  Benjamin Peterson demonstrated a more than 2x 
speedup on a micro-benchmark.  Plus, the fact that two people were 
motivated enough to independently produce a near-complete patch is worth 
something.
msg62868 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-02-24 02:34
I think some variant of these patches was rejected before and part of 
these reason had to do with subtle semantic changes when switching from 
an old-style class (inheriting from UserDict) to a new-style class 
(inheriting from dict).  There were also some concerns that it could 
break code currently relying on isinstance(x, UserDict).

IMO, this just isn't work it.  The os.environ use cases are not 
typically the performance critical part of a script.  This is a false 
optimization.  The primary motivation seems to be that the OP doesn't 
like UserDict.

Looking at the patch, I find the existing code to be more self-
evidently correct and maintainable than the proposed new code.  Though 
this is probably a matter of taste.
msg62874 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-02-24 04:22
I'm rejecting this patch, for the same reasons as issue2144. If further
discussion is needed, please discuss on python-dev.
History
Date User Action Args
2008-02-24 04:22:37loewissetstatus: open -> closed
resolution: rejected
messages: + msg62874
2008-02-24 02:34:01rhettingersetmessages: + msg62868
2008-02-24 02:14:55belopolskysetmessages: + msg62867
2008-02-24 01:38:58rhettingersetmessages: + msg62862
2008-02-24 00:07:28loewissetmessages: + msg62850
2008-02-23 23:22:36belopolskysetnosy: + belopolsky
messages: + msg62841
2008-01-13 13:08:47dalkesetmessages: + msg59853
2008-01-12 16:26:31rhettingersetmessages: + msg59822
2008-01-12 12:27:45dalkesetmessages: + msg59815
2008-01-12 11:30:50dalkesetnosy: + dalke
messages: + msg59814
2008-01-12 08:02:20rhettingersetnosy: + rhettinger
messages: + msg59804
2008-01-12 05:27:24christian.heimessetkeywords: + easy
assignee: christian.heimes ->
2008-01-12 05:27:13christian.heimessetassignee: christian.heimes
versions: + Python 2.6, - Python 2.5
type: enhancement
nosy: + christian.heimes
2005-11-27 20:18:34tds333create