classification
Title: __dict__ = self in subclass of dict causes a memory leak
Type: resource usage
Stage: resolved
Components: Interpreter Core
Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Boris.FELD, arigo, dobesv, eric.snow, flox, haypo, hitchmanr, hniksic, nnorwitz, python-dev
Priority: high
Keywords:

Created on 2006-04-13 05:04 by dobesv, last changed 2012-03-09 08:55 by eric.snow. This issue is now closed.

Files
File name Uploaded Description
subtypecleardict.diff arigo, 2006-04-13 09:58 Patch (for 2.5, probably applies to 2.4 as well)
test4.py arigo, 2006-04-13 09:59 Another leak.
test_dictself.py hitchmanr, 2010-03-05 06:18 unit test to check whether "self.__dict__ = self" leaks
Messages (9)
msg28217 - Author: Dobes V (dobesv) Date: 2006-04-13 05:04
Using:

ActivePython 2.4.2 Build 10 (ActiveState Corp.) based on
Python 2.4.2 (#67, Jan 17 2006, 15:36:03) [MSC v.1310 32 bit (Intel)] on win32

For reasons I do not understand, the following class 
leaks itself continuously:

class AttrDict(dict):
   def __init__(self, *args, **kw):
      dict.__init__(self, *args, **kw)
      self.__dict__ = self

Whereas this version does not:

class AttrDict(dict):
   def __init__(self, *args, **kw):
      dict.__init__(self, *args, **kw)
   
   def __getattr__(self, key):
      return self[key]
   
   def __setattr__(self, key, value):
      self[key] = value

My test looks like this:

import gc

for n in xrange(1000000):
   gc.collect()          # force a full collection on every iteration
   ad = AttrDict()
   ad['x'] = n
   ad.y = ad.x           # attribute and item access share the same storage
   print n, ad.x, ad.y

And I sit and watch in the Windows Task Manager while
the process grows and grows.  With the __getattr__
version, it doesn't grow.


msg28218 - Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-13 09:58
This is caused by tp_clear not doing its job.
In this case, tp_clear is subtype_clear(), which does
not reset the __dict__ slot to NULL because it assumes
that the __dict__ slot's content will itself be cleared.
That is true, but it doesn't help if self.__dict__
is self.
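
To make the cycle concrete, here is a small sketch in Python 3 syntax (the original report used Python 2.4). The variable name ad is only illustrative. After the assignment in __init__, the instance's __dict__ slot refers back to the instance itself, so emptying the dictionary's contents never drops the reference held by the slot.

```python
class AttrDict(dict):
    def __init__(self, *args, **kw):
        dict.__init__(self, *args, **kw)
        # The __dict__ slot now points back at the instance itself.
        self.__dict__ = self

ad = AttrDict(x=1)
ad.y = 2  # attribute assignment stores into the dict

print(ad.__dict__ is ad)  # the slot and the instance are the same object
print(ad["y"], ad.x)      # items and attributes share one storage
```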

Attached a patch to fix this.  It's kind of hard to
test for this bug because all instances of AttrDict
are really cleared, weakrefs to them are removed,
etc.
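
One way to sidestep the weakref problem is to ask the collector which objects it still tracks. The following is a sketch in Python 3 syntax, not the attached patch or test; the helper names churn and live_instances are made up for illustration. On an interpreter that includes the fix it should report zero surviving instances. That an unfixed interpreter would still list the leaked instances in gc.get_objects() is an assumption here.

```python
import gc

class AttrDict(dict):
    def __init__(self, *args, **kw):
        dict.__init__(self, *args, **kw)
        self.__dict__ = self

def churn(n):
    """Create and drop n self-referential instances."""
    for i in range(n):
        ad = AttrDict()
        ad["x"] = i
        ad.y = ad.x

def live_instances(cls):
    """Count instances of cls still tracked after a full collection."""
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)

churn(1000)
print(live_instances(AttrDict))
```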

Also attached is an example showing a similar bug: a
cycle through the ob_type field, with an object U
whose class is itself. It is harder to clear this
link because we cannot just set ob_type to NULL in
subtype_clear.  Ideas welcome...
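
The second cycle can be sketched as follows in Python 3 syntax. This only illustrates a class that is its own type and is not necessarily the content of test4.py; T is a helper metaclass introduced here so that U starts out as a heap type whose __class__ can be reassigned.

```python
class T(type):
    pass

class U(type, metaclass=T):
    pass

# Rebind U's type to U itself: the ob_type field now forms a one-object cycle.
U.__class__ = U

print(type(U) is U)
```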
msg28219 - Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-14 07:19
Armin, why not just check any leaking test cases into
Lib/test/leakers?  I have no idea if your patch is correct
or not.  I wouldn't be surprised if there are a bunch more
issues similar to this.  What if you stick self in
self.__dict__ (I'm guessing this is ok, but there are a
bunch of variations) or start playing with weakrefs?
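
The first variation mentioned here, storing the instance as a value inside its own __dict__, forms an ordinary reference cycle through the dictionary's contents. A sketch in Python 3 syntax, with the class name Holder and the key "me" invented for illustration, shows how a weakref can be used to check whether the collector reclaims it:

```python
import gc
import weakref

class Holder:
    pass

def make_and_drop():
    h = Holder()
    h.__dict__["me"] = h   # instance stored inside its own attribute dict
    return weakref.ref(h)  # h itself goes out of scope on return

wr = make_and_drop()
gc.collect()
print(wr() is None)
```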
msg100453 - Author: Ryan Hitchman (hitchmanr) Date: 2010-03-05 06:18
Attached test case demonstrates the leak.
msg114653 - Author: Mark Lawrence (BreamoreBoy) Date: 2010-08-22 09:22
I've reproduced this problem with 2.7, 3.1 and 3.2.
msg114783 - Author: Armin Rigo (arigo) * (Python committer) Date: 2010-08-24 12:55
Added the two tests in Lib/test/leakers as r45389 (in 2006) and r84296 (now).
msg155084 - Author: Hrvoje Nikšić (hniksic) Date: 2012-03-07 12:52
Could this patch please be committed to Python? We have just run into this problem in production, where our own variant of AttrDict was shown to be leaking.

It is possible to work around the problem by implementing explicit __getattr__ and __setattr__, but that is both slower and trickier to get right.
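
A minimal sketch of that workaround, in Python 3 syntax, might look like the following. It is not the production code referred to above. One of the details that is easy to miss is translating KeyError into AttributeError, since hasattr() and three-argument getattr() only swallow AttributeError.

```python
class AttrDict(dict):
    """dict whose items are also reachable as attributes, without __dict__ = self."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, key):
        try:
            del self[key]
        except KeyError:
            raise AttributeError(key) from None

ad = AttrDict(x=1)
ad.y = ad.x
print(ad, hasattr(ad, "missing"), getattr(ad, "missing", None))
```

Every failed attribute lookup goes through a Python-level method call here, which is the slowdown the message refers to.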
msg155141 - Author: Roundup Robot (python-dev) Date: 2012-03-08 00:53
New changeset 3787e896dbe9 by Benjamin Peterson in branch '3.2':
allow cycles throught the __dict__ slot to be cleared (closes #1469629)
http://hg.python.org/cpython/rev/3787e896dbe9
msg155144 - Author: Roundup Robot (python-dev) Date: 2012-03-08 01:15
New changeset c7623da4e2af by Benjamin Peterson in branch '2.7':
allow cycles throught the __dict__ slot to be cleared (closes #1469629)
http://hg.python.org/cpython/rev/c7623da4e2af
History
Date User Action Args
2012-03-09 08:55:08 eric.snow set nosy: + eric.snow
2012-03-08 01:15:42 python-dev set messages: + msg155144
2012-03-08 00:53:07 python-dev set status: open -> closed; nosy: + python-dev; messages: + msg155141; resolution: fixed; stage: patch review -> resolved
2012-03-07 23:02:44 pitrou set nosy: + haypo
2012-03-07 12:59:27 flox set nosy: + flox; title: __dict__ = self in subclass of dict causes a memory leak? -> __dict__ = self in subclass of dict causes a memory leak
2012-03-07 12:52:29 hniksic set nosy: + hniksic; messages: + msg155084
2011-11-03 09:59:00 Boris.FELD set nosy: + Boris.FELD
2011-06-12 18:47:50 terry.reedy set nosy: - BreamoreBoy; versions: + Python 3.3, - Python 3.1
2010-09-18 15:47:00 benjamin.peterson link issue1441 superseder
2010-08-24 12:55:07 arigo set messages: + msg114783
2010-08-22 09:22:08 BreamoreBoy set versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6; nosy: + BreamoreBoy; messages: + msg114653; type: behavior -> resource usage; stage: test needed -> patch review
2010-03-05 06:18:21 hitchmanr set files: + test_dictself.py; nosy: + hitchmanr; messages: + msg100453
2009-03-21 02:07:25 rhettinger set priority: normal -> high
2009-03-21 02:03:52 ajaksu2 set stage: test needed; type: behavior; components: + Interpreter Core, - None; versions: + Python 2.6, - Python 2.4
2006-04-13 05:04:48 dobesv create