Title: __dict__ = self in subclass of dict causes a memory leak
Type: resource usage Stage: resolved
Components: Interpreter Core Versions: Python 3.2, Python 3.3, Python 2.7
Status: closed Resolution: fixed
Assigned To: Nosy List: Boris.FELD, arigo, dobesv, eric.snow, flox, hitchmanr, hniksic, nnorwitz, python-dev, vstinner
Created on 2006-04-13 05:04 by dobesv, last changed 2022-04-11 14:56 by admin. This issue is now closed.

subtypecleardict.diff, arigo, 2006-04-13 09:58: Patch (for 2.5, probably applies to 2.4 as well)
arigo, 2006-04-13 09:59: Another leak.
hitchmanr, 2010-03-05 06:18: unit test to check whether "self.__dict__ = self" leaks
msg28217 - (view) Author: Dobes V (dobesv) Date: 2006-04-13 05:04

ActivePython 2.4.2 Build 10 (ActiveState Corp.) based 
Python 2.4.2 (#67, Jan 17 2006, 15:36:03) [MSC v.1310 
32 bit (Intel)] on win32

For reasons I do not understand, the following class 
leaks itself continuously:

class AttrDict(dict):
   def __init__(self, *args, **kw):
      dict.__init__(self, *args, **kw)
      self.__dict__ = self

Whereas this version does not:

class AttrDict(dict):
   def __init__(self, *args, **kw):
      dict.__init__(self, *args, **kw)
   def __getattr__(self, key):
      return self[key]
   def __setattr__(self, key, value):
      self[key] = value

My test looks like this:

for n in xrange(1000000):
   import gc
   ad = AttrDict()
   ad['x'] = n
   ad.y = ad.x
   print n, ad.x, ad.y

And I sit and watch in the windows task manager while 
the process grows and grows.  With the __getattr__ 
version, it doesn't grow.
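
A more direct check than watching Task Manager is to ask the garbage collector how many AttrDict instances are still alive after a collection. The following is only a sketch in Python 3 syntax; the helper names exercise and surviving_instances are made up for illustration and are not part of the original report.

```python
import gc


class AttrDict(dict):
    def __init__(self, *args, **kw):
        dict.__init__(self, *args, **kw)
        self.__dict__ = self  # the instance becomes its own __dict__


def exercise(n):
    """Hypothetical helper: one iteration of the loop from the report."""
    ad = AttrDict()
    ad['x'] = n
    ad.y = ad.x
    return ad.x, ad.y  # only plain ints escape; 'ad' goes out of scope here


def surviving_instances(iterations=1000):
    """Hypothetical helper: count AttrDict objects the GC still tracks."""
    for n in range(iterations):
        exercise(n)
    gc.collect()  # give the cycle collector a chance to free the self-cycles
    return sum(1 for obj in gc.get_objects() if type(obj) is AttrDict)


if __name__ == "__main__":
    print(surviving_instances())
```

On an interpreter where the cycle is cleared properly the count should drop back to zero; on an affected interpreter it would be expected to grow with the number of iterations.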

msg28218 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-13 09:58
This is caused by the tp_clear not doing its job -- 
in this case, tp_clear is subtype_clear(), which does
not reset the __dict__ slot to NULL because it assumes
that the __dict__ slot's content itself will be cleared,
which is perfectly true but doesn't help if self.__dict__
is self.
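
The cycle described above can be made visible from Python with gc.get_referents(), which reports what an object's tp_traverse visits. This is a small sketch, not part of the patch; PlainDict and refers_to_itself are hypothetical names used only for comparison.

```python
import gc


class AttrDict(dict):
    def __init__(self, *args, **kw):
        dict.__init__(self, *args, **kw)
        self.__dict__ = self  # __dict__ slot now points back at the instance


class PlainDict(dict):
    """Hypothetical control: a dict subclass with an ordinary __dict__."""


def refers_to_itself(obj):
    """Return True if obj is one of its own direct GC referents."""
    return any(ref is obj for ref in gc.get_referents(obj))


if __name__ == "__main__":
    ad = AttrDict()
    print(ad.__dict__ is ad)              # the slot holds the instance itself
    print(refers_to_itself(ad))           # cycle of length one
    print(refers_to_itself(PlainDict()))  # no such cycle in the control
```

Clearing the dict contents empties the container, but the reference held in the __dict__ slot is not one of those contents, so the object keeps itself alive unless the slot is reset as well.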

Attached a patch to fix this.  It's kind of hard to
test for this bug because all instances of AttrDict
are really cleared and weakrefs to them are removed.

Also attached is an example showing a similar bug: a
cycle through the ob_type field, with an object U
whose class is itself. It is harder to clear this
link because we cannot just set ob_type to NULL in
subtype_clear.  Ideas welcome...
msg28219 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-14 07:19
Armin, why not just check any leaking test cases into
Lib/test/leakers?  I have no idea if your patch is correct
or not.  I wouldn't be surprised if there are a bunch more
issues similar to this.  What if you stick self in
self.__dict__ (I'm guessing this is ok, but there are a
bunch of variations) or start playing with weakrefs?
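
For the first variation mentioned here (storing self inside self.__dict__ rather than assigning self to it), a quick check could look like the sketch below. Node, make_cycle and survivors are hypothetical names; the instance count is taken from gc.get_objects() rather than from weakrefs, since weakrefs are removed even in the leaking case.

```python
import gc


class Node(dict):
    """Hypothetical dict subclass that keeps its normal instance __dict__."""


def make_cycle():
    node = Node()
    node.__dict__['me'] = node  # self stored *inside* __dict__, not *as* it


def survivors(count=100):
    """Create 'count' cyclic Nodes, collect, and count the ones left."""
    for _ in range(count):
        make_cycle()
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is Node)


if __name__ == "__main__":
    print(survivors())
```

Here the cycle runs through the contents of a separate dict, which tp_clear does empty, so this variation is expected to be collected normally.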
msg100453 - (view) Author: Ryan Hitchman (hitchmanr) Date: 2010-03-05 06:18
Attached test case demonstrates the leak.
msg114653 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-22 09:22
I've reproduced this problem with 2.7, 3.1 and 3.2.
msg114783 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2010-08-24 12:55
Added the two tests in Lib/test/leakers as r45389 (in 2006) and r84296 (now).
msg155084 - (view) Author: Hrvoje Nikšić (hniksic) * Date: 2012-03-07 12:52
Could this patch please be committed to Python? We have just run into this problem in production, where our own variant of AttrDict was shown to be leaking.

It is possible to work around the problem by implementing explicit __getattr__ and __setattr__, but that is both slower and trickier to get right.
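
As an illustration of why the workaround is trickier to get right, here is one possible sketch of such a class. The KeyError to AttributeError translation and the __delattr__ method are additions assumed for this example and do not appear in the original report; without the translation, hasattr() and getattr() with a default would propagate KeyError for missing names.

```python
class AttrDict(dict):
    """Sketch of the explicit-method workaround (no self.__dict__ = self)."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            # Attribute protocols expect AttributeError, not KeyError.
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self[key] = value

    def __delattr__(self, key):
        try:
            del self[key]
        except KeyError:
            raise AttributeError(key) from None


if __name__ == "__main__":
    ad = AttrDict(x=1)
    ad.y = ad.x
    print(ad)
    print(hasattr(ad, 'missing'))
```

Every attribute miss goes through a Python-level __getattr__ call in this version, which is the slowdown referred to above.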
msg155141 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-08 00:53
New changeset 3787e896dbe9 by Benjamin Peterson in branch '3.2':
allow cycles through the __dict__ slot to be cleared (closes #1469629)
msg155144 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-08 01:15
New changeset c7623da4e2af by Benjamin Peterson in branch '2.7':
allow cycles through the __dict__ slot to be cleared (closes #1469629)
