This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author pablogsal
Recipients corona10, pablogsal, serhiy.storchaka, shihai1991, tim.peters, vstinner
Date 2020-04-08.00:03:48
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1586304228.76.0.881261217097.issue40217@roundup.psfhosted.org>
In-reply-to
Content
I have made some investigation, and I think a form of this bug was there from a long time and does not really relate to heap types.
For instance consider this code on Python3.7 (so commit 364f0b0f19cc3f0d5e63f571ec9163cf41c62958 is not present). 

If you consider the more simple heap type:

>>> class A:
...    ...
... 
>>> A().__class__
<class '__main__.A'>

As the class instance refers to its type, it must visit it in the gc and we can see that indeed, that is the case:

>>> import gc
>>> gc.get_referents(A())
[<class '__main__.A'>]

But for instance, consider ast.AST, which in 3.7 is a static type created with PyType_Ready:


>>> import ast
>>> x = ast.AST()
>>> x.__class__
<class '_ast.AST'>

we can see that the object refers to its type as the other one but....oh no, the gc does not know about that link:

>>> import gc
>>> gc.get_referents(x)
[]

This is because its traverse function does not visit the type:

static int
ast_traverse(AST_object *self, visitproc visit, void *arg)
{
     Py_VISIT(self->dict);
     return 0;
}


This is not a problem if the type is *static* because __although is technically an error__ because the GC cannot resolve
cycles that go through the type, as the type is eternal those cycles will never be collected. 

The problem appears when the type is not eternal and can be destroyed. For instance, to understand this consider a function
in the _testcapi that creates a heap type using PyType_FromSpec:

>>> import gc, _testcapi
>>> import weakref
>>> import sys

# Create a new heap type with custom traverse function
>>> x = _testcapi.construct_new_gc_type()
>>> sys.getrefcount(x)
5
>>> new_obj = x()
>>> sys.getrefcount(x)
6
# We know that now there should be a link between new_obj and x because one points to the other
>>> x in gc.get_referents(new_obj)
False
# Oh no, there is no link
>>> class A:
...    def __del__(self):
...       print("Ouch")
... 
>>> x_w = weakref.ref(x)
# Create a reference cycle  a -> new_obj -> x -> a
>>> a = A()
>>> 
>>> x.a = a
>>> a.y = new_obj
>>> a.x = x
>>> gc.collect()
0
>>> del x,a
>>> gc.collect()
0
>>> sys.getrefcount(x_w())
6
>>> del new_obj
# At this point all variables are gone and the collector should clean everything
>>> gc.collect()
0
# Oh, no! The type is still alive
>>> sys.getrefcount(x_w())
6


If we do the same experiment after PR19314:

>>> import sys, gc, _testcapi
>>> import weakref
>>> x = _testcapi.construct_new_gc_type()
>>> sys.getrefcount(x)
5
>>> new_obj = x()
>>> sys.getrefcount(x)
6
# We know that now there should be a link between new_obj and x because one points to the other
>>> x in gc.get_referents(new_obj)
True
# Good!
>>> class A:
...   def __del__(self):
...      print("Ouch")
... 

>>> x_w = weakref.ref(x)
# Create a reference cycle  a -> new_obj -> x -> a
>>> a = A()
>>> x.a = a
>>> a.y = new_obj
>>> a.x = x
>>> gc.collect()
Traversed!
Traversed!
36
>>> del x,a
>>> gc.collect()
Traversed!
Traversed!
0
>>> sys.getrefcount(x_w())
6
>>> del new_obj
# At this point all variables are gone and the collector should clean everything
>>> gc.collect()
Ouch
8
# Nice, the collector cleaned the cycle


-------

So the conclusion:

* This bug affects all types but is only really relevant for types that are not eternal (because eternal types are already "leaked").
* The only real problem and leaks will appear for heap types that are not eternal with custom traverse functions.
* After commit 364f0b0f19cc3f0d5e63f571ec9163cf41c62958 now all heap types that are not eternal, for instance 
  the ones created with PyType_FromSpec, need to traverse the type because they own it. Falining to do this can create leaks in the GC.
* The *correct* thing to do is modify the tp_traverse of all non-eternal heap types, but sadly the only way to do this in a backwards-compatible
  without modifying all user functions is injecting automatically the behaviour as PR 19414 is doing.
History
Date User Action Args
2020-04-08 00:03:48pablogsalsetrecipients: + pablogsal, tim.peters, vstinner, serhiy.storchaka, corona10, shihai1991
2020-04-08 00:03:48pablogsalsetmessageid: <1586304228.76.0.881261217097.issue40217@roundup.psfhosted.org>
2020-04-08 00:03:48pablogsallinkissue40217 messages
2020-04-08 00:03:48pablogsalcreate