Author nascheme
Recipients Mark.Shannon, christian.heimes, inada.naoki, jdemeyer, lukasz.langa, miss-islington, nascheme, pablogsal, petr.viktorin, pitrou, tim.peters, vstinner
Date 2019-09-29.18:44:17
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1569782658.27.0.976304113848.issue38006@roundup.psfhosted.org>
In-reply-to
Content
Victor:
> I'm surprised that the function would be seen as "unreachable" if
> it's reference counter was equal to 135:

If all those references are contained within that garbage set, it is
possible.


Pablo:
> For the weakref to be handled correctly the ctypedescr needs to be
> identified correctly as part of the isolated cycle. If is not in
> the isolated cycle something may be missing tp_traverse or the GC
> flags. 

I think this analysis is correct.  I did some poking around with
using 'rr'.  It is very handy to run the program in reverse and use
conditional breakpoints based on pointer addresses.

The objects involved seem to be:

CF: cfield
CT: ctypedescr
F: function
W: weakref

They have the following structure:

CF -> CT -> W -> F

The first link the chain is not found by the GC because CF does not
implement tp_traverse.  All those objects are part of garbage kept
alive by the reference cycle and the GC calls delete_garbage() on
it.  CT is not part of 'unreachable' and handle_weakrefs() does not
find W.

What I previously said about tp_clear was incorrect.  We take great
pains to avoid running non-trivial code after calling tp_clear (no
finalizers and no callbacks).  So, the func_clear method should be
safe.  Even if tp_clear doesn't succeed in breaking the cycle, there
should be no way for user Python code to access those objects that
have had tp_clear called on them.  Is gc.get_objects() a hole in
this?  Seems like you could cause crashes with it.

This bug is surprising to me in another way.  If the analysis is
correct then fully implementing the GC protocols is no longer
optional (as it was when cyclic GC was first introduced).  If the GC
cannot find garbage weakrefs handle_weakrefs() doesn't work and we
end up with crashes like this.

Maybe Pablos fix to not call the weakref callback if we are in
delete_garbage() is a good enough band-aid.  The alternative would
seem to be ensuring that tp_traverse methods exist so that every
single reference can be followed by the GC.  I imagine that would be
a huge task and there are many extension types that don't have them.
History
Date User Action Args
2019-09-29 18:44:18naschemesetrecipients: + nascheme, tim.peters, pitrou, vstinner, christian.heimes, petr.viktorin, inada.naoki, lukasz.langa, Mark.Shannon, jdemeyer, pablogsal, miss-islington
2019-09-29 18:44:18naschemesetmessageid: <1569782658.27.0.976304113848.issue38006@roundup.psfhosted.org>
2019-09-29 18:44:18naschemelinkissue38006 messages
2019-09-29 18:44:17naschemecreate