classification
Title: gc.get_referrers() is inherently dangerous
Type:
Stage:
Components: Interpreter Core
Versions: Python 2.3
process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To:
Nosy List: arigo, brett.cannon, loewis, ods, tim.peters
Priority: normal
Keywords:

Created on 2003-08-23 17:17 by arigo, last changed 2003-10-28 12:11 by arigo. This issue is now closed.

Files
File name Uploaded Description
tupletest.py arigo, 2003-08-23 17:25
libgc.diff arigo, 2003-09-02 18:11 documentation patch
Messages (8)
msg17902 - Author: Armin Rigo (arigo) * (Python committer) Date: 2003-08-23 17:17
gc.get_referrers() can be used to crash any Python
interpreter because it allows the user to obtain
objects which are still under construction.

Here is an example where an iterator can use it to
obtain a tuple while it is still being populated by the
'tuple' built-in function. The following example
triggers a SystemError, but as the tuple 't' is at the
moment still full of null values it can easily generate
segfaults instead.

from gc import get_referrers

def iter():
    tag = object()
    yield tag   # 'tag' gets stored in the result tuple
    lst = [x for x in get_referrers(tag)
           if isinstance(x, tuple)]
    t = lst[0]  # this *is* the result tuple
    print t     # full of nulls !

tuple(iter())

Unless someone has more ideas than me as to how to
prevent this problem, I'd suggest that
gc.get_referrers() should be deemed 'officially
dangerous' in the docs.
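
A non-crashing variant of the example above, as a minimal sketch: it calls get_referrers() only after the tuple is fully built, to show that the function hands back the actual container object rather than a copy. The `Tag` class and the variable names are illustrative and not part of the original report; the single-argument print(...) form is used so the sketch runs on both Python 2 and Python 3.

```python
from gc import get_referrers

class Tag(object):
    """Illustrative marker object (hypothetical name, not from the report)."""

tag = Tag()
result = tuple([tag])   # fully constructed before anyone looks for it

# Keep only the tuple referrers, as in the original example.
found = [x for x in get_referrers(tag) if isinstance(x, tuple)]

# The list contains the very same tuple object that tuple() returned.
same_object = any(x is result for x in found)
print(same_object)
```

The difference from the original example is only one of timing: there, the same lookup runs while 'tuple' is still filling in the result.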
msg17903 - Author: Denis S. Otkidach (ods) * Date: 2003-08-28 07:35
I guess it's dangerous to make an object that is not properly
initialized reachable from other code, even if it's reachable
only through get_referrers(). There is no danger in
get_referrers() itself.
msg17904 - Author: Armin Rigo (arigo) * (Python committer) Date: 2003-08-28 13:23
But it would be very difficult to fix the code base to avoid
the problem. The 'tuple' constructor was only an example; it
is actually a quite common pattern everywhere in the C code
base of both the core and extension modules. Expecting an
object not to be seen before you first hand it out is
extremely common, and get_referrers() breaks that
assumption. Hence the claim that the problem really lies in
get_referrers().
msg17905 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-08-31 17:18
I see no harm in your example. Even though the tuple has
NULLs in it, it is in a stable state. The lesson learned is
that an object should become gc-tracked only if it is in a
stable state, as gc-tracking means to expose references to
the object. This is true independent of get_referrers, as
gc-tracking means that tp_traverse might be called for the
object, so it *has* to be in a stable state.

I fail to see how the example "crashes" the Python
interpreter. It causes a SystemError when the tuple is
resized, that's all. There are many ways to cause a
SystemError, including

raise SystemError

I recommend not declaring a function "dangerous" in the
docs. Instead, the actual problem should be explained, both
in the GC part of the C API, and in the gc module docs (for
both get_referrers and get_objects).
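
To make the point about gc-tracking concrete, here is a minimal sketch of how tracking and visibility go together. It assumes a CPython that provides gc.is_tracked(), which was added later than Python 2.3 (in 2.7 and 3.1), so it does not run on the interpreter this report is about. `Payload` and `holder` are illustrative names.

```python
import gc

class Payload(object):
    """Illustrative class (hypothetical name)."""

item = Payload()
holder = [item]                        # lists are tracked by the collector

int_tracked = gc.is_tracked(42)        # atomic object: not tracked
list_tracked = gc.is_tracked(holder)   # container: tracked

# A tracked container is exposed through the introspection functions.
visible = any(obj is holder for obj in gc.get_objects())

print(int_tracked)
print(list_tracked)
print(visible)
```

Once an object is tracked, both the collector and these functions can reach it, which is why it has to be in a stable state by then.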
msg17906 - Author: Tim Peters (tim.peters) * (Python committer) Date: 2003-08-31 23:36
Martin, it's easy to transform the example into one that 
crashes.  For example, adding "print t[3]" as the last line of 
the iter() function causes it to die with a segfault on my box.  
Virtually anything that fetches a NULL from the tuple will try 
to Py_INCREF it (that's all "t[3]" does), and that's an instant 
NULL-pointer dereferencing death.

That said, it would be more helpful <wink> if Armin submitted 
a doc patch.  The introspective features of the gc module are 
intended to be used by consenting adults facing hard 
debugging problems, and I don't care if they can be tricked 
into blowing up.  I agree the docs should point out the 
possibility, though.
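
For the debugging use case, one cautious pattern is to summarize referrers without reading their contents, so that a half-built container is never indexed. The sketch below is only an illustration: `describe_referrers`, `Suspect` and `holder` are hypothetical names, and the claim that type() and id() do not fetch any items is an assumption about CPython.

```python
import gc

def describe_referrers(obj):
    """Return (type name, id) pairs for each referrer of obj.

    Hypothetical helper: it never indexes or iterates a referrer.
    """
    summary = []
    for ref in gc.get_referrers(obj):
        # In CPython, type() and id() look only at the object itself,
        # not at the items stored inside it.
        summary.append((type(ref).__name__, id(ref)))
    return summary

class Suspect(object):
    """Illustrative object whose referrers are being hunted."""

suspect = Suspect()
holder = [suspect]
report = describe_referrers(suspect)
print(("list", id(holder)) in report)
```

Anything that goes further, such as printing a referrer or fetching one of its items, carries the risk described in this thread.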
msg17907 - Author: Armin Rigo (arigo) * (Python committer) Date: 2003-09-02 18:11
Here is the doc patch (attached).

A clean solution to this problem would involve delaying GC
registration, which I think would imply changes in the C
API. It is probably not worth it. I don't think it would be
a problem for the GC not to be able to browse through
objects still under construction because such objects are
never part of a dead cycle anyway, and probably not part of
a cycle at all until later mutated.
msg17908 - Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2003-09-06 22:51
The wording Armin proposes in his patch makes sense to me.
msg17909 - Author: Armin Rigo (arigo) * (Python committer) Date: 2003-10-28 12:11
Checked in:

Doc/lib/libgc.tex (rev: 1.15)
History
Date User Action Args
2003-08-23 17:17:54 arigo create