This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author asterite
Recipients asterite
Date 2017-05-03.16:44:01
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1493829842.08.0.612214340964.issue30260@psf.upfronthosting.co.za>
In-reply-to
Content
I faced a crash during iterpreter shutdown (Py_Finalize), in final garbage collection phase.

The reason, as far as I could understand, was the following:

In Python 3, when an unclosed socket is destroyed, it issues a ResourceWarning, which contains a string representation of socket object (obtained by __repr__).

If socket class object and unclosed socket object (instance) are in a reference cycle, GC may collect socket class object before it collects an instance when dealing with that cycle. Then, when instance is collected, sock_dealloc() will be called, which will call PyErr_WarnFormat with %R format specifier, which triggers a call to __repr__:

static void
sock_dealloc(PySocketSockObject *s)
{
    if (s->sock_fd != INVALID_SOCKET) {
        // ...
        if (PyErr_WarnFormat(PyExc_ResourceWarning, 1,
                             "unclosed %R", s))
        //...
}

At this moment, socket class object was already deallocated, so was __repr__, resides in it. Particularly, in my tests, when __repr__ attempted to access a global variable (getattr), it tried to take it's name from co_names tuple in it's code object. This tuple was deallocated, and co_names pointed to tuple free list. Trying to use a tuple from free list as a dict key resulted in a crash (more detailed stack trace will be provided in attachment):

# gdb python3.5 test.py
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
...
Starting program: /usr/local/bin/python3.5 test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
test

Program received signal SIGSEGV, Segmentation fault.
0x000000000048998c in PyObject_Hash (v=0x0) at Objects/object.c:761
761         PyTypeObject *tp = Py_TYPE(v);
(gdb) bt
#0  0x000000000048998c in PyObject_Hash (v=0x0) at Objects/object.c:761
#1  0x0000000000498d96 in tuplehash (v=0x7ffff7f39948) at Objects/tupleobject.c:351
#2  0x00000000004899b2 in PyObject_Hash (v=0x7ffff7f39948) at Objects/object.c:763
... tuplehash and PyObject_Hash here also ...
#34 0x00000000004899b2 in PyObject_Hash (v=0x7ffff6801708) at Objects/object.c:763
#35 0x0000000000498d96 in tuplehash (v=0x7ffff6812ca8) at Objects/tupleobject.c:351
#36 0x00000000004899b2 in PyObject_Hash (v=0x7ffff6812ca8) at Objects/object.c:763
#37 0x000000000047a056 in PyDict_GetItemWithError (op=0x7ffff7e83608, key=0x7ffff6812ca8) at Objects/dictobject.c:1143
#38 0x000000000047a1e8 in _PyDict_LoadGlobal (globals=0x7ffff7e83608, builtins=0x7ffff6ea8a88, key=0x7ffff6812ca8) at Objects/dictobject.c:1192
#39 0x0000000000509117 in PyEval_EvalFrameEx (f=0xa196d8, throwflag=0) at Python/ceval.c:2408
#40 0x0000000000511d0b in _PyEval_EvalCodeWithName (_co=0x7ffff7e85270, globals=0x7ffff7e83608, locals=0x0, args=0x7ffff7f07c68, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4071
#41 0x0000000000511e00 in PyEval_EvalCodeEx (_co=0x7ffff7e85270, globals=0x7ffff7e83608, locals=0x0, args=0x7ffff7f07c68, argcount=1, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:4092
#42 0x00000000005db41c in function_call (func=0x7ffff683ec80, arg=0x7ffff7f07c50, kw=0x0) at Objects/funcobject.c:627
#43 0x000000000043ae8f in PyObject_Call (func=0x7ffff683ec80, arg=0x7ffff7f07c50, kw=0x0) at Objects/abstract.c:2166
#44 0x00000000005cb13c in method_call (func=0x7ffff683ec80, arg=0x7ffff7f07c50, kw=0x0) at Objects/classobject.c:330
#45 0x000000000043ae8f in PyObject_Call (func=0x7ffff7eaca48, arg=0x7ffff7fa8048, kw=0x0) at Objects/abstract.c:2166
#46 0x00000000005138a9 in PyEval_CallObjectWithKeywords (func=0x7ffff7eaca48, arg=0x7ffff7fa8048, kw=0x0) at Python/ceval.c:4633
#47 0x00000000004abf3e in slot_tp_repr (self=0x7ffff684e8d0) at Objects/typeobject.c:5992
#48 0x0000000000488ef9 in PyObject_Repr (v=0x7ffff684e8d0) at Objects/object.c:482
#49 0x00000000004c03fb in unicode_fromformat_arg (writer=0x7fffffffdeb0, f=0x7ffff6ed9850 "R", vargs=0x7fffffffdef0) at Objects/unicodeobject.c:2645
#50 0x00000000004c0754 in PyUnicode_FromFormatV (format=0x7ffff6ed9846 "unclosed %R", vargs=0x7fffffffdf60) at Objects/unicodeobject.c:2710
#51 0x00000000004f90e7 in PyErr_WarnFormat (category=0x8a1a20 <_PyExc_ResourceWarning>, stack_level=1, format=0x7ffff6ed9846 "unclosed %R") at Python/_warnings.c:895
#52 0x00007ffff6ed438a in sock_dealloc (s=0x7ffff684e8d0) at /root/test/test35/cpython/Modules/socketmodule.c:4177
#53 0x000000000049d02c in subtype_dealloc (self=0x7ffff684e8d0) at Objects/typeobject.c:1209
#54 0x00000000004781c5 in free_keys_object (keys=0xa3fd90) at Objects/dictobject.c:351
#55 0x000000000047a908 in PyDict_Clear (op=0x7ffff7f55448) at Objects/dictobject.c:1388
#56 0x000000000047df02 in dict_tp_clear (op=0x7ffff7f55448) at Objects/dictobject.c:2595
#57 0x00000000005679de in delete_garbage (collectable=0x7fffffffe2f0, old=0x8ce440 <generations+64>) at Modules/gcmodule.c:866
#58 0x0000000000567e30 in collect (generation=2, n_collected=0x0, n_uncollectable=0x0, nofail=1) at Modules/gcmodule.c:1014
#59 0x0000000000568e84 in _PyGC_CollectNoFail () at Modules/gcmodule.c:1605
#60 0x0000000000532023 in PyImport_Cleanup () at Python/import.c:481
#61 0x0000000000542083 in Py_Finalize () at Python/pylifecycle.c:576
#62 0x0000000000420fdf in Py_Main (argc=2, argv=0x92d010) at Modules/main.c:798
#63 0x000000000041bc4f in main (argc=2, argv=0x7fffffffe748) at ./Programs/python.c:65
(gdb)

So this is something like a use-after-free.

I think the root cause is that finalizer that can call some python code is called at deallocation phase, interleaving with object's deletion. As well as with objects with __del__ method, GC can't know the order in which objects should be deleted if they are in a reference cycle.

This problem does not exist in 3.6 because there is a separate finalizing method sock_finalize there where a ResourceWarning is issued, and such finalizers are called for the whole reference cycle before any object in it is deallocated (at least this is how I understood, correct me if I'm wrong). So, this change with finalizers probably should be backported to previous versions of Python 3.

I also could not reproduce this in 3.3, but I didn't have time to research that closely yet.
History
Date User Action Args
2017-05-03 16:44:04asteritesetrecipients: + asterite
2017-05-03 16:44:02asteritesetmessageid: <1493829842.08.0.612214340964.issue30260@psf.upfronthosting.co.za>
2017-05-03 16:44:02asteritelinkissue30260 messages
2017-05-03 16:44:01asteritecreate