classification
Title: tp_dealloc trashcan shouldn't be called for subclasses
Type: behavior Stage: patch review
Components: Interpreter Core Versions: Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, jdemeyer, matrixise, pitrou, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2019-02-13 11:02 by jdemeyer, last changed 2019-05-15 14:34 by ned.deily.

Pull Requests
URL Status Linked Edit
PR 11841 merged jdemeyer, 2019-02-13 15:52
PR 12607 open jdemeyer, 2019-03-28 16:27
PR 12699 open jdemeyer, 2019-04-05 13:40
PR 12730 closed jdemeyer, 2019-04-08 14:56
Messages (19)
msg335405 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-02-13 11:02
When designing an extension type subclassing an existing type, it makes sense to call the tp_dealloc of the base class from the tp_dealloc of the subclass.

Now suppose that I'm subclassing "list" which uses the trashcan mechanism. Then it can happen that the tp_dealloc of list puts this object in the trashcan, even though the tp_dealloc of the subclass has already been called. Then the tp_dealloc of the subclass may be called multiple times, which is unsafe (tp_dealloc is expected to be called exactly once).

To solve this, the trashcan mechanism should be disabled when tp_dealloc is called from a subclass.

Interestingly, subtype_dealloc also solves this in a much more involved way (see the "Q. Why the bizarre (net-zero) manipulation of _PyRuntime.trash_delete_nesting around the trashcan macros?") in 
Objects/typeobject.c
msg335406 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2019-02-13 11:08
Jeroen, could you share your example? I am learning the C-API of Python and this example could be interesting.
msg335440 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-02-13 14:00
The problem is easily reproduced with Cython:

cdef class List(list):
    cdef int deallocated
    
    def __dealloc__(self):
        if self.deallocated:
            print("Deallocated twice!")
        self.deallocated = 1
        
L = None
for i in range(10**4):
    L = List((L,))
del L
msg335453 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-13 15:34
FWIW, subclassing builtin types is a relatively new thing.  There are still a number of lingering (older) implementation details throughout CPython that were written assuming no subclassing.  I'd guess that this is one of them.
msg335455 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-02-13 15:42
By "relatively new thing", you mean less than 20 years old? :-)
msg335463 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-02-13 16:28
On Wed, Feb 13, 2019 at 8:42 AM Antoine Pitrou <report@bugs.python.org> wrote:
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> By "relatively new thing", you mean less than 20 years old? :-)

Yeah, looks like it was in the 2.2 release (Dec 2001) for PEP 253.
Anyway, I know of several core devs who are still opposed to
subclassing builtin types. :)
msg335476 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-02-13 20:51
NOTE: also OrderedDict currently uses trashcan hacking to work around this problem:

    /* Call the base tp_dealloc().  Since it too uses the trashcan mechanism,
     * temporarily decrement trash_delete_nesting to prevent triggering it
     * and putting the partially deallocated object on the trashcan's
     * to-be-deleted-later list.
     */
    --tstate->trash_delete_nesting;
    assert(_tstate->trash_delete_nesting < PyTrash_UNWIND_LEVEL);
    PyDict_Type.tp_dealloc((PyObject *)self);
    ++tstate->trash_delete_nesting;

So this seems to be a known problem which deserves to be fixed properly.
msg335513 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-02-14 09:06
> Jeroen, could you share your example? I am learning the C-API of Python and this example could be interesting.

Either use the Cython code I posted above or run the testcase I added in PR 11841.
msg338976 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-27 17:25
Disabling the trashcan mechanism returns the problem for solving which the trashcan mechanism was introduced.

>>> from _testcapi import MyList
>>> L = None
>>> for i in range(1000000):
...     L = MyList((L,))
... 
>>> del L
Segmentation fault (core dumped)

I think we need better way to resolve this issue.

For example, setting the object type to the parent type should solve this issue.
msg338981 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-03-27 17:48
Yes of course. When not using the trashcan, stuff crashes. I don't get your point...
msg338982 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-03-27 17:56
To clarify: the purpose of MyList is specifically to check that no double deallocations occur. For this test to make sense, MyList should not use the trashcan itself.
msg339017 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-28 06:55
In your example you can add

    PyTypeObject *tp = Py_TYPE(self);
    Py_TYPE(self) = &PyList_Type;
    if (PyType_GetFlags(tp) & Py_TPFLAGS_HEAPTYPE) {
        Py_DECREF(tp);
    }

before calling PyList_Type.tp_dealloc().

It is possible to add such code directly in PyList_Type.tp_dealloc and other deallocators that use the trashcan mechanism.
msg339021 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-03-28 08:42
Changing types like that looks like an ugly hack and a recipe for breakage. For example, in list_dealloc(), the following needs the type to be correct:

    if (numfree < PyList_MAXFREELIST && PyList_CheckExact(op))
        free_list[numfree++] = op;
    else
        Py_TYPE(op)->tp_free((PyObject *)op);

Could you please clarify your opinion: do you think that there's something wrong with PR 11841? And if yes: what's wrong with it? Or are you just giving optional suggestions?
msg339022 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-03-28 08:55
Yes, I think that there's something wrong with PR 11841. It disables the trashcan mechanism and does not provide other mechanism for solving that problem.
msg339023 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-03-28 09:10
> It disables the trashcan mechanism

Yes, it disables the trashcan in some cases. But only when using the trashcan mechanism would probably crash CPython anyway due to a double deallocation. So at the very least, PR 11841 improves things from "crashing whenever the trashcan is used" to "crashing on stack overflows".

Do you have a real example where PR 11841 actually makes things *worse*?

> and does not provide other mechanism for solving that problem.

The recommended thing to do is that the subclass also implements the trashcan. See OrderedDict for an example: both the base class "dict" and the subclass "OrderedDict" use the trashcan.
msg339119 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-03-29 15:02
I created an additional PR 12607 with some more changes to the, in particular to make the old backwards-compatibility trashcan macros safer. This should be seen as part of the same bugfix but I decided to make a new PR because PR 11841 had several positive reviews already.
msg339519 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-04-05 20:35
I realized that there is a nasty interaction between the trashcan and __del__: if you're very close to the trashcan limit and you're calling __del__, then objects that should have been deallocated in __del__ (in particular, an object involving self) might instead end up in the trashcan. So self might be resurrected when it shouldn't be.

This is causing test failures in the 2.7 backport due to a different implementation of __del__.
msg339609 - (view) Author: Jeroen Demeyer (jdemeyer) * Date: 2019-04-08 10:36
In Python 3, the resurrection issue probably appears too. But it's not so much a problem since __del__ (mapped to tp_finalize) is only called once anyway. So there are no bad consequences if the object is resurrected incorrectly.
msg342080 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-05-10 17:21
New changeset 351c67416ba4451eb3928fa0b2e933c2f25df1a3 by Antoine Pitrou (Jeroen Demeyer) in branch 'master':
bpo-35983: skip trashcan for subclasses (GH-11841)
https://github.com/python/cpython/commit/351c67416ba4451eb3928fa0b2e933c2f25df1a3
History
Date User Action Args
2019-05-15 14:34:02ned.deilysetversions: - Python 2.7, Python 3.7
2019-05-10 17:21:33pitrousetmessages: + msg342080
2019-04-08 14:56:18jdemeyersetpull_requests: + pull_request12653
2019-04-08 10:36:43jdemeyersetmessages: + msg339609
2019-04-05 20:35:14jdemeyersetmessages: + msg339519
2019-04-05 13:40:42jdemeyersetpull_requests: + pull_request12624
2019-03-29 15:02:37jdemeyersetmessages: + msg339119
2019-03-28 16:27:12jdemeyersetpull_requests: + pull_request12546
2019-03-28 09:10:13jdemeyersetmessages: + msg339023
2019-03-28 08:55:47serhiy.storchakasetmessages: + msg339022
2019-03-28 08:42:13jdemeyersetmessages: + msg339021
2019-03-28 06:55:54serhiy.storchakasetmessages: + msg339017
2019-03-27 17:56:18jdemeyersetmessages: + msg338982
2019-03-27 17:48:10jdemeyersetmessages: + msg338981
2019-03-27 17:25:35serhiy.storchakasetmessages: + msg338976
2019-03-27 14:40:46serhiy.storchakasetnosy: + serhiy.storchaka
2019-02-14 09:06:42jdemeyersetmessages: + msg335513
2019-02-13 20:51:51jdemeyersetmessages: + msg335476
2019-02-13 16:28:05eric.snowsetmessages: + msg335463
2019-02-13 15:52:37jdemeyersetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request11871
2019-02-13 15:42:39pitrousetmessages: + msg335455
2019-02-13 15:34:50eric.snowsetnosy: + eric.snow
messages: + msg335453

type: behavior
stage: test needed
2019-02-13 14:00:11jdemeyersetmessages: + msg335440
2019-02-13 11:08:00matrixisesetnosy: + matrixise
messages: + msg335406
2019-02-13 11:02:31jdemeyercreate