Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider integration of PyObject_GC_UnTrack() with the trashcan C API #89044

Open
nascheme opened this issue Aug 10, 2021 · 15 comments
Open

Consider integration of PyObject_GC_UnTrack() with the trashcan C API #89044

nascheme opened this issue Aug 10, 2021 · 15 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API type-feature A feature request or enhancement

Comments

@nascheme
Copy link
Member

BPO 44881
Nosy @tim-one, @nascheme, @vstinner, @pablogsal
PRs
  • bpo-44881: Integrate GC untrack into trashcan begin. #27718
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-08-10.21:07:59.473>
    labels = ['type-feature']
    title = 'Consider integration of PyObject_GC_UnTrack() with the trashcan C API'
    updated_at = <Date 2021-08-16.21:40:17.744>
    user = 'https://github.com/nascheme'

    bugs.python.org fields:

    activity = <Date 2021-08-16.21:40:17.744>
    actor = 'nascheme'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2021-08-10.21:07:59.473>
    creator = 'nascheme'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44881
    keywords = ['patch']
    message_count = 15.0
    messages = ['399356', '399358', '399359', '399360', '399367', '399377', '399378', '399379', '399426', '399434', '399437', '399495', '399519', '399548', '399686']
    nosy_count = 4.0
    nosy_names = ['tim.peters', 'nascheme', 'vstinner', 'pablogsal']
    pr_nums = ['27718']
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue44881'
    versions = []

    @nascheme
    Copy link
    Member Author

    The fix for bpo-33930 includes a somewhat mysterious comment:

    // The Py_TRASHCAN mechanism requires that we be able to
    // call PyObject_GC_UnTrack twice on an object.
    

    I wonder if we can just integrate the untrack into the body of the trashcan code. Then, the explicit call to untrack in the dealloc function body can be removed. That removes the risk of incorrectly using the macro version.

    I suspect the reason this was not done originally is because the original trashcan mechanism did not use the GC header pointers to store objects. Now, any object that uses the trashcan *must* be a GC object.

    @nascheme nascheme added type-feature A feature request or enhancement labels Aug 10, 2021
    @pablogsal
    Copy link
    Member

    We probably will need a new set of macros because these macros are public IIRC correctly. Although as PyObject_GC_UnTrack has a check, we probably can just absorb it and let backwards compat work by being idempotent if called twice

    @nascheme
    Copy link
    Member Author

    Extensions that call PyObject_GC_UnTrack before calling Py_TRASHCAN_BEGIN will still work, they will just take a very minor performance hit. I don't think it is worth the trouble to introduce new macros for that reason. Extensions that really care about performance can wrap the call in a Python version ifdef.

    There is an issue if someone writes and tests their extension with the new API, i.e. without having the explicit PyObject_GC_UnTrack() call in their dealloc method. If they compile with an older Python, they likely get a crash. If they compile with asserts enable, they would get an assert fail in _PyTrash_thread_deposit_object, i.e.:

    _PyObject_ASSERT(op, !_PyObject_GC_IS_TRACKED(op));
    

    I guess that's an argument for new macros.

    @nascheme
    Copy link
    Member Author

    For examples of bugs caused by forgetting the untrack calls, see bpo-31095.

    @nascheme
    Copy link
    Member Author

    Since C99 is now allowed, perhaps we should suggest that declarations go after Py_TRASHCAN_BEGIN, e.g.

    	mytype_dealloc(mytype *p)
    	{
    	    Py_TRASHCAN_BEGIN(p, mytype_dealloc)
    	    ... declarations go here ...
    	    ... The body of the deallocator goes here, including all calls ...
    	    ... to Py_DECREF on contained objects.                         ...
    	    Py_TRASHCAN_END                // there should be no code after this
    	}

    The only dealloc method that doesn't fit this pattern is subtype_dealloc() and that's because the object might not be a GC object.

    Given the pattern, it occurs to me that that _Py_Dealloc() could do the trashcan begin/end work. However, the authors of the dealloc methods would have to realize the special rules of the trashcan (e.g. no returns from the dealloc method body). Also, there would be some overhead added to _Py_Dealloc() to check if the trashcan is supported. E.g. checking a type flag.

    Another idea would be to add a new slot for the method, e.g. tp_dealloc_trash. Then, _Py_Dealloc() could check if it exists and if so do the trashcan begin/end dance around it. That would still add some overhead to _Py_Dealloc() so I think the current macros are the best approach.

    @vstinner
    Copy link
    Member

    Please don't add new macros. The TRASHCAN C API is already complex enough.

    @vstinner
    Copy link
    Member

    Moreover, before making any change, I would suggest by starting with documenting the existing trashcan C API. Right now, there is no documentation in Doc/c-api/. Documentation can be copied from Include/cpython/object.h.

    Last year, I modified the trashcan API to hide implementation details. There were also issues with the stable ABI... thing which is unclear to me, since the trashcan C API is excluded from the limited C API. Well, I excluded it. Previously, it was part of it, but it did not work with the limited C API.

    commit 0fa4f43
    Author: Victor Stinner <vstinner@python.org>
    Date: Wed Feb 5 12:23:27 2020 +0100

    bpo-39542: Exclude trashcan from the limited C API (GH-18362)
    
    Exclude trashcan mechanism from the limited C API: it requires access to
    PyTypeObject and PyThreadState structure fields, whereas these structures
    are opaque in the limited C API.
    
    The trashcan mechanism never worked with the limited C API. Move it
    from object.h to cpython/object.h.
    

    commit 38965ec
    Author: Victor Stinner <vstinner@python.org>
    Date: Fri Mar 13 16:51:52 2020 +0100

    bpo-39947: Hide implementation detail of trashcan macros (GH-18971)
    
    Py_TRASHCAN_BEGIN_CONDITION and Py_TRASHCAN_END macro no longer
    access PyThreadState attributes, but call new private
    _PyTrash_begin() and _PyTrash_end() functions which hide
    implementation details.
    

    commit db532a0
    Author: Victor Stinner <vstinner@python.org>
    Date: Wed Jun 23 15:51:47 2021 +0200

    bpo-39947: Remove old private trashcan C API functions (GH-26869)
    
    Remove 4 C API private trashcan functions which were only kept for
    the backward compatibility of the stable ABI with Python 3.8 and
    older, since the trashcan API was not usable with the limited C API
    on Python 3.8 and older. The trashcan API was excluded from the
    limited C API in Python 3.9.
    
    Removed functions:
    
    * _PyTrash_deposit_object()
    * _PyTrash_destroy_chain()
    * _PyTrash_thread_deposit_object()
    * _PyTrash_thread_destroy_chain()
    
    The trashcan C API was never usable with the limited C API, since old
    trashcan macros accessed directly PyThreadState members like
    "_tstate->trash_delete_nesting", whereas the PyThreadState structure
    is opaque in the limited C API.
    
    Exclude also the PyTrash_UNWIND_LEVEL constant from the C API.
    
    The trashcan C API was modified in Python 3.9 by commit
    38965ec5411da60d312b59be281f3510d58e0cf1 and in Python 3.10 by commit
    ed1a5a5baca8f61e9a99c5be3adc16b1801514fe to hide implementation
    details.
    

    @vstinner vstinner changed the title Consider integration of GC_UNTRACK with TRASHCAN Consider integration of PyObject_GC_UnTrack() with the trashcan C API Aug 11, 2021
    @vstinner vstinner changed the title Consider integration of GC_UNTRACK with TRASHCAN Consider integration of PyObject_GC_UnTrack() with the trashcan C API Aug 11, 2021
    @vstinner
    Copy link
    Member

    Now, any object that uses the trashcan *must* be a GC object.

    It would be nice to add an assertion in _PyTrash_begin().

    @nascheme
    Copy link
    Member Author

    I was thinking about this more today and I think the better fix is to actually build the trashcan mechanism into _Py_Dealloc(). Requiring that types opt-in to the trashcan mechanism by using the trashcan macros is not ideal.

    First, the trashcan macros are a bit tricky to use correctly. Second, every "container" type is potentially a part of a long ref chain and could blow up the stack on deallocation (i.e. triggered from DECREF). So, for correctness/robustness, every type that supports cyclic GC should get trashcan-style deallocation.

    We would have to find a way to do this without incurring a (significant) performance hit. Also, it would have to be done without breaking C extensions. Ideally they would get trashcan-style deallocation without any source code changes. I'm not sure if that can be done but I'm hopeful it's possible.

    @nascheme
    Copy link
    Member Author

    I wrote a proof-of-concept as bpo-44897. PR 27718 (this issue) might a slightly better performance but I like the other one better because it doesn't expose so much implementation detail to extension types. Either of them require careful review before merging.

    @vstinner
    Copy link
    Member

    The problem of PyObject_GC_UnTrack() is just the most visible effect of the trashcan mecanism: tp_dealloc can be called twice, and this is *not* expected by the tp_dealloc API.

    Putting trashcan mecanism outside tp_dealloc can allow to make sure that tp_dealloc is called exactly once. _Py_Dealloc() sounds like a good candidate, but I didn't check if it's the only way to call tp_dealloc. Are there other places where tp_dealloc is called *directly*?

    Using military grade regex, I found the following functions calling tp_dealloc:

    grep 'dealloc[) ]([a-zA-Z]\+)' */.c

    • _Py_Dealloc() obviously
    • _PyTrash_thread_destroy_chain()
    • subtype_dealloc()
    • Modules/_testcapimodule.c: check_pyobject_freed_is_freed() <= unit test, it can be ignored

    So if we move the trashcan usage inside functions, 3 functions must be modified:

    • _Py_Dealloc()
    • _PyTrash_thread_destroy_chain()
    • subtype_dealloc()

    @nascheme
    Copy link
    Member Author

    The problem of PyObject_GC_UnTrack() is just the most visible effect of the
    trashcan mecanism: tp_dealloc can be called twice, and this is *not* expected
    by the tp_dealloc API.

    The fact that Py_TRASHCAN_BEGIN and Py_TRASHCAN_END can cause tp_dealloc to be
    called more than once is not a huge issue, IMHO. The statements executed more
    than once are the ones before or after the trashcan macros. If you use them
    properly, the multiple calls doesn't cause issues. The big trap is
    to use _PyObject_GC_UNTRACK before Py_TRASHCAN_BEGIN. It is not safe to call
    that macro multiple times. Anywhere you see Py_TRASHCAN_BEGIN and
    _PyObject_GC_UNTRACK before it, that's a bug.

    The point of this PR is to make the macros harder to use incorrectly. If there
    is no need to call untrack, Py_TRASHCAN_BEGIN and Py_TRASHCAN_END can normally
    be the first and last lines of the tp_dealloc function (declarations can be
    inside). Then, the fact that tp_dealloc is actually called more than once by the
    trashcan doesn't cause an issue.

    tp_dealloc can get called more than once for another reason. If the object is
    resurrected by tp_finalize or tp_del, then tp_dealloc can be called again.
    It's pretty mind bending to try to understand all of implications of that. Most
    of the ugliness is inside PyObject_CallFinalizerFromDealloc(). The type with
    the finalizer has to be careful not to put the object in an invalid state
    before it gets resurrected.

    Putting trashcan mecanism outside tp_dealloc can allow to make sure that
    tp_dealloc is called exactly once. _Py_Dealloc() sounds like a good
    candidate, but I didn't check if it's the only way to call tp_dealloc. Are
    there other places where tp_dealloc is called *directly*?

    tp_dealloc is called from a few more places, as you found. As for providing
    C-stack exhaustion protection via the trashcan, I think we are only interested
    in places were tp_dealloc is called as a result of a DECREF. And I think that
    only goes through _Py_Dealloc().

    I suppose it would be possible to blow up the C stack via a call chain like:

    subtype_dealloc -> basedealloc -> subtype_dealloc -> basedealloc -> ...
    

    I think you would have to make a long chain of subclasses. Seems unlikely in
    real code so I'm not worried about trying to fix that.

    Using military grade regex, I found the following functions calling tp_dealloc:

    grep 'dealloc[) ]([a-zA-Z]\+)' */.c

    • _Py_Dealloc() obviously
    • _PyTrash_thread_destroy_chain()
    • subtype_dealloc()
    • Modules/_testcapimodule.c: check_pyobject_freed_is_freed() <= unit test, it can be ignored

    So if we move the trashcan usage inside functions, 3 functions must be modified:

    • _Py_Dealloc()
    • _PyTrash_thread_destroy_chain()
    • subtype_dealloc()

    Take a look at bpo-44897. It implements the _Py_Dealloc approach. You can see
    what needs to be modified. I removed the Py_TRASHCAN_BEGIN/Py_TRASHCAN_END
    macros as well, just because they are not needed anymore. _PyObject_GC_UNTRACK
    calls inside tp_dealloc methods need to be removed because _Py_Dealloc already
    does the untrack. For external extensions, they must be using
    PyObject_GC_UnTrack() and so that's safe (object is already untracked and so it
    does nothing).

    I think the big change is calling tp_dealloc methods with the object already
    untracked. If an extension did something like:

    mytype_dealloc(PyObject *self)
    {
    	...
    	assert(PyObject_GC_IsTracked(self));
    	...
    }

    That would break after PR 27738, at least as it's currently written.

    The other issue I'm not quite sure about is that
    PyObject_CallFinalizerFromDealloc insists that GC objects are tracked when that
    function is called. I think it is okay to just call _PyObject_GC_TRACK on the
    ones that are not tracked, to ensure that. Maybe Tim Peters will jump in and
    correct the errors of my thinking.

    @pablogsal
    Copy link
    Member

    Thanks Neil for the thorough explanation!

    I think in any case we should benchmark this because this will affect *all* GC types if I understand correctly and the TS mechanism had shown slowdowns before

    @nascheme
    Copy link
    Member Author

    I think in any case we should benchmark this because this will affect *all*
    GC types if I understand correctly and the TS mechanism had shown slowdowns
    before

    We definitely need to benchmark. I would guess that adding trashcan protection
    to all GC types would not be a performance issue. We already had the macros
    for some performance critical ones (frame, tuple, list, dict).

    The performance hit will likely come as a result of adding a branch that
    uses _PyType_IS_GC() to the DECREF code path. It means any time an object hits
    zero refcount, we call _PyType_IS_GC() on it. Previously, we would just call
    tp_dealloc.

    Because of PEP-442, _PyType_IS_GC() checks not only a type flag but might also
    call tp_is_gc. So, benchmarks will need to be done. We might get a small win
    because the trashcan logic can be better optimized now that it's in a single
    complied unit.

    Small correction for my explaination above: if tp_dealloc gets called mutiple
    times because of the trashcan, it is the code before the BEGIN macro that gets
    called mutiple times.

    @nascheme
    Copy link
    Member Author

    Another small correction, _PyType_IS_GC() checks only the type flag (Py_TPFLAGS_HAVE_GC). _PyObject_IS_GC() checks both the type flag and calls tp_is_gc(), if it exists. tp_is_gc() is wart, IMHO, and it would be nice if we can kill it off so only the type flag is needed. That's a topic for a different issue though.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 28, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-C-API type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants