Issue44881
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2021-08-10 21:07 by nascheme, last changed 2022-04-11 14:59 by admin.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 27718 | open | nascheme, 2021-08-10 21:49 |
Messages (15) | |||
---|---|---|---|
msg399356 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-10 21:07 | |
The fix for bpo-33930 includes a somewhat mysterious comment: // The Py_TRASHCAN mechanism requires that we be able to // call PyObject_GC_UnTrack twice on an object. I wonder if we can just integrate the untrack into the body of the trashcan code. Then, the explicit call to untrack in the dealloc function body can be removed. That removes the risk of incorrectly using the macro version. I suspect the reason this was not done originally is because the original trashcan mechanism did not use the GC header pointers to store objects. Now, any object that uses the trashcan *must* be a GC object. |
|||
msg399358 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-08-10 21:28 | |
We probably will need a new set of macros because these macros are public IIRC correctly. Although as PyObject_GC_UnTrack has a check, we probably can just absorb it and let backwards compat work by being idempotent if called twice |
|||
msg399359 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-10 22:02 | |
Extensions that call PyObject_GC_UnTrack before calling Py_TRASHCAN_BEGIN will still work, they will just take a very minor performance hit. I don't think it is worth the trouble to introduce new macros for that reason. Extensions that really care about performance can wrap the call in a Python version ifdef. There is an issue if someone writes and tests their extension with the new API, i.e. without having the explicit PyObject_GC_UnTrack() call in their dealloc method. If they compile with an older Python, they likely get a crash. If they compile with asserts enable, they would get an assert fail in _PyTrash_thread_deposit_object, i.e.: _PyObject_ASSERT(op, !_PyObject_GC_IS_TRACKED(op)); I guess that's an argument for new macros. |
|||
msg399360 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-10 22:06 | |
For examples of bugs caused by forgetting the untrack calls, see bpo-31095. |
|||
msg399367 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-10 23:38 | |
Since C99 is now allowed, perhaps we should suggest that declarations go after Py_TRASHCAN_BEGIN, e.g. mytype_dealloc(mytype *p) { Py_TRASHCAN_BEGIN(p, mytype_dealloc) ... declarations go here ... ... The body of the deallocator goes here, including all calls ... ... to Py_DECREF on contained objects. ... Py_TRASHCAN_END // there should be no code after this } The only dealloc method that doesn't fit this pattern is subtype_dealloc() and that's because the object might not be a GC object. Given the pattern, it occurs to me that that _Py_Dealloc() could do the trashcan begin/end work. However, the authors of the dealloc methods would have to realize the special rules of the trashcan (e.g. no returns from the dealloc method body). Also, there would be some overhead added to _Py_Dealloc() to check if the trashcan is supported. E.g. checking a type flag. Another idea would be to add a new slot for the method, e.g. tp_dealloc_trash. Then, _Py_Dealloc() could check if it exists and if so do the trashcan begin/end dance around it. That would still add some overhead to _Py_Dealloc() so I think the current macros are the best approach. |
|||
msg399377 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-08-11 09:36 | |
Please don't add new macros. The TRASHCAN C API is already complex enough. |
|||
msg399378 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-08-11 09:42 | |
Moreover, before making any change, I would suggest by starting with documenting the existing trashcan C API. Right now, there is no documentation in Doc/c-api/. Documentation can be copied from Include/cpython/object.h. Last year, I modified the trashcan API to hide implementation details. There were also issues with the stable ABI... thing which is unclear to me, since the trashcan C API is excluded from the limited C API. Well, I excluded it. Previously, it was part of it, but it did not work with the limited C API. commit 0fa4f43db086ac3459811cca4ec5201ffbee694a Author: Victor Stinner <vstinner@python.org> Date: Wed Feb 5 12:23:27 2020 +0100 bpo-39542: Exclude trashcan from the limited C API (GH-18362) Exclude trashcan mechanism from the limited C API: it requires access to PyTypeObject and PyThreadState structure fields, whereas these structures are opaque in the limited C API. The trashcan mechanism never worked with the limited C API. Move it from object.h to cpython/object.h. commit 38965ec5411da60d312b59be281f3510d58e0cf1 Author: Victor Stinner <vstinner@python.org> Date: Fri Mar 13 16:51:52 2020 +0100 bpo-39947: Hide implementation detail of trashcan macros (GH-18971) Py_TRASHCAN_BEGIN_CONDITION and Py_TRASHCAN_END macro no longer access PyThreadState attributes, but call new private _PyTrash_begin() and _PyTrash_end() functions which hide implementation details. commit db532a09990c837ec1348e6e6bd2719f5d4a8216 Author: Victor Stinner <vstinner@python.org> Date: Wed Jun 23 15:51:47 2021 +0200 bpo-39947: Remove old private trashcan C API functions (GH-26869) Remove 4 C API private trashcan functions which were only kept for the backward compatibility of the stable ABI with Python 3.8 and older, since the trashcan API was not usable with the limited C API on Python 3.8 and older. The trashcan API was excluded from the limited C API in Python 3.9. Removed functions: * _PyTrash_deposit_object() * _PyTrash_destroy_chain() * _PyTrash_thread_deposit_object() * _PyTrash_thread_destroy_chain() The trashcan C API was never usable with the limited C API, since old trashcan macros accessed directly PyThreadState members like "_tstate->trash_delete_nesting", whereas the PyThreadState structure is opaque in the limited C API. Exclude also the PyTrash_UNWIND_LEVEL constant from the C API. The trashcan C API was modified in Python 3.9 by commit 38965ec5411da60d312b59be281f3510d58e0cf1 and in Python 3.10 by commit ed1a5a5baca8f61e9a99c5be3adc16b1801514fe to hide implementation details. |
|||
msg399379 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-08-11 09:46 | |
> Now, any object that uses the trashcan *must* be a GC object. It would be nice to add an assertion in _PyTrash_begin(). |
|||
msg399426 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-12 00:19 | |
I was thinking about this more today and I think the better fix is to actually build the trashcan mechanism into _Py_Dealloc(). Requiring that types opt-in to the trashcan mechanism by using the trashcan macros is not ideal. First, the trashcan macros are a bit tricky to use correctly. Second, every "container" type is potentially a part of a long ref chain and could blow up the stack on deallocation (i.e. triggered from DECREF). So, for correctness/robustness, every type that supports cyclic GC should get trashcan-style deallocation. We would have to find a way to do this without incurring a (significant) performance hit. Also, it would have to be done without breaking C extensions. Ideally they would get trashcan-style deallocation without any source code changes. I'm not sure if that can be done but I'm hopeful it's possible. |
|||
msg399434 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-12 06:38 | |
I wrote a proof-of-concept as bpo-44897. PR 27718 (this issue) might a slightly better performance but I like the other one better because it doesn't expose so much implementation detail to extension types. Either of them require careful review before merging. |
|||
msg399437 - (view) | Author: STINNER Victor (vstinner) * | Date: 2021-08-12 10:00 | |
The problem of PyObject_GC_UnTrack() is just the most visible effect of the trashcan mecanism: tp_dealloc can be called twice, and this is *not* expected by the tp_dealloc API. Putting trashcan mecanism outside tp_dealloc can allow to make sure that tp_dealloc is called exactly once. _Py_Dealloc() sounds like a good candidate, but I didn't check if it's the only way to call tp_dealloc. Are there other places where tp_dealloc is called *directly*? Using military grade regex, I found the following functions calling tp_dealloc: grep 'dealloc[) ]*([a-zA-Z]\+)' */*.c * _Py_Dealloc() obviously * _PyTrash_thread_destroy_chain() * subtype_dealloc() * Modules/_testcapimodule.c: check_pyobject_freed_is_freed() <= unit test, it can be ignored So if we move the trashcan usage inside functions, 3 functions must be modified: * _Py_Dealloc() * _PyTrash_thread_destroy_chain() * subtype_dealloc() |
|||
msg399495 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-13 06:44 | |
> The problem of PyObject_GC_UnTrack() is just the most visible effect of the > trashcan mecanism: tp_dealloc can be called twice, and this is *not* expected > by the tp_dealloc API. The fact that Py_TRASHCAN_BEGIN and Py_TRASHCAN_END can cause tp_dealloc to be called more than once is not a huge issue, IMHO. The statements executed more than once are the ones before or after the trashcan macros. If you use them properly, the multiple calls doesn't cause issues. The big trap is to use _PyObject_GC_UNTRACK before Py_TRASHCAN_BEGIN. It is not safe to call that macro multiple times. Anywhere you see Py_TRASHCAN_BEGIN and _PyObject_GC_UNTRACK before it, that's a bug. The point of this PR is to make the macros harder to use incorrectly. If there is no need to call untrack, Py_TRASHCAN_BEGIN and Py_TRASHCAN_END can normally be the first and last lines of the tp_dealloc function (declarations can be inside). Then, the fact that tp_dealloc is actually called more than once by the trashcan doesn't cause an issue. tp_dealloc can get called more than once for another reason. If the object is resurrected by tp_finalize or tp_del, then tp_dealloc can be called again. It's pretty mind bending to try to understand all of implications of that. Most of the ugliness is inside PyObject_CallFinalizerFromDealloc(). The type with the finalizer has to be careful not to put the object in an invalid state before it gets resurrected. > Putting trashcan mecanism outside tp_dealloc can allow to make sure that > tp_dealloc is called exactly once. _Py_Dealloc() sounds like a good > candidate, but I didn't check if it's the only way to call tp_dealloc. Are > there other places where tp_dealloc is called *directly*? tp_dealloc is called from a few more places, as you found. As for providing C-stack exhaustion protection via the trashcan, I think we are only interested in places were tp_dealloc is called as a result of a DECREF. And I think that only goes through _Py_Dealloc(). I suppose it would be possible to blow up the C stack via a call chain like: subtype_dealloc -> basedealloc -> subtype_dealloc -> basedealloc -> ... I think you would have to make a long chain of subclasses. Seems unlikely in real code so I'm not worried about trying to fix that. > Using military grade regex, I found the following functions calling tp_dealloc: > > grep 'dealloc[) ]*([a-zA-Z]\+)' */*.c > > * _Py_Dealloc() obviously > * _PyTrash_thread_destroy_chain() > * subtype_dealloc() > * Modules/_testcapimodule.c: check_pyobject_freed_is_freed() <= unit test, it can be ignored > > So if we move the trashcan usage inside functions, 3 functions must be modified: > > * _Py_Dealloc() > * _PyTrash_thread_destroy_chain() > * subtype_dealloc() Take a look at bpo-44897. It implements the _Py_Dealloc approach. You can see what needs to be modified. I removed the Py_TRASHCAN_BEGIN/Py_TRASHCAN_END macros as well, just because they are not needed anymore. _PyObject_GC_UNTRACK calls inside tp_dealloc methods need to be removed because _Py_Dealloc already does the untrack. For external extensions, they must be using PyObject_GC_UnTrack() and so that's safe (object is already untracked and so it does nothing). I think the big change is calling tp_dealloc methods with the object already untracked. If an extension did something like: mytype_dealloc(PyObject *self) { ... assert(PyObject_GC_IsTracked(self)); ... } That would break after PR 27738, at least as it's currently written. The other issue I'm not quite sure about is that PyObject_CallFinalizerFromDealloc insists that GC objects are tracked when that function is called. I think it is okay to just call _PyObject_GC_TRACK on the ones that are not tracked, to ensure that. Maybe Tim Peters will jump in and correct the errors of my thinking. |
|||
msg399519 - (view) | Author: Pablo Galindo Salgado (pablogsal) * | Date: 2021-08-13 10:41 | |
Thanks Neil for the thorough explanation! I think in any case we should benchmark this because this will affect *all* GC types if I understand correctly and the TS mechanism had shown slowdowns before |
|||
msg399548 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-13 17:31 | |
> I think in any case we should benchmark this because this will affect *all* > GC types if I understand correctly and the TS mechanism had shown slowdowns > before We definitely need to benchmark. I would guess that adding trashcan protection to all GC types would not be a performance issue. We already had the macros for some performance critical ones (frame, tuple, list, dict). The performance hit will likely come as a result of adding a branch that uses _PyType_IS_GC() to the DECREF code path. It means any time an object hits zero refcount, we call _PyType_IS_GC() on it. Previously, we would just call tp_dealloc. Because of PEP 442, _PyType_IS_GC() checks not only a type flag but might also call tp_is_gc. So, benchmarks will need to be done. We might get a small win because the trashcan logic can be better optimized now that it's in a single complied unit. Small correction for my explaination above: if tp_dealloc gets called mutiple times because of the trashcan, it is the code before the BEGIN macro that gets called mutiple times. |
|||
msg399686 - (view) | Author: Neil Schemenauer (nascheme) * | Date: 2021-08-16 21:40 | |
Another small correction, _PyType_IS_GC() checks only the type flag (Py_TPFLAGS_HAVE_GC). _PyObject_IS_GC() checks both the type flag and calls tp_is_gc(), if it exists. tp_is_gc() is wart, IMHO, and it would be nice if we can kill it off so only the type flag is needed. That's a topic for a different issue though. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:48 | admin | set | github: 89044 |
2021-08-16 21:40:17 | nascheme | set | messages: + msg399686 |
2021-08-13 17:31:19 | nascheme | set | messages: + msg399548 |
2021-08-13 10:41:08 | pablogsal | set | messages: + msg399519 |
2021-08-13 06:44:02 | nascheme | set | nosy:
+ tim.peters messages: + msg399495 |
2021-08-12 10:00:41 | vstinner | set | messages: + msg399437 |
2021-08-12 06:38:42 | nascheme | set | messages: + msg399434 |
2021-08-12 00:19:22 | nascheme | set | messages: + msg399426 |
2021-08-11 09:46:27 | vstinner | set | messages: + msg399379 |
2021-08-11 09:43:05 | vstinner | set | title: Consider integration of GC_UNTRACK with TRASHCAN -> Consider integration of PyObject_GC_UnTrack() with the trashcan C API |
2021-08-11 09:42:32 | vstinner | set | messages: + msg399378 |
2021-08-11 09:36:59 | vstinner | set | nosy:
+ vstinner messages: + msg399377 |
2021-08-10 23:38:25 | nascheme | set | messages: + msg399367 |
2021-08-10 22:06:12 | nascheme | set | messages: + msg399360 |
2021-08-10 22:02:12 | nascheme | set | messages:
+ msg399359 stage: patch review -> needs patch |
2021-08-10 21:49:40 | nascheme | set | keywords:
+ patch stage: needs patch -> patch review pull_requests: + pull_request26201 |
2021-08-10 21:28:41 | pablogsal | set | nosy:
+ pablogsal messages: + msg399358 |
2021-08-10 21:07:59 | nascheme | create |