classification
Title: Integrate trashcan mechanism into _Py_Dealloc
Type: enhancement Stage: patch review
Components: Interpreter Core Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: nascheme, pablogsal, tim.peters, vstinner
Priority: normal Keywords: patch

Created on 2021-08-12 06:12 by nascheme, last changed 2021-08-16 19:06 by nascheme.

Files
File name Uploaded Description Edit
pyperf-trashcan.txt nascheme, 2021-08-15 21:02
perf-annotate-trash.txt nascheme, 2021-08-15 21:02
pyperf-trashcan2.txt nascheme, 2021-08-16 19:03
Pull Requests
URL Status Linked Edit
PR 27738 open nascheme, 2021-08-12 06:35
Messages (3)
msg399433 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2021-08-12 06:12
This is a WIP/proof-of-concept of doing away with Py_TRASHCAN_BEGIN and Py_TRASHCAN_END and instead integrating the functionality into _Py_Dealloc.  There are a few advantages:

- all container objects have the risk of overflowing the C stack if a long reference chain of them is created and then deallocated.  So, to be safe, the tp_dealloc methods for those objects should be protected from overflowing the stack.

- the Py_TRASHCAN_BEGIN and Py_TRASHCAN_END macros are hard to understand and a bit hard to use correctly.  Making the mechanism internal avoids confusion.  The code can be slightly simplified as well.

This proof-of-concept seems to pass tests but it will need some careful review.  The exact rules related to calling GC Track/Untrack are subtle and this changes things a bit.  I.e. tp_dealloc is called with GC objects already untracked.  For 3rd party extensions, they are calling PyObject_GC_UnTrack() and so I believe they should still work.  

The fact that PyObject_CallFinalizerFromDealloc() wants GC objects to definitely be tracked is a bit of a mystery to me (there is an assert to check that).  I changed the code to track objects if they are not already tracked but I'm not sure that's correct.

There could be a performance hit, due to the _PyType_IS_GC() test that was added to the _Py_Dealloc() function.  For non-GC objects, that's going to be a new branch and I'm worried it might hurt a bit.  OTOH, maybe it's just in the noise.  Profiling will need to be done.
msg399626 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2021-08-15 21:02
As I suspected, the performance impact is significant (although pretty small).  Based on Linux perf, it looks like the extra test+branch in _Py_Dealloc adds about 1% overhead.  pyperformance shows something similar, see attached reports (pypref-trashcan.txt and perf-annotate-trash.txt).

An idea I've been trying is to add another type slot, e.g. tp_call_dealloc.  It could be set by PyType_Ready().  It would be called by the Py_DECREF macro on refcnt going to zero.  If the object is non-GC and Py_TRACE_REFS is off, can make tp_call_dealloc actually be the tp_dealloc pointer.  If the type has the GC flag, point tp_call_dealloc to a _Py_Dealloc version that checks tp_is_gc and does the trashcan stuff.

Unfortunately, all types don't have PyType_Ready() called on them.  So, we cannot rely on tp_call_dealloc being set correctly.
msg399669 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2021-08-16 19:03
Based on some testing, I think an extra type slot is not worth the extra complication.  I made some small improvements to _Py_Dealloc and now the performance seems a bit better.  Basically, I expanded _PyObject_IS_GC() to put the most common branches first.  See pypref-trashcan2.txt for benchmark.  Overall I think might be a bit slower (less than 1%?)  but we have gained trashcan protection for the dealloc of all GC objects.  

If we are okay with the performance, here are other questions to answer:

A) is it safe to untrack GC objects before calling tp_dealloc?

B) is it safe to have PyObject_CallFinalizerFromDealloc() track objects that are resurrected?


We might have to look at 3rd party extensions to decide on those.  I.e. maybe in theory it is not safe but in practice there is no extension code that would actually break.
History
Date User Action Args
2021-08-16 19:06:19naschemesetnosy: + tim.peters, vstinner, pablogsal
2021-08-16 19:03:48naschemesetfiles: + pyperf-trashcan2.txt

messages: + msg399669
2021-08-15 21:02:32naschemesetfiles: + perf-annotate-trash.txt
2021-08-15 21:02:23naschemesetfiles: + pyperf-trashcan.txt

messages: + msg399626
2021-08-12 06:35:45naschemesetkeywords: + patch
pull_requests: + pull_request26218
2021-08-12 06:12:33naschemecreate