
classification
Title: [subinterpreters] GC should happen when a subinterpreter is destroyed
Type: behavior Stage: resolved
Components: Subinterpreters Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: emptysquare, eric.snow, grahamd, ncoghlan, phsilva, pitrou, vstinner
Priority: normal Keywords:

Created on 2015-07-03 00:41 by eric.snow, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (14)
msg246110 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2015-07-03 00:41
Per A. Jesse Jiryu Davis:

===========================
# mod.py
class C(object):
    pass

class Pool(object):
    def __del__(self):
        print('del')
        list()

C.pool = Pool()
===========================

===========================

#include <Python.h>
#include <stdio.h>

int main()
{
    Py_Initialize();
    PyThreadState *tstate_enter = PyThreadState_Get();
    PyThreadState *tstate = Py_NewInterpreter();

    PyRun_SimpleString("import mod\n");
    if (PyErr_Occurred()) {
        PyErr_Print();
    }
    Py_EndInterpreter(tstate);
    PyThreadState_Swap(tstate_enter);
    printf("about to finalize\n");
    Py_Finalize();
    printf("done\n");

    return 0;
}
===========================

See:
http://emptysqua.re/blog/a-normal-accident-in-python-and-mod-wsgi/
https://github.com/GrahamDumpleton/mod_wsgi/issues/43
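
For readers wondering why the `Pool` finalizer does not simply run as soon as the last reference to `C` goes away: a class object sits in reference cycles of its own (for example through its `__mro__` tuple), so anything stored on the class is only released once the cyclic GC runs. Below is a minimal single-interpreter sketch of that effect, assuming CPython. `run_demo` and `events` are illustrative names and are not part of the original report.

```python
import gc


def run_demo():
    """Return what the finalizer logged after `del C` and after gc.collect()."""
    events = []

    class C(object):
        pass

    class Pool(object):
        def __del__(self):
            events.append('del')

    C.pool = Pool()

    was_enabled = gc.isenabled()
    gc.disable()  # keep an automatic collection from interfering with the demo
    try:
        del C                          # last external reference to the class
        after_del = list(events)       # finalizer has not run: C is in a cycle
        gc.collect()                   # the cyclic GC frees C, then C.pool
        after_collect = list(events)
    finally:
        if was_enabled:
            gc.enable()
    return after_del, after_collect


if __name__ == '__main__':
    print(run_demo())
```

In the C reproducer above, the analogous collection appears to happen only during `Py_Finalize()`, which is why 'del' shows up after the subinterpreter is already gone.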
msg246157 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-03 10:25
What is that supposed to demonstrate? GC is a global operation and is not tied to subinterpreters: GC may happen in any interpreter, not necessarily the one where the resource was allocated.
msg246158 - (view) Author: Graham Dumpleton (grahamd) Date: 2015-07-03 10:30
The problem is that GC happens on an object in the wrong interpreter in this case, which can result in user code executing against the wrong interpreter context.

If you are saying this can happen anytime in the life of a sub interpreter and not just in this case of when a sub interpreter is destroyed followed by destruction of the main interpreter, then that is an even bigger flaw.
msg246159 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-03 10:36
It's just a consequence of subinterpreters not being isolated contexts. They share a lot of stuff by construction (hence being called "subinterpreters"). And indeed some resource can become unreachable in a subinterpreter, and be collected from another, if the resource is part of a reference cycle.
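
A rough analogy using threads rather than subinterpreters (an assumption made purely for illustration, since threads are easy to drive from pure Python): an object in a reference cycle is finalized in whatever context happens to trigger the collection, not in the one where it became unreachable. `Node` and `run_demo` are hypothetical names.

```python
import gc
import threading


class Node(object):
    log = []  # names of the threads in which finalizers ran

    def __init__(self):
        self.me = self  # self-reference: a one-object reference cycle

    def __del__(self):
        Node.log.append(threading.current_thread().name)


def run_demo():
    """Create garbage in one thread, collect it from another."""
    was_enabled = gc.isenabled()
    gc.disable()  # only the explicit gc.collect() below should collect
    try:
        Node.log.clear()

        def creator():
            Node()  # unreachable right away, but kept alive by its cycle

        t1 = threading.Thread(target=creator, name='creator')
        t1.start()
        t1.join()
        before = list(Node.log)

        t2 = threading.Thread(target=gc.collect, name='collector')
        t2.start()
        t2.join()
        return before, list(Node.log)
    finally:
        if was_enabled:
            gc.enable()


if __name__ == '__main__':
    print(run_demo())
```

The finalizer is recorded under the collecting thread's name, which mirrors the subinterpreter case where the collecting interpreter is not the one that created the object.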
msg246160 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-03 10:41
I don't think this is a very important issue, by the way. Normal destructors will usually rely on resources in their global environment, i.e. the function's globals or builtins dict, which will point to the right namespace. Only if you are explicitly looking up something on the interpreter (or using e.g. thread-local storage... but relying on thread-local storage in a destructor is already broken anyway) will you see such discrepancies.
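
A small sketch of the point about globals, assuming CPython: names used inside `__del__` are resolved through the globals dict of the function that defines it, regardless of which code triggers the finalizer. The separate namespaces built with `exec` here merely stand in for per-interpreter module namespaces. `make_namespace`, `LABEL` and `seen` are hypothetical names.

```python
def make_namespace(label):
    """Build an isolated globals dict holding a class with a finalizer."""
    namespace = {'LABEL': label, 'seen': []}
    source = (
        "class Resource(object):\n"
        "    def __del__(self):\n"
        "        seen.append(LABEL)\n"  # resolved in this namespace's globals
    )
    exec(source, namespace)
    return namespace


def run_demo():
    ns_a = make_namespace('namespace A')
    ns_b = make_namespace('namespace B')

    obj = ns_a['Resource']()
    del obj  # on CPython the refcount hits zero and __del__ runs right here

    # The finalizer ran from this function, yet it wrote into namespace A.
    return ns_a['seen'], ns_b['seen']


if __name__ == '__main__':
    print(run_demo())
```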
msg246161 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-03 10:45
From a quick look at the PyInterpreterState, stuff that may be risky to rely on:
- mutable data from the sys module (mainly import-related data: sys.path, sys.meta_path, etc.)
- codecs registry metadata

Of course third-party modules (C extensions) may key additional data on the current interpreter, but I can't think of any stdlib module that does.
msg246162 - (view) Author: Graham Dumpleton (grahamd) Date: 2015-07-03 10:48
If this issue with GC can't be addressed and sub interpreters isolated better, then there is no point pursuing the idea that was raised at the language summit of giving each sub interpreter its own GIL and then providing mechanisms to allow code executing in one sub interpreter to delegate other code to run in the context of a different sub interpreter, thus allowing effective use of multi core systems.

That was the bigger goal and this was one of the issues which would need to be fixed.
msg246163 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-07-03 11:00
I don't have any opinion on whether this issue kills the "parallelism with sub-interpreters" idea (I'm not sure why it would). But regardless, solving this issue will be very non-trivial, or perhaps very involved.

Running the GC at the end of a subinterpreter certainly won't solve the general problem of objects becoming unreachable in an interpreter (i.e. the last external reference being lost in that interpreter) and collected by the cyclic GC from another.
msg246165 - (view) Author: Graham Dumpleton (grahamd) Date: 2015-07-03 11:12
Right now mod_wsgi is the main user of sub interpreters. I wasn't even aware of this issue until Jesse found it. Thus in 7+ years, it never presented a problem in practice, possibly because in mod_wsgi sub interpreters are only ever destroyed on process shutdown, so an issue at that point, or even a process crash, went unnoticed and was tolerable.

If, however, you are going to implement the "parallelism with sub-interpreters" idea, you make encountering problems much more likely: many more people will be using the feature, and a sub interpreter may be ephemeral, destroyed at any time rather than kept around for the life of the process, thus more readily pushing GC of objects into a different sub interpreter context, if that is what can occur now.

It therefore seems to me that this would open up a huge can of worms if left to work as it does now, with users seeing all sorts of unexpected behaviour if not very careful. Also, the fact that GC of objects can be done in a different interpreter context seems to suggest to me that the global GIL for the whole process couldn't be eliminated in the first place. So at this point I can't see how you could move to a separate GIL for each interpreter context, if GC for each interpreter can't be separated easily or at all.
msg246166 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-07-03 11:35
We already knew that reference count management was likely to be one of the thorniest problems with allowing true subinterpreter level concurrency; this issue just reminds us that the cyclic GC is going to be a challenge as well.
msg341983 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-05-09 15:27
FYI, issue #36854 is about moving GC runtime state from _PyRuntimeState to PyInterpreterState.  However, that doesn't trigger any collection when the interpreter is finalized.  So there is more to be done here.
msg357063 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-20 11:28
bpo-36854 has been fixed, so it's time to reconsider fixing this issue :-)
msg358064 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-09 10:14
This issue is partially fixed in the master branch. Extract of the finalize_interp_clear() function, called by Py_EndInterpreter():

    /* Clear interpreter state and all thread states */
    PyInterpreterState_Clear(tstate->interp);

    /* Trigger a GC collection on subinterpreters*/
    if (!is_main_interp) {
        _PyGC_CollectNoFail();
    }

gc.collect() is now called.

It's only "partially" fixed because I would prefer to trigger a GC collection before or during PyInterpreterState_Clear(). IMHO triggering it after PyInterpreterState_Clear() creates a risk of crashes in finalizers written in C which don't handle well being called very late during Python finalization. After PyInterpreterState_Clear(), Python is basically unusable. All modules are cleared.
msg359593 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-08 13:36
Py_EndInterpreter() now calls gc.collect() at least twice: at the end of _PyImport_Cleanup() and in finalize_interp_clear(). I now consider the issue as fixed and so I close it.

The issue that I described in my previous comment can be enhanced/fixed later.
History
Date                 User       Action  Args
2022-04-11 14:58:18  admin      set     github: 68742
2020-05-15 01:13:31  vstinner   set     components: + Subinterpreters, - Interpreter Core; title: GC should happen when a subinterpreter is destroyed -> [subinterpreters] GC should happen when a subinterpreter is destroyed
2020-01-08 13:36:19  vstinner   set     status: open -> closed; versions: + Python 3.9, - Python 3.5, Python 3.6; messages: + msg359593; resolution: fixed; stage: test needed -> resolved
2019-12-09 10:14:07  vstinner   set     messages: + msg358064
2019-11-20 11:28:00  vstinner   set     nosy: + vstinner; messages: + msg357063
2019-08-22 02:27:14  phsilva    set     nosy: + phsilva
2019-05-09 15:27:36  eric.snow  set     messages: + msg341983
2015-07-03 11:35:07  ncoghlan   set     messages: + msg246166
2015-07-03 11:12:25  grahamd    set     messages: + msg246165
2015-07-03 11:00:16  pitrou     set     messages: + msg246163
2015-07-03 10:48:11  grahamd    set     messages: + msg246162
2015-07-03 10:45:40  pitrou     set     messages: + msg246161
2015-07-03 10:41:18  pitrou     set     messages: + msg246160
2015-07-03 10:36:20  pitrou     set     messages: + msg246159
2015-07-03 10:30:53  grahamd    set     messages: + msg246158
2015-07-03 10:25:16  pitrou     set     nosy: + pitrou; messages: + msg246157
2015-07-03 00:41:17  eric.snow  create