classification
Title: Documented interaction of single-stage init and sub-interpreters inaccurate
Type:
Stage:
Components: Documentation
Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7, Python 3.6
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: docs@python, pkerling
Priority: normal
Keywords:

Created on 2020-11-09 17:51 by pkerling, last changed 2022-04-11 14:59 by admin.

Messages (1)
msg380602 - Author: pkerling, Date: 2020-11-09 17:51
The C API documentation says this about single-phase initialization and sub-interpreters:

"[T]he first time a particular extension is imported, it is initialized normally, and a (shallow) copy of its module’s dictionary is squirreled away. When the same extension is imported by another (sub-)interpreter, a new module is initialized and filled with the contents of this copy; the extension’s init function is not called."
- from https://docs.python.org/3.10/c-api/init.html#c.Py_NewInterpreter

While investigating crashes involving the _datetime module and sub-interpreters, I observed that this does not seem to be true.
I have tracked this functionality down to the m_base.m_copy dictionary of the PyModuleDef of an extension and the functions _PyImport_FixupExtensionObject and _PyImport_FindExtensionObject in Python/import.c. However, modules are only ever added to the `extensions` global when imported in the main interpreter, see https://github.com/python/cpython/blob/1f73c320e2921605c4963e202f6bdac1ef18f2ce/Python/import.c#L480

Furthermore, even if they were added and m_base.m_copy was set, it would be cleared again on sub-interpreter shutdown here: https://github.com/python/cpython/blob/1f73c320e2921605c4963e202f6bdac1ef18f2ce/Python/pystate.c#L796 - implying that the module will be loaded and initialized again next time due to this check: https://github.com/python/cpython/blob/1f73c320e2921605c4963e202f6bdac1ef18f2ce/Python/import.c#L556

These observations are supported by the fact that in my tests, if "import _datetime" is run in two different sub-interpreters one after the other, PyInit__datetime is indeed called twice.

Test code - set a breakpoint on PyInit__datetime to observe the behavior:

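#include <Python.h>
#include <assert.h>

int main(void)
{
    Py_Initialize();
    for (int i = 0; i < 100; ++i) {
        /* Create a fresh sub-interpreter; its thread state becomes current. */
        PyThreadState* ts = Py_NewInterpreter();
        assert(ts);
        /* Import the extension inside the sub-interpreter. */
        int result = PyRun_SimpleString("import _datetime");
        assert(result == 0);
        /* Destroy the sub-interpreter; ts must be the current thread state. */
        Py_EndInterpreter(ts);
    }
    return 0;
}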

In summary, it seems to me that the documented behavior is not accurate (any more?) - so either the docs or the implementation should change.
History
Date                 User      Action  Args
2022-04-11 14:59:37  admin     set     github: 86464
2020-11-09 17:51:42  pkerling  create