msg368136 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 12:51 |
To be able to run multiple (sub)interpreters in parallel, the unique global interpreter lock aka "GIL" should be replaced with multiple GILs: one "GIL" per interpreter. The scope of such a per-interpreter GIL would be a single interpreter.
The current CPython code base is not fully ready to have one GIL per interpreter. TODO:
* Move signals pending and gil_drop_request from _PyRuntimeState.ceval to PyInterpreterState.ceval: https://github.com/ericsnowcurrently/multi-core-python/issues/34
* Add a lock to pymalloc, or disable pymalloc when subinterpreters are used: https://github.com/ericsnowcurrently/multi-core-python/issues/30
* Make free lists per-interpreter: tuple, dict, frame, etc.
* Make Unicode interned strings per interpreter
* Make Unicode latin1 single character string singletons per interpreter
* None, True, False, ... singletons: make them per-interpreter (bpo-39511) or immortal (bpo-40255)
* etc.
Until we can ensure that no Python object is shared between two interpreters, we might make PyObject.ob_refcnt, PyGC_Head (_gc_next and _gc_prev) and _dictkeysobject.dk_refcnt atomic.
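As a rough illustration of why these counters would need to be atomic, here is a toy Python model, not CPython code. `ToyRefcount`, `hammer()` and `run_demo()` are hypothetical names; the lock only stands in for the atomic increment/decrement that C code would apply to a field such as ob_refcnt.

```python
import threading


class ToyRefcount:
    """Toy stand-in for an object header with an atomic reference count.

    Hypothetical model: real CPython would use C atomic operations on
    PyObject.ob_refcnt, not a Python lock.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.refcnt = 1  # a new object starts with one reference

    def incref(self):
        # The lock makes the read-modify-write a single indivisible step.
        with self._lock:
            self.refcnt += 1

    def decref(self):
        with self._lock:
            self.refcnt -= 1
            return self.refcnt == 0  # True means "deallocate now"


def hammer(obj, n):
    """Take and release n references, as an interpreter thread might."""
    for _ in range(n):
        obj.incref()
        obj.decref()


def run_demo(workers=4, n=10_000):
    """Share one object between several threads and return its final count."""
    obj = ToyRefcount()
    threads = [
        threading.Thread(target=hammer, args=(obj, n)) for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.refcnt
```

Because every increment and decrement is serialized, the count returns to its starting value once all threads finish. The price is extra work on every reference count operation, which is the kind of overhead discussed later in this message.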
C extension modules should be modified as well:
* Convert to PEP 489 multi-phase initialization
* Replace globals ("static" variables) with a module state, or design a new "per-interpreter" local storage similar to Thread Local Storage (TLS). There is already PyInterpreterState.dict which is convenient to use in "Python" code, but it's not convenient to use in "C" code (code currently using "static int ..." for example).
I'm not sure how to handle C extensions which are binding for a C library which has a state and so should not be used multiple times in parallel. Some C extensions use a "global lock" for that. The question is how to get
Most of these tasks are already tracked in Eric Snow's "Multi Core Python" project:
https://github.com/ericsnowcurrently/multi-core-python/issues
This issue is related to PEP 554 "Multiple Interpreters in the Stdlib", but not required by this PEP.
This issue is a tracker for sub-issues related to the goal "have one GIL per interpreter".
--
Some changes have a negative impact on "single threaded" Python applications. Even if the overhead is low, one option to be able to move faster on this issue may be to add a new temporary configure option to have an opt-in build mode to better isolate subinterpreters. Examples:
* disable pymalloc
* atomic reference counters
* disable free lists
That would be a temporary solution to "unblock" the development on this list. For the long term, free lists should be made per-interpreter, pymalloc should support multiple interpreters, no Python object must be shared by two interpreters, etc.
--
One idea to detect if a Python object is shared by two interpreters *in debug mode* would be to store a reference to the interpreter which created it, and then check if the current interpreter is the same. If not, fail with a Python Fatal Error.
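A minimal sketch of that debug-mode check, written as a toy Python model. All names here (`FatalError`, `ToyInterpreter`, `ToyObject`, `check_same_interpreter()`) are hypothetical; a real check would live in C and would abort the process rather than raise an exception.

```python
class FatalError(Exception):
    """Stand-in for a Python Fatal Error (which would abort the process)."""


class ToyInterpreter:
    """Hypothetical model of an interpreter; it only carries an id."""

    def __init__(self, interp_id):
        self.id = interp_id


class ToyObject:
    """Toy object remembering which interpreter created it (debug mode)."""

    def __init__(self, owner):
        self.owner = owner


def check_same_interpreter(obj, current):
    """Fail if `obj` is used from an interpreter other than its creator."""
    if obj.owner is not current:
        raise FatalError(
            f"object created by interpreter {obj.owner.id} "
            f"used from interpreter {current.id}"
        )


main = ToyInterpreter(0)
sub = ToyInterpreter(1)
obj = ToyObject(owner=main)

check_same_interpreter(obj, main)      # same interpreter: passes silently
try:
    check_same_interpreter(obj, sub)   # shared across interpreters: detected
except FatalError as exc:
    print("detected:", exc)
```

Storing the owner costs one extra pointer per object, which is why the idea is limited to debug builds.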
--
During Python 3.9 development cycle, many states moved from the global _PyRuntimeState to per-interpreter PyInterpreterState:
* GC state (bpo-36854)
* warnings state (bpo-36737)
* small integer singletons (bpo-38858)
* parser state (bpo-36876)
* ceval pending calls and "eval breaker" (bpo-39984)
* etc.
Many corner cases related to daemon threads have also been fixed:
* https://vstinner.github.io/daemon-threads-python-finalization-python32.html
* https://vstinner.github.io/threading-shutdown-race-condition.html
* https://vstinner.github.io/gil-bugfixes-daemon-threads-python39.html
And more code is now shared for the initialization and finalization of the main interpreter and subinterpreters (ex: see bpo-38858).
Subinterpreters builtins and sys are now really isolated from the main interpreter (bpo-38858).
--
Obviously, there are likely tons of other issues which are not known at this stage. Again, this issue is a placeholder to track them all. It may be more efficient to create one sub-issue per sub-task, rather than discussing all tasks at the same place.
|
msg368138 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 12:57 |
> Move signals pending and gil_drop_request from _PyRuntimeState.ceval to PyInterpreterState.ceval: https://github.com/ericsnowcurrently/multi-core-python/issues/34
I created bpo-40513: "Move _PyRuntimeState.ceval to PyInterpreterState".
|
msg368142 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 13:14 |
> Some changes have a negative impact on "single threaded" Python application. Even if the overhead is low, one option to be able to move faster on this issue may be to add a new temporary configure option to have an opt-in build mode to better isolate subinterpreters. (...)
I created bpo-40514: "Add --experimental-isolated-subinterpreters build option".
|
msg368184 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 17:39 |
I created bpo-40522: "Subinterpreters: get the current Python interpreter state from Thread Local Storage (autoTSSkey)".
|
msg368195 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 18:32 |
Attached demo.py: benchmark to compare performance of sequential execution, threads and subinterpreters.
|
msg368203 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 20:13 |
(oops, there was a typo in my script: threads and subinterpreters was the same benchmark)
|
msg368206 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 21:10 |
Hum, demo.py is not reliable for threads: the standard deviation is quite large. I rewrote it using pyperf to compute the average and the standard deviation.
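For readers without the attachment, here is a rough stdlib-only sketch of this kind of measurement. It is not demo.py or demo-pyperf.py (their contents are not shown in this thread), it does not use pyperf, and it covers only sequential execution and threads. The `work()` workload and the `bench()` helper are assumptions made for illustration.

```python
import statistics
import threading
import time


def work(n=20_000):
    """CPU-bound placeholder workload (hypothetical; not from demo.py)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(jobs):
    """Run the workload `jobs` times, one after the other."""
    for _ in range(jobs):
        work()


def run_threads(jobs):
    """Run the workload in `jobs` threads and wait for all of them."""
    threads = [threading.Thread(target=work) for _ in range(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def bench(func, jobs=4, repeat=5):
    """Return (mean, stdev) of wall-clock seconds over `repeat` runs."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(jobs)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)


if __name__ == "__main__":
    for name, func in [("sequential", run_sequential), ("threads", run_threads)]:
        mean, stdev = bench(func)
        print(f"{name}: {mean * 1e3:.1f} ms +- {stdev * 1e3:.1f} ms")
```

Reporting the standard deviation next to the mean makes it visible when a result, such as the thread timing, is too noisy to compare.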
|
msg368210 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-05 21:53 |
I updated demo-pyperf.py to also benchmark multiprocessing.
|
msg368272 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-06 15:59 |
I created bpo-40533: "Subinterpreters: don't share Python objects between interpreters".
|
msg368310 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-06 23:07 |
See also bpo-39465: "Design a subinterpreter friendly alternative to _Py_IDENTIFIER". Currently, this C API is not compatible with subinterpreters.
|
msg368670 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-11 22:25 |
"Static" types are shared by all interpreters. We should convert them to heap allocated types using PyType_FromSpec(), see:
* bpo-40077: Convert static types to PyType_FromSpec()
* bpo-40601: [C API] Hide static types from the limited C API
|
msg368839 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-14 14:11 |
> Add a lock to pymalloc, or disable pymalloc when subinterpreters are used: (...)
By the way, tracemalloc is not compatible with subinterpreters.
test.support.run_in_subinterp() skips the test if tracemalloc is tracing.
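A guard of this kind can be sketched as follows. `require_no_tracemalloc()` and `SubinterpTests` are hypothetical names used for illustration; only `tracemalloc.is_tracing()` and `unittest.SkipTest` are real stdlib APIs here.

```python
import tracemalloc
import unittest


def require_no_tracemalloc():
    """Skip the calling test when tracemalloc is tracing.

    Hypothetical helper name; it mirrors the kind of guard described above.
    """
    if tracemalloc.is_tracing():
        raise unittest.SkipTest(
            "subinterpreter test skipped: tracemalloc is tracing"
        )


class SubinterpTests(unittest.TestCase):
    def test_something(self):
        require_no_tracemalloc()
        # ... the code that would run in a subinterpreter goes here ...
```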
|
msg368908 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-15 01:32 |
I marked bpo-36877 "[subinterpreters][meta] Move fields from _PyRuntimeState to PyInterpreterState" as a duplicate of this issue.
|
msg368914 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-05-15 01:55 |
I created a new "Subinterpreters" component in the bug tracker. It may help to better track all issues related to subinterpreters.
|
msg370608 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-06-02 12:46 |
Currently, the import lock is shared by all interpreters. It would also help for performance to make it per-interpreter to parallelize imports.
|
msg372253 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-06-24 13:55 |
Update of the EXPERIMENTAL_ISOLATED_SUBINTERPRETERS status.
I made many free lists and singletons per interpreter in bpo-40521.
TODO:
* _PyUnicode_FromId() and interned strings are still shared: typeobject.c requires a workaround for that.
* GC is disabled in subinterpreters since some objects are still shared
* Type method cache is shared.
* pymalloc is shared.
* The GIL is shared.
I'm investigating performance of my _PyUnicode_FromId() PR: https://github.com/python/cpython/pull/20058
This PR now uses "atomic functions" proposed in a second PR: https://github.com/python/cpython/pull/20766
The "atomic functions" avoid the need to declare a variable or a structure member as atomic, which would cause various issues if it were declared in Python public headers (which is the case for _Py_Identifier used by _PyUnicode_FromId()).
|
msg372773 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-07-01 17:30 |
> Update of the EXPERIMENTAL_ISOLATED_SUBINTERPRETERS status.
Also:
* _PyLong_Zero and _PyLong_One singletons are shared
* Py_None, Py_True and Py_False singletons are shared: bpo-39511 and PR 18301
* Static types like PyUnicode_Type and PyLong_Type are shared: see bpo-40077 and bpo-40601
* The dictionary of Unicode interned strings is shared: PR 20085
* context.c: _token_missing singleton is shared
* "struct _PyArg_Parser" generated by Argument Clinic is shared: see _PyArg_Fini()
Misc notes:
* init_interp_main(): if sys.warnoptions is not empty, "import warnings" is called to process these options, but not in subinterpreters: only in the main interpreter.
* _PyImport_FixupExtensionObject() contains code specific to the main interpreter. Maybe this function will no longer be needed once builtin extension modules are converted to the PEP 489 "multiphase initialization" API. I'm not sure.
|
msg380102 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-10-31 22:25 |
> * _PyLong_Zero and _PyLong_One singletons are shared
Removed by bpo-42161 (commit c310185c081110741fae914c06c7aaf673ad3d0d).
|
msg380108 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-10-31 23:45 |
FYI I'm also using https://pythondev.readthedocs.io/subinterpreters.html to track the progress on isolating subinterpreters.
|
msg380323 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-11-04 13:58 |
See also bpo-15751: "Make the PyGILState API compatible with subinterpreters".
|
msg383780 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-12-25 23:39 |
> Type method cache is shared.
I created bpo-42745: "[subinterpreters] Make the type attribute lookup cache per-interpreter".
|
msg383830 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-12-26 22:10 |
I played with ./configure --with-experimental-isolated-subinterpreters. I tried to run "pip list" in parallel in multiple interpreters.
I hit multiple issues:
* non-atomic reference count of Python objects shared by multiple interpreters, objects shared via static types for example.
* resolve_slotdups() uses a static variable
* pip requires _xxsubinterpreters.create(isolated=False): the vendored distro package runs the lsb_release command with subprocess.
* Race conditions in PyType_Ready() on static types:
  * Objects/typeobject.c:5494: PyType_Ready: Assertion "(type->tp_flags & (1UL << 13)) == 0" failed
  * Race condition in add_subclass()
* parser_init() doesn't support subinterpreters
* unicode_dealloc() fails to delete an interned string in the Unicode interned dictionary => https://bugs.python.org/issue40521#msg383829
To run "pip list", I used:
CODE = """
import runpy
import sys
import traceback
sys.argv = ["pip", "list"]
try:
    runpy.run_module("pip", run_name="__main__", alter_sys=True)
except SystemExit:
    pass
except Exception as exc:
    traceback.print_exc()
    print("BUG", exc)
    raise
"""
|
msg383831 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-12-26 22:23 |
> * resolve_slotdups() uses a static variable
Attached resolve_slotdups.patch works around the issue by removing the cache.
|
msg383875 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-12-27 23:21 |
FYI I wrote an article about this issue: "Isolate Python Subinterpreters"
https://vstinner.github.io/isolate-subinterpreters.html
|
msg388746 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-03-15 14:40 |
> * Add a lock to pymalloc, or disable pymalloc when subinterpreters are used: https://github.com/ericsnowcurrently/multi-core-python/issues/30
See bpo-43313: "feature: support pymalloc for subinterpreters. each subinterpreter has pymalloc_state".
|
msg399847 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-08-18 14:36 |
PyStructSequence_InitType2() is not compatible with subinterpreters: it uses static types. Moreover, it allocates tp_members memory which is not released when the type is destroyed. But I'm not sure that the type is ever destroyed, since this API is designed for static types.
|
msg401117 - (view) |
Author: Hai Shi (shihai1991) * |
Date: 2021-09-06 05:17 |
> PyStructSequence_InitType2() is not compatible with subinterpreters: it uses static types. Moreover, it allocates tp_members memory which is not released when the type is destroyed. But I'm not sure that the type is ever destroyed, since this API is designed for static types.
IMO, I suggest creating a new function, PyStructSequence_FromModuleAndDesc(module, desc, flags), that creates a heap type and doesn't allocate a memory block for tp_members, something like PyType_FromModuleAndSpec().
I don't know whether any blocking issue prevents this conversion, but I can take a look.
@petr ping: Petr, do you have a better idea for this? :)
|
msg401127 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2021-09-06 09:44 |
Hai Shi:
> IMO, I suggest to create a new function, PyStructSequence_FromModuleAndDesc()
Please create a new issue. If possible, I would prefer to have a sub-issue for that, to keep this issue as a tracking issue for all issues related to subinterpreters.
|
msg401130 - (view) |
Author: Hai Shi (shihai1991) * |
Date: 2021-09-06 11:13 |
bpo-45113: [subinterpreters][C API] Add a new function to create PyStructSequence from Heap.
|
msg403727 - (view) |
Author: Petr Viktorin (petr.viktorin) * |
Date: 2021-10-12 12:05 |
PyStructSequence_NewType exists, and is the same as the proposed PyStructSequence_FromModuleAndDesc except it doesn't take the module (which isn't necessary: PyStructSequence_Desc has no way to define functionality that would need the module state).
|