classification
Title: [subinterpreters] Lingering subinterpreters should be implicitly cleared on shutdown
Type: crash Stage: patch review
Components: Subinterpreters Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: LewisGaul, eric.snow, maciej.szulik, mdk, nanjekyejoannah, ncoghlan, ned.deily, petr.viktorin, vstinner
Priority: normal Keywords: patch

Created on 2019-03-07 13:17 by ncoghlan, last changed 2020-10-27 02:51 by vstinner.

Files
File name Uploaded Description Edit
finalise-subinterps-test.diff LewisGaul, 2019-11-21 18:31
Pull Requests
URL Status Linked Edit
PR 17575 open LewisGaul, 2019-12-11 16:55
Messages (17)
msg337392 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-03-07 13:17
https://docs.python.org/3/c-api/init.html#c.Py_EndInterpreter states that "Py_FinalizeEx() will destroy all sub-interpreters that haven’t been explicitly destroyed at that point."

As discussed in https://github.com/hexchat/hexchat/issues/2237, Python 3.7+ doesn't currently do that - it calls Py_FatalError instead.

That change came from https://github.com/python/cpython/pull/1728, which was based on my initial PEP 432 refactoring work, and I didn't realise that implicitly cleaning up lingering subinterpreters was a documented behaviour.

So I think we should just fix it to behave as documented, and add a new regression test to make sure it doesn't get broken again in the future.
msg337743 - (view) Author: Joannah Nanjekye (nanjekyejoannah) * (Python committer) Date: 2019-03-12 14:24
I have been wondering where the regression to test this can be put..in test__xxsubinterpreters.py may be?
msg337852 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2019-03-13 14:47
Joannah, yes, that looks like a good place. Eric Snow might have more info; he wrote that module.

As for testing Py_FatalError, there's an assert_python_failure function in test.support.script_helper.
msg339149 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-29 21:02
Interestingly, I noticed this independently today. :)

Here's what I wrote in #36477 (which I've closed as a duplicate):

When using subinterpreters, any that exist when Py_FinalizeEx() is called do not appear to get cleaned up during runtime finalization.  Maybe I've been looking at the code too much and I'm missing something. :)

This really isn't a problem except for embedders that use subinterpreters (where we're leaking memory).  However, even with the "python" executable it can have an impact because the subinterpreters' non-daemon threads will exit later than expected. (see #36469 & #36476)

The solution would be to finalize all subinterpreters at the beginning of Py_FinalizeEx(), right before the call to wait_for_thread_shutdown().  This means calling Py_EndInterpreter() for all the runtime's interpreters (except the main one).  It would also mean setting a flag (_PyRuntime.interpreters.finalizing?) right before that to disallow creation of any more subinterptreters.
msg339150 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-29 21:07
test__xxsubinterpreters is a great place for tests that exercise use of subinterpreters, including most lifecycle operations.  There are also one or two subinterpreter-related tests in test_embed.  However, for this issue the interplay with runtime finalization means tests should probably stay with other tests that exercise Py_FinalizeEx().
msg339191 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-03-30 12:09
I think test_embed would be the right home for this, as there's an existing test case there for subinterpreter lifecycles and repeated init/finalize cycles: https://github.com/python/cpython/blob/ddbb978e1065dde21d1662386b26ded359f4b16e/Programs/_testembed.c#L43

The test case here would be similar, but it wouldn't need the outer loop - it would just create a handful of subinterpreters, but instead of ending each one before creating the next one the way the existing test does, what it would instead do is:

* setup as per the existing test case
* create a pair of subinterpeters, using a copy of the existing loop, but omitting the `Py_EndInterpreter` call
* switch back to the main interpreter
* create a second pair of subinterpeters
* switch back to the main interpreter
* call Py_Finalize

It also occurs to me that we don't currently have a test case for what happens if you call Py_Finalize from a subinterpreter rather than the main interpreter.
msg345802 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2019-06-17 07:24
Ping.  It was marked as a 3.7regression but perhaps it should now be just targeted for 3.8.
msg349119 - (view) Author: Joannah Nanjekye (nanjekyejoannah) * (Python committer) Date: 2019-08-06 17:25
I am investigating this but in the meantime.

> It also occurs to me that we don't currently have a test case for what happens if you call Py_Finalize from a sub-interpreter rather than the main interpreter.

@ncoghlan Am moving this test request in a new issue. So that this issue only focuses on fixing the lingering sub-interpreters.
msg349124 - (view) Author: Joannah Nanjekye (nanjekyejoannah) * (Python committer) Date: 2019-08-06 17:44
The test request is moved to issue37776.
msg353873 - (view) Author: Joannah Nanjekye (nanjekyejoannah) * (Python committer) Date: 2019-10-03 18:50
I remember julien wanting to check this out during a discussion we had at the sprints hence the loop in.
msg357193 - (view) Author: Lewis Gaul (LewisGaul) * Date: 2019-11-21 18:31
I've put together a test along the lines of what Nick suggested, see the attached patch. Running this hits the Fatal 'remaining subinterpreters' error as expected:

```
   > ./Programs/_testembed test_bpo36225
--- Pass 0 ---
interp 0 <0x1A561A0>, thread state <0x1A56DD0>: id(modules) = 139707107673104
interp 1 <0x1A9D050>, thread state <0x1A87620>: id(modules) = 139707106987472
interp 2 <0x1B02210>, thread state <0x1AEC320>: id(modules) = 139706981531088
interp 0 <0x1A561A0>, thread state <0x1A56DD0>: id(modules) = 139707107673104
interp 3 <0x1B53740>, thread state <0x1AFD980>: id(modules) = 139706980408304
interp 4 <0x1BA3390>, thread state <0x1B3C7A0>: id(modules) = 139706979780944
interp 0 <0x1A561A0>, thread state <0x1A56DD0>: id(modules) = 139707107673104
Fatal Python error: PyInterpreterState_Delete: remaining subinterpreters
Python runtime state: finalizing (tstate=0x1a56dd0)

Aborted
```

I'm happy to look a bit further into the fix for this - Eric's pointers in this thread look useful to get started. @nanjekyejoannah did you get anywhere with this?
msg357235 - (view) Author: Joannah Nanjekye (nanjekyejoannah) * (Python committer) Date: 2019-11-21 23:17
Am abit swamped and sick atm. You can go on and submit a fix.
msg364587 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-03-19 02:00
See also bpo-38865: "Can Py_Finalize() be called if the current interpreter is not the main interpreter?".
msg371338 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2020-06-12 08:30
I note this is marked as a 3.7regression and still open. Since the cutoff for the final 3.7 bugfix mode release is in a few days, I'm assuming this means that 3.7 users will have to live with this regression.  If you feel that is a problem, speak up now.
msg371571 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-15 16:36
I would not qualify the new Python 3.7 behavior (call Py_FatalError()) as a regression, so I remove "3.7regression" keyword.

Also, I don't think that we can easily fix this issue in a stable branch, I would prefer to start working on implementation this issue in the master branch, and only later consider to *maybe* backport the change. So I changed the Python version to "3.10".

I see multiple non trivial problems:

* To finalize the subinterpreter A, Py_Finalize() may switch the current Python thread state from the main interpreter to a Python thread in the subintrepeter A. It can lead to new funny issues.

* A subinterpreter can be stuck for whatever reason and refuse to stop. For example, the subinterpreter A may wait for an even from subinterpreter B. If we don't run two interpreters "in parallel" (in threads), it may be stuck forever.

* Python 3.10 still has weird code to allow daemon threads to continue to run after Py_Finalize() complete. Maybe we need to add such hacks for subinterpreter which fail to be finalized? For example, kill them (exit) as soon as they attempt to acquire the GIL. Search for tstate_must_exit() in Python/ceval_gil.h.


By the way, currently Py_Finalize() calls PyInterpreterState_Clear() which call object finalizers in the main thread of the main interpreter, whereas these finalizers might expect to be called from the thread which created them. I don't know if we must change anything else.

Again, all these problems are very complex :-(

The simple option, which is sadly a backward incompatible change, is to raise a fatal error in Py_Finalize() if a subinterpreter is still running.

Maybe we can start by logging a warning into stderr for now, before introducing the backward incompatible change. Maybe even only emit it when -X dev is used? https://docs.python.org/dev/library/devmode.html
msg379366 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2020-10-22 21:48
> I see multiple non trivial problems:
>
> * To finalize the subinterpreter A, Py_Finalize() may switch the current Python thread state from the main interpreter to a Python thread in the subintrepeter A. It can lead to new funny issues.

I'm not sure what you mean with this.  Are you talking about how another thread might be running that threadstate already?  That doesn't sounds like a problem specific to this issue but rather a problem with Py_EndInterpreter() in general.

> * A subinterpreter can be stuck for whatever reason and refuse to stop. For example, the subinterpreter A may wait for an even from subinterpreter B. If we don't run two interpreters "in parallel" (in threads), it may be stuck forever.

The wait_for_thread_shutdown() call in Py_FinalizeEx() already has a similar behavior.  In the case of one interpreter blocking another like that, how would that be possible where it isn't already a problem?

> * Python 3.10 still has weird code to allow daemon threads to continue to run after Py_Finalize() complete. Maybe we need to add such hacks for subinterpreter which fail to be finalized? For example, kill them (exit) as soon as they attempt to acquire the GIL. Search for tstate_must_exit() in Python/ceval_gil.h.

I consider the problem of daemon threads to be a separate problem.  The solution proposed here for finalization doesn't change any behavior regarding daemon threads.

> By the way, currently Py_Finalize() calls PyInterpreterState_Clear() which call object finalizers in the main thread of the main interpreter, whereas these finalizers might expect to be called from the thread which created them.

Do you have any examples?  I'm having trouble imagining such a case.

> I don't know if we must change anything else.
>
> Again, all these problems are very complex :-(

I agree.  However, automatically finalizing all other interpreters at the beginning of Py_FinalizeEx() doesn't introduce any new problems.  At the same time, it makes sure that any resources in use in other interpreters have a chance to be released and that global resources used there don't cause crashes during runtime finalization.

> The simple option, which is sadly a backward incompatible change, is to raise a fatal error in Py_Finalize() if a subinterpreter is still running.

As long as we require Py_FinalizeEx() to be called from the main interpreter, I don't see how the proposed change (in PR #17575) would be backward incompatible.

> Maybe we can start by logging a warning into stderr for now, before introducing the backward incompatible change. Maybe even only emit it when -X dev is used? https://docs.python.org/dev/library/devmode.html

I'd like to see a ResourceWarning if any other interpreters were still around.
msg379718 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-10-27 02:51
If tomorrow Python allows to run two interpreters in parallel (see bpo-40512: "Meta issue: per-interpreter GIL"), I don't think that it will be safe to execute object finalizers of a subinterpreter from the main interpreter in Py_Finalize().

IMO subinterpreters must be finalized manually by the programmer, since it's too tricky.

The safest option is to display a warning in Py_Finalize() if there are running subinterpreters.
History
Date User Action Args
2020-10-27 02:51:47vstinnersetmessages: + msg379718
2020-10-22 21:48:12eric.snowsetmessages: + msg379366
2020-06-15 16:36:19vstinnersetkeywords: - 3.7regression

messages: + msg371571
versions: + Python 3.10, - Python 3.7, Python 3.8, Python 3.9
2020-06-12 08:30:36ned.deilysetmessages: + msg371338
2020-05-15 00:40:02vstinnersetcomponents: + Subinterpreters
title: Lingering subinterpreters should be implicitly cleared on shutdown -> [subinterpreters] Lingering subinterpreters should be implicitly cleared on shutdown
2020-03-19 02:00:56vstinnersetnosy: + vstinner
messages: + msg364587
2020-02-07 15:08:59maciej.szuliksetnosy: + maciej.szulik
2019-12-11 16:55:39LewisGaulsetstage: needs patch -> patch review
pull_requests: + pull_request17048
2019-11-21 23:17:06nanjekyejoannahsetmessages: + msg357235
2019-11-21 18:31:13LewisGaulsetfiles: + finalise-subinterps-test.diff

nosy: + LewisGaul
messages: + msg357193

keywords: + patch
2019-10-03 18:50:59nanjekyejoannahsetnosy: + mdk
messages: + msg353873
2019-08-06 17:44:25nanjekyejoannahsetmessages: + msg349124
2019-08-06 17:25:04nanjekyejoannahsetmessages: + msg349119
2019-06-17 07:24:21ned.deilysetnosy: + ned.deily

messages: + msg345802
versions: + Python 3.9
2019-03-30 12:09:07ncoghlansetmessages: + msg339191
2019-03-29 21:07:25eric.snowsetmessages: + msg339150
2019-03-29 21:02:51eric.snowsetmessages: + msg339149
2019-03-29 21:00:35eric.snowlinkissue36477 superseder
2019-03-13 14:47:17petr.viktorinsetmessages: + msg337852
2019-03-12 14:24:25nanjekyejoannahsetmessages: + msg337743
2019-03-07 14:17:18nanjekyejoannahsetnosy: + nanjekyejoannah
2019-03-07 13:17:43ncoghlancreate