classification
Title: Runtime finalization assumes all other threads have exited.
Type: behavior Stage: needs patch
Components: Interpreter Core, Subinterpreters Versions: Python 3.9, Python 3.8
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, nanjekyejoannah, ncoghlan, pablogsal, pconnell, pitrou, tim.peters, vstinner
Priority: normal Keywords:

Created on 2019-03-29 20:38 by eric.snow, last changed 2020-06-03 16:42 by vstinner.

Messages (10)
msg339143 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-29 20:38
Among the first 3 things that happen in Py_FinalizeEx() are, in order:

1. wait for all non-daemon threads (of the main interpreter) to finish
2. call registered atexit funcs
3. mark the runtime as finalizing

At that point the only remaining Python threads are:

* the main thread (where finalization is happening)
* daemon threads
* non-daemon threads created in atexit functions
* any threads belonging to subinterpreters

The next time any of those threads (aside from main) acquires the GIL, we expect that it will exit via a call to PyThread_exit_thread() (caveat: issue #36475).  However, we have no guarantee on when that will happen, if ever.  Such lingering threads can cause problems, including crashes and deadlock (see issue #36469).

I don't know what else we can do, beyond what we're already doing.  Any ideas?
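
For illustration only, here is a minimal sketch of how the Python-level part of that list could be inspected from an atexit function. The helper name classify_live_threads() is hypothetical. It only sees threads known to the threading module in the current interpreter, so threads belonging to subinterpreters would not show up.

```python
import threading


def classify_live_threads():
    """Group the threads visible to the threading module by kind.

    Hypothetical helper, not part of CPython. It reports names only.
    """
    main = threading.main_thread()
    groups = {"main": [], "daemon": [], "non_daemon": []}
    for t in threading.enumerate():
        if t is main:
            groups["main"].append(t.name)
        elif t.daemon:
            groups["daemon"].append(t.name)
        else:
            groups["non_daemon"].append(t.name)
    return groups
```

Registering something like `atexit.register(lambda: print(classify_live_threads()))` would show which daemon threads are still alive at step 2 above.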
msg339145 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-29 20:40
FYI, I've opened issue36477 to deal with the subinterpreters case.
msg358745 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-12-20 23:30
Adding to the list:

* any OS threads created by an extension module or embedding application
msg358747 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-12-20 23:40
Lingering threads during/after runtime finalization continue to be a problem.  I'm going to use this issue as the focal point for efforts to resolve this.


Related issues:
* #36479 "Exit threads when interpreter is finalizing rather than runtime."
* #24770 "Py_Finalize() doesn't stop daemon threads"
* #23592 "SIGSEGV on interpreter shutdown, with daemon threads running wild"
* #37127 "Handling pending calls during runtime finalization may cause problems."
* #33608 "Add a cross-interpreter-safe mechanism to indicate that an object may be destroyed."
* #36818 "Add PyInterpreterState.runtime."
* #36724 "Clear _PyRuntime at exit"
* #14073 "allow per-thread atexit()"
* #1596321 "KeyError at exit after 'import threading' in other thread"
* #37266 "Daemon threads must be forbidden in subinterpreters"
* #31517 "MainThread association logic is fragile"
msg358749 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-12-20 23:48
Analysis by @pconnell:

* https://bugs.python.org/issue33608#msg357169
* https://bugs.python.org/issue33608#msg357170
* https://bugs.python.org/issue33608#msg357179

tl;dr daemon threads and external C-API access during/after runtime finalization are causing crashes.
msg358750 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-12-20 23:54
To put it another way:

(from issue33608#msg358748)

> The docs [1] aren't super clear about it, but there are some fundamental
> assumptions we make about runtime finalization:
>
> * no use of the C-API while Py_FinalizeEx() is executing (except for a
> few helpers like Py_Initialized)
> * only a small portion of the C-API is available afterward (at least
> until Py_Initialize() is run)
>
> I guess the real question is what to do about this?
>
> [1] https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx

Adding to that list:

* no other Python threads are running once we start finalizing the runtime (not far into Py_FinalizeEx())
msg358751 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-12-21 00:07
So I see 3 things to address here:

1. Python daemon threads
2. Python threads created in atexit handlers
3. non-Python threads accessing the C-API

Possible solutions (starting point for discussion):

1. stop them at the point we stop waiting for non-daemon threads (at the beginning of finalization)
2. disallow them?  do one more pass of wait-for-threads?
3. cause all (external) attempts to access the C-API to fail once finalization begins

Regarding daemon threads, the docs already say "Daemon threads are abruptly stopped at shutdown." [1]  So let's force them to stop.  Can we do that?  If we *can* simply kill the threads, can we do so without leaking resources?  Regardless, the mechanism we currently use (check for finalizing each(?) time through the eval loop) mostly works fine.  The problem is when C code called from Python in a daemon thread blocks long enough that it makes C-API calls (or even the eval loop) *after* we've started cleaning up the runtime state.  So if there was a way to interrupt that blocking code, that would probably be good enough.

The other two possible solutions are, I suppose, a bit more drastic.  What are the alternatives?


[1] https://docs.python.org/3/library/threading.html#thread-objects
msg358782 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-12-21 20:15
> 1. Python daemon threads

I think the answer is to document a bit more clearly that they can pose all kinds of problems.  Perhaps we could even display a visible warning when people create daemon threads.

> 2. Python threads created in atexit handlers

We could run the "join non-daemon threads" routine a *second time* after atexit handlers have been called.  It probably can't hurt (unless people do silly things?).
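
A rough Python-level sketch of what such a second pass might do, assuming it runs in the main thread after the atexit handlers. The helper name join_non_daemon_threads() is hypothetical, and the real routine lives inside the threading module's shutdown logic rather than in user code.

```python
import threading


def join_non_daemon_threads(timeout=None):
    """Wait for each live non-daemon thread other than the caller and main.

    Hypothetical helper. Makes a single pass, so threads started while
    it is running are not covered. Returns the names of the threads
    that were joined.
    """
    current = threading.current_thread()
    main = threading.main_thread()
    joined = []
    for t in threading.enumerate():
        if t is current or t is main or t.daemon:
            continue
        t.join(timeout)
        joined.append(t.name)
    return joined
```

A single pass keeps the sketch simple. A thread that itself spawns further non-daemon threads would need the pass to be repeated.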

> 3. non-Python threads accessing the C-API

This one I don't know how to handle. By construction, a non-Python thread can do anything it wants, and we cannot add guards against this at the beginning of each C API function. I think that when someone calls the C API, we're clearly in the realm of "consenting adults".
msg359101 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-12-31 05:50
Perhaps we need a threading.throw() API, similar to the one we have for generators and coroutines?

If we had that, then Py_FinalizeEx() could gain a few new features:

* throw SystemExit into all daemon threads and then give them a chance to terminate before calling any atexit handlers (printing a warning if some of the threads don't exit)
* throw SystemExit into all daemon and non-daemon threads after running atexit handlers (printing a warning if any such threads exist at all, along with another warning if some of the threads don't exit)

Adding that would require an enhancement to the PendingCall machinery, though, since most pending calls are only processed in the main thread (there's no way to route them to specific child threads).
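
For reference, CPython already exposes a lower-level building block in the C-API, PyThreadState_SetAsyncExc(), which can be reached from Python through ctypes. The sketch below uses it as a stand-in for the proposed API. The name throw_into_thread() is hypothetical, and this is not the design proposed here, only an approximation of the behavior.

```python
import ctypes
import threading
import time


def throw_into_thread(thread, exc_type=SystemExit):
    """Ask the interpreter to raise exc_type in the given thread.

    Hypothetical stand-in for a threading.throw() API. Returns True if
    exactly one thread state was modified.
    """
    if thread.ident is None:
        raise ValueError("thread has not been started")
    modified = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type))
    return modified == 1


def spin():
    # Example target: returns to the eval loop often, so the pending
    # exception is noticed quickly.
    while True:
        time.sleep(0.01)
```

The exception is only delivered once the target thread is executing Python bytecode again, so a thread blocked in long-running C code is not interrupted by this.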

A simpler alternative would be to have an atomic "terminate_threads" counter in the ceval runtime state that was incremented to 1 to request that SystemExit be raised in daemon threads, and then to 2 to request that SystemExit be raised in all still running threads. When a thread received that request to exit, it would set a new flag in the thread state to indicate it was terminating, and then raise SystemExit. (The thread running Py_FinalizeEx would set that flag in advance so it wasn't affected, and other threads would use it to ensure they only raised SystemExit once). The runtime cost of this would just be another _Py_atomic_load_relaxed call in the eval_breaker branch. (Probably inside `make_pending_calls`, so it gets triggered both by the eval_breaker logic, and by explicit calls to `Py_MakePendingCalls`).
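
To make the counter idea concrete, here is a toy pure-Python model of it. Everything in it (the TerminateRequest class, its method names, the polling worker) is hypothetical. The real change would live in C in the ceval runtime state and use an atomic load rather than a lock.

```python
import threading
import time


class TerminateRequest:
    """Toy model: 0 = keep running, 1 = stop daemon threads, 2 = stop all."""

    def __init__(self):
        self._level = 0
        self._lock = threading.Lock()
        # Stands in for the per-thread "terminating" flag.
        self._local = threading.local()

    def request(self, level):
        # The level only ever goes up.
        with self._lock:
            self._level = max(self._level, level)

    def exempt_current_thread(self):
        # The finalizing thread sets its flag in advance.
        self._local.terminating = True

    def check(self):
        # Called where the eval_breaker branch would check the counter.
        if getattr(self._local, "terminating", False):
            return  # already exiting (or exempt): raise at most once
        with self._lock:
            level = self._level
        needed = 1 if threading.current_thread().daemon else 2
        if level >= needed:
            self._local.terminating = True
            raise SystemExit


def worker(req):
    # Example thread body that polls the request between work items.
    while True:
        req.check()
        time.sleep(0.01)
```

In this model, request(1) stops only daemon workers and request(2) stops the rest. The thread that called exempt_current_thread() is never affected.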
msg359103 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2019-12-31 06:30
Thinking about that idea further, I don't think that change would help much, since the relevant operations should already be checking for thread termination when they attempt to reacquire the GIL.

That means what we're missing is:

1. When daemon threads still exist after the non-daemon threads terminate, deliberately giving them additional time to run (and hence terminate)
2. Explicitly attempting to kick daemon threads out of blocking system calls by sending them signals to provoke EINTR (I have no idea if there's a Windows equivalent for this, but we should be able to use pthread_kill on POSIX systems. However, choosing *which* wakeup signal to send could be fraught with compatibility problems)
History
Date                 User       Action  Args
2020-06-03 16:42:03  vstinner   set     components: + Subinterpreters
2019-12-31 06:30:46  ncoghlan   set     messages: + msg359103
2019-12-31 05:50:56  ncoghlan   set     messages: + msg359101
2019-12-21 20:15:13  pitrou     set     messages: + msg358782
2019-12-21 00:07:29  eric.snow  set     nosy: + tim.peters, ncoghlan, pitrou, vstinner, pablogsal, nanjekyejoannah; messages: + msg358751
2019-12-20 23:54:01  eric.snow  set     messages: + msg358750
2019-12-20 23:48:58  eric.snow  set     messages: + msg358749
2019-12-20 23:44:55  eric.snow  set     nosy: + pconnell
2019-12-20 23:40:58  eric.snow  set     messages: + msg358747
2019-12-20 23:33:07  eric.snow  link    issue24770 superseder
2019-12-20 23:30:22  eric.snow  set     stage: needs patch; versions: + Python 3.9, - Python 3.7
2019-12-20 23:30:08  eric.snow  set     messages: + msg358745
2019-03-29 20:40:12  eric.snow  set     messages: + msg339145
2019-03-29 20:38:26  eric.snow  create