classification
Title: [subinterpreters] Disallow daemon threads in subinterpreters optionally
Type: behavior Stage: patch review
Components: Subinterpreters Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: aeros, eric.snow, grahamd, jcea, pitrou, vstinner
Priority: normal Keywords: patch

Created on 2020-04-08 23:16 by eric.snow, last changed 2020-05-15 01:39 by vstinner.

Pull Requests
URL Status Linked Edit
PR 19456 merged vstinner, 2020-04-10 14:14
Messages (9)
msg366029 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2020-04-08 23:16
In bpo-37266 we strictly disallowed creation of daemon threads in subinterpreters.  However, this is backward-incompatible for existing users of the subinterpreter C-API (such as mod-wsgi).

Rather than reverting that change I suggest that we make it opt-in through the interpreter config.  That would preserve backward-compatibility.  It would also make it so we can disallow daemon threads in subinterpreters created through PEP 554.  We could also deprecate use of daemon threads in *all* subinterpreters, with the goal of dropping support after a while.
msg366079 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-09 16:52
First of all, I know that mod_wsgi uses subinterpreters, but I didn't know that it uses daemon threads. When I hack Python to better isolate subinterpreters, I keep mod_wsgi in mind ;-)


My commit 066e5b1a917ec2134e8997d2cadd815724314252 message says: "Daemon threads were never supported in subinterpreters." That's inaccurate. Only the second sentence is correct: "Previously, the subinterpreter finalization crashed with a Pyton fatal error if a daemon thread was still running."

In Python 3.8, it is possible to use daemon thread but only if they complete before Py_EndInterpreter() is called.

I'm not comfortable with Python "crashing" (call Py_FatalError() which calls abort() which may generate a coredump) depending if all daemon threads complete or not. I'm trying to make Python finalization more deterministic.


My concern is this Py_EndInterpreter() check:

    if (tstate != interp->tstate_head || tstate->next != NULL) {
        Py_FatalError("not the last thread");
    }


This check is as old as subinterpreters, it was added at the same time than the Py_EndInterpreter() function:

commit 25ce566661c1b7446b3ddb4076513a62f93ce08d
Author: Guido van Rossum <guido@python.org>
Date:   Sat Aug 2 03:10:38 1997 +0000

    The last of the mass checkins for separate (sub)interpreters.
    Everything should now work again.
    
    See the comments for the .h files mass checkin (e.g. pystate.h) for
    more detail.


My concern is that Py_FatalError() exits the whole process, not only the thread calling Py_EndInterpreter(). The caller cannot catch Py_FatalError() call and decide how to handle it :-( That doesn't seem like a good API when Python is embedded in an application.


Recently, I had to deal  with (fix) many bugs related to daemon threads in Python finalization:

* https://vstinner.github.io/gil-bugfixes-daemon-threads-python39.html
* https://vstinner.github.io/threading-shutdown-race-condition.html
* https://vstinner.github.io/daemon-threads-python-finalization-python32.html

Simply denying daemon threads in subinterpreters is the safest and simplest option. But it's also a backward incompatible change.

I'm open to allow again daemon threads in subinterpreter in Python 3.9 but emit a DeprecationWarning to announce that this feature is going to be remove "later" (ex: Python 3.11) since it's "broken by design" (in my opinion).


> We could also deprecate use of daemon threads in *all* subinterpreters, with the goal of dropping support after a while.

It sounds like a good idea to make Python finalization more reliable. But daemon threads are still widely used in the Python standard library. For example, by the multiprocessing module.

concurrent.futures was only modified recently to stop using daemon threads, after I denied spawned daemon threads in subinterpreters:

* bpo-39812
* commit b61b818d916942aad1f8f3e33181801c4a1ed14b

So if someone wants to ban daemon threads, we should start by modifying the stdlib to avoid them, then deprecate the feature, and then we can start talking about removing the feature.
msg366080 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-09 16:55
Graham Dumpleton started a discussion about bpo-37266 on Twitter:
https://twitter.com/GrahamDumpleton/status/1246374493267701760
https://twitter.com/GrahamDumpleton/status/1246373855532216320

I dislike discussion threads on Twitter. For an unknown reason, I always fail to see all replies. I only see them in Mentions, not in the thread!?
msg366089 - (view) Author: Graham Dumpleton (grahamd) Date: 2020-04-09 22:38
Just to make few things clear. It isn't mod_wsgi itself that relies on daemon threads, it is going to be users WSGI applications (or the things they need) that do.

As a concrete example of things that would stop working are monitoring systems such as New Relic, DataDog, Elastic APM etc. These all fire off a background thread to handle aggregation of data collected from the application, with that data then being sent off once a minute to the backend servers.

It isn't just these though. Over the years have see many instances of people using background threads to off load small tasks to be done in process rather than using full blown queuing system such as Celery etc. So I don't believe it is a rare use case. Monitoring systems are a big use case though.

These would all usually use a daemon thread so they can be started and effectively forgotten, with no need to do anything to shut them down when the process is exiting.

Some (such as New Relic, which I wrote so know how it works), will register an atexit callback in order to flush data out before a process stops, but it may not actually exit the thread. Even if it does exit the thread, you can't just switch it to use a non daemon thread as that will not work.

The problem here is that atexit callbacks are only called after the (sub)interpreter shutdown code has waited on non daemon threads. Thus there is no current standard way I know of to notify a non daemon thread to shutdown. The result would be that if these were switched to non daemon thread, the process would hang on shutdown at the point of waiting for non daemon threads.

So if you are going to eliminate daemon threads (even if only in sub interpreters at this point), you are going to have to introduce a way to register something similar to an atexit callback which would be invoked before waiting on non daemon threads, so an attempt can be made to notify them that they need to shutdown. Use of this mechanism is going to have to be added to any code out there currently using daemon threads if they are going to be forced to use non daemon threads. This includes stuff in the stdlib such as the multiprocessing thread pools. They can't just switch to non daemon threads, they have to add the capability to register and be notified of (sub)interpreter shutdown so they can exit the thread else process hangs will occur.

Now a few other things about history and usage of mod_wsgi to give context.

Once upon a time mod_wsgi did try and delete sub interpreters and replace them in the life of a process. This as you can probably imagine now was very buggy because of issues in CPython sub interpreter support. As a result mod_wsgi discarded that ability and so a sub interpreter always persisted and was used for the life of the process. That way problems with clean up of sub interpreters wasn't a big issue.

During cleanup of (sub)interpreters on process shutdown, although crashes could sometimes occur (usually quite rare), what usually happened was that a Python exception would occur. The reason for this would be in cleaning up a (sub)interpreter, sys.modules was cleared up with everything appearing to be set to None. You would therefore get a Python exception because some code trying to access a class instance found the instance replaced by None and so it failed. Even this was rare and not a big deal.

Now although a crash or Python exception could in rare cases occur, for mod_wsgi it didn't really matter since we were talking about sub process of the Apache master process, and the master process didn't care. If Apache was stopping anyway, it just stopped normally. If Apache was doing a restart and child process were told to stop because of that, or if a maximum request threshold was reach and so process was being recycled, then Apache was going to replace the process anyway, so everything just carried on normally and a new process started in its place.

In the case where a process lockup managed to occur on process shutdown, for example if non daemon thread were used explicitly, then process shutdown timeouts applied by mod_wsgi on daemon processes would kick in and the process would be force killed anyway. So all up it was quite resilient and kept working. If embedded mode of mod_wsgi was used, it would though lock up the Apache process indefinitely if something used non daemon threads explicitly.

On the issue of non daemon threads, usually these would never arise. This is because usually people don't explicitly say a thread is non daemon. Where nothing is done to say that, a thread actually inherits the mode of the thread it was created in. Since all request handler threads in mod_wsgi are actually externally created threads which call into Python, they get assigned the DummyThread object to track them. These are treated as non daemon threads. As a result any new threads created which don't explicitly say they are non daemon, get marked as daemon threads anyway. The consequence of this is that they never get waited upon on shutdown and everything works.

Anyway, going forward, if use of daemon threads is blocked in sub interpreters to satisfy the new envisioned use case for them, of using them to run sub tasks out of another interpreter, then first thing would be that mod_wsgi would deprecate use of sub interpreters, disabling them by default and requiring people to explicitly enable ability to use them. This would be just an interim measure in a transition period.

In a followup version mod_wsgi would then discard support for sub interpreters, as well as likely also disable embedded mode (except for specific case of Apache access control hooks, would mean Windows support would still be dropped though). Thus people would be forced to use daemon mode of mod_wsgi where separate processes are used to the Apache child processes to run a WSGI application. This is actually was mod_wsgi-express effectively enforces already. If separation were need for separate applications, or if a single application needed to be split, separate groups of daemon processes would be used with requests redirected to the appropriate instance in a daemon process group.

This use of daemon processes, recommendation to not use sub interpreters and use the main interpreter context has existed for many years due to many third party packages not working in sub interpreters anyway due to simplified GIL API restrictions. So this isn't new, it just hasn't been required and enforced (except in mod_wsgi-express). So these steps just force people in the direction which has been recommended for a long time.

As to the question of why disable/discard sub interpreter support in mod_wsgi, that comes down to support burden. This is not support burden in mod_wsgi, but the effort it will take to deal with all those people out there whose applications will stop working if run in sub interpreters were daemon thread usage prevented. I don't have time to be hand holding all these people and educate or help them to fix their applications or tell them how some third party package they use needs to be changed. Will be easier and less impact on me as the only person who supports mod_wsgi to discard sub interpreter support and document how people need to move to use of daemon mode and main interpreter as has been recommended for a long time anyway but which couldn't be made the default purely because of history of how mod_wsgi was developed and features added over time.

Now later on if someone decides to eliminate daemon threads for the main interpreter context, or changes stdlib so everything uses non daemon threads, then at that point I would stop supporting mod_wsgi in those Python versions. I just feel that is going to have a huge impact on user code at that point and create lots of problems so don't even want to go there. The impact of dropping daemon threads from the main interpreter will likely have affects way beyond mod_wsgi as well, so right now I can't see how you could even make that decision.
msg366129 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-10 14:16
Well, this large problem sounds very complex and is made of multiple small issues. Let's start with something simple: revert my commit 066e5b1a917ec2134e8997d2cadd815724314252.

When I wrote it, I expected that nobody was spawning daemon threads in subinterpreters. I was wrong.

Once PR 19456 will be merged (allow again to spawn daemon threads in subinterpreters), we can start to investigate every single issue listed by Graham, like atexit callbacks which are called too late.
msg366270 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-12 21:45
New changeset 14d5331eb5e6c38be12bad421bd59ad0fac9e448 by Victor Stinner in branch 'master':
bpo-40234: Revert "bpo-37266: Daemon threads are now denied in subinterpreters (GH-14049)" (GH-19456)
https://github.com/python/cpython/commit/14d5331eb5e6c38be12bad421bd59ad0fac9e448
msg366271 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2020-04-12 21:49
> So if you are going to eliminate daemon threads (even if only in sub interpreters at this point), you are going to have to introduce a way to register something similar to an atexit callback which would be invoked before waiting on non daemon threads, so an attempt can be made to notify them that they need to shutdown.

I'm not 100% certain if it fully covers the above use cases, but I recently added a fairly minimal internal `threading._register_atexit()` function that works similar to atexit: it schedules the callbacks to be invoked just before all non-daemon threads are joined, rather than upon interpreter finalization. The primary use case was to remove the reliance of daemon threads in concurrent.futures, for ThreadPoolExecutor and ProcessPoolExecutor. Their management threads relied fully upon daemon threads and `atexit.register()`, but the above provided an alternative means to accomplish the same goal; and also has the benefit of a more predictable shutdown process. See https://github.com/python/cpython/pull/19149 for details.

Should we consider making `threading._register_atexit()` public, and if we did, would it provide a reasonable transition for users that were previously relying upon daemon threads if they were to be deprecated in subinterpreters? If it were to be made public, there may also be a need to be able to unregister the threading atexit callbacks, similar to `atexit.unregister()` (that should be rather straightforward though).

(I added Antoine Pitrou to the nosy list, to see if he has an opinion on any of this as the maintainer of the threading module.)
msg366277 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-13 00:22
Victor (me!):
> Well, this large problem sounds very complex and is made of multiple small issues.

Kyle:
> (...) Should we consider making `threading._register_atexit()` public?

It seems like there is an use case for calling callbacks before Python calls threading._shutdown() in Py_Finalize(). We should either change when atexit callbacks are called, or add something new.

I suggest to open a separated issue for that.
msg368912 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-05-15 01:39
Issue title: "[subinterpreters] Disallow daemon threads in subinterpreters optionally"

This issue is basically addressed by bpo-40453: daemon threads continue to be allowed by default when using Py_NewInterpreter(), but it's possible to opt-in for "isolated" subinterpreters where threads are denied.
History
Date User Action Args
2020-05-15 01:39:13vstinnersetmessages: + msg368912
2020-05-15 00:39:46vstinnersetcomponents: + Subinterpreters, - Interpreter Core
title: Disallow daemon threads in subinterpreters optionally. -> [subinterpreters] Disallow daemon threads in subinterpreters optionally
2020-04-29 18:33:15jceasetnosy: + jcea
2020-04-13 00:22:54vstinnersetmessages: + msg366277
2020-04-12 21:49:51aerossetnosy: + pitrou, aeros
messages: + msg366271
2020-04-12 21:45:20vstinnersetmessages: + msg366270
2020-04-10 14:16:31vstinnersetmessages: + msg366129
2020-04-10 14:14:17vstinnersetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request18810
2020-04-09 22:38:37grahamdsetmessages: + msg366089
2020-04-09 16:55:44vstinnersetmessages: + msg366080
2020-04-09 16:52:44vstinnersetmessages: + msg366079
2020-04-08 23:16:52eric.snowcreate