Author grahamd
Recipients eric.snow, grahamd, vstinner
Date 2020-04-09.22:38:36
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1586471917.87.0.183581981457.issue40234@roundup.psfhosted.org>
In-reply-to
Content
Just to make a few things clear. It isn't mod_wsgi itself that relies on daemon threads, it is going to be users' WSGI applications (or the things they need) that do.

Concrete examples of things that would stop working are monitoring systems such as New Relic, DataDog, Elastic APM etc. These all fire off a background thread to handle aggregation of data collected from the application, with that data then being sent off once a minute to the backend servers.

It isn't just these though. Over the years I have seen many instances of people using background threads to offload small tasks to be done in process rather than using a full blown queuing system such as Celery. So I don't believe it is a rare use case. Monitoring systems are a big use case though.

These would all usually use a daemon thread so they can be started and effectively forgotten, with no need to do anything to shut them down when the process is exiting.

Some (such as New Relic, which I wrote, so I know how it works) will register an atexit callback in order to flush data out before a process stops, but the callback may not actually make the thread exit. Even if it does make the thread exit, you can't just switch it to use a non daemon thread as that will not work.
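To make the pattern concrete, here is a minimal sketch of a background harvester that runs in a daemon thread and registers an atexit flush. It is not the code of any actual monitoring agent; the names Harvester, record, flush and send are hypothetical and only illustrate the shape of what is described above.

```python
import atexit
import threading
import time


class Harvester:
    """Hypothetical agent: buffers data and reports it periodically."""

    def __init__(self, interval=60.0, send=print):
        self._interval = interval
        self._send = send              # stand-in for the upload to a backend
        self._lock = threading.Lock()
        self._pending = []
        # daemon=True: interpreter shutdown will not wait for this thread.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        # Flush whatever is still buffered when the process exits.
        atexit.register(self.flush)

    def record(self, value):
        with self._lock:
            self._pending.append(value)

    def flush(self):
        with self._lock:
            batch, self._pending = self._pending, []
        if batch:
            self._send(batch)

    def _run(self):
        # Loops forever; nothing ever asks this thread to exit.
        while True:
            time.sleep(self._interval)
            self.flush()
```

Note that the atexit callback only flushes data. The loop in _run never terminates, which is harmless for a daemon thread but would block shutdown if the thread were non daemon.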

The problem here is that atexit callbacks are only called after the (sub)interpreter shutdown code has waited on non daemon threads. Thus there is no current standard way I know of to notify a non daemon thread to shut down. The result would be that if these were switched to non daemon threads, the process would hang on shutdown at the point of waiting for non daemon threads.
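The ordering can be observed with a small experiment. The sketch below runs a child Python process in which a non daemon thread finishes on its own after a short sleep, so nothing hangs; the helper name run_child is hypothetical.

```python
import subprocess
import sys
import textwrap

# Script executed in a child interpreter so that its shutdown can be observed.
CHILD = textwrap.dedent("""
    import atexit, threading, time

    def worker():
        time.sleep(0.2)
        print("worker finished", flush=True)

    atexit.register(lambda: print("atexit callback", flush=True))
    threading.Thread(target=worker).start()   # non daemon by default
    print("main thread done", flush=True)
""")


def run_child():
    """Run the child script and return its output lines in order."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in run_child():
        print(line)
```

The atexit line is printed last, after the worker has finished. If the worker instead looped until a flag set by the atexit callback, the callback would never run and the process would not exit.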

So if you are going to eliminate daemon threads (even if only in sub interpreters at this point), you are going to have to introduce a way to register something similar to an atexit callback which would be invoked before waiting on non daemon threads, so an attempt can be made to notify them that they need to shut down. Use of this mechanism is going to have to be added to any code out there currently using daemon threads if they are going to be forced to use non daemon threads. This includes stuff in the stdlib such as the multiprocessing thread pools. They can't just switch to non daemon threads, they have to add the capability to register and be notified of (sub)interpreter shutdown so they can exit the thread, otherwise process hangs will occur.

Now a few other things about history and usage of mod_wsgi to give context.

Once upon a time mod_wsgi did try to delete sub interpreters and replace them during the life of a process. This, as you can probably imagine now, was very buggy because of issues in CPython sub interpreter support. As a result mod_wsgi discarded that ability, and so a sub interpreter always persisted and was used for the life of the process. That way problems with clean up of sub interpreters weren't a big issue.

During cleanup of (sub)interpreters on process shutdown, although crashes could sometimes occur (usually quite rare), what usually happened was that a Python exception would occur. The reason for this would be in cleaning up a (sub)interpreter, sys.modules was cleared up with everything appearing to be set to None. You would therefore get a Python exception because some code trying to access a class instance found the instance replaced by None and so it failed. Even this was rare and not a big deal.

Now although a crash or Python exception could in rare cases occur, for mod_wsgi it didn't really matter since we were talking about a sub process of the Apache master process, and the master process didn't care. If Apache was stopping anyway, it just stopped normally. If Apache was doing a restart and child processes were told to stop because of that, or if a maximum request threshold was reached and so the process was being recycled, then Apache was going to replace the process anyway, so everything just carried on normally and a new process started in its place.

In the case where a process lockup managed to occur on process shutdown, for example if non daemon threads were used explicitly, then process shutdown timeouts applied by mod_wsgi on daemon processes would kick in and the process would be force killed anyway. So all up it was quite resilient and kept working. If embedded mode of mod_wsgi was used, though, it would lock up the Apache process indefinitely if something used non daemon threads explicitly.

On the issue of non daemon threads, usually these would never arise. This is because usually people don't explicitly say a thread is non daemon. Where nothing is done to say that, a thread actually inherits the mode of the thread it was created in. Since all request handler threads in mod_wsgi are actually externally created threads which call into Python, they get assigned a DummyThread object to track them. These are treated as daemon threads. As a result, any new threads created from them which don't explicitly say they are non daemon get marked as daemon threads anyway. The consequence of this is that they never get waited upon on shutdown and everything works.
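The inheritance can be checked from pure Python. In the sketch below, _thread.start_new_thread is used only as a stand-in for a thread created outside the threading module (in mod_wsgi such threads are created by Apache, which is not reproduced here); the function name observe_from_foreign_thread is hypothetical. In the main interpreter of current CPython the dummy object reports daemon as True, and a thread created from it without a daemon argument inherits that.

```python
import _thread
import threading


def observe_from_foreign_thread():
    """Inspect what the threading module sees inside a foreign thread."""
    seen = {}
    done = threading.Event()

    def foreign():
        try:
            # First use of threading here creates the dummy tracking object.
            current = threading.current_thread()
            seen["current_type"] = type(current).__name__
            seen["current_daemon"] = current.daemon
            # No daemon argument: the flag is inherited from the creator.
            child = threading.Thread(target=lambda: None)
            seen["child_daemon"] = child.daemon
        finally:
            done.set()

    _thread.start_new_thread(foreign, ())
    done.wait(timeout=5)
    return seen


if __name__ == "__main__":
    print(observe_from_foreign_thread())
```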

Anyway, going forward, if use of daemon threads is blocked in sub interpreters to satisfy the new envisioned use case for them, of using them to run sub tasks out of another interpreter, then the first thing would be that mod_wsgi would deprecate use of sub interpreters, disabling them by default and requiring people to explicitly enable the ability to use them. This would be just an interim measure in a transition period.

In a followup version mod_wsgi would then discard support for sub interpreters, and likely also disable embedded mode (except for the specific case of Apache access control hooks, although Windows support would still be dropped). Thus people would be forced to use daemon mode of mod_wsgi, where processes separate from the Apache child processes are used to run a WSGI application. This is actually what mod_wsgi-express effectively enforces already. If separation were needed for separate applications, or if a single application needed to be split, separate groups of daemon processes would be used, with requests redirected to the appropriate instance in a daemon process group.

This recommendation to use daemon processes, not use sub interpreters, and use the main interpreter context has existed for many years, because many third party packages don't work in sub interpreters anyway due to simplified GIL API restrictions. So this isn't new, it just hasn't been required and enforced (except in mod_wsgi-express). So these steps just force people in the direction which has been recommended for a long time.

As to the question of why disable/discard sub interpreter support in mod_wsgi, that comes down to support burden. This is not support burden in mod_wsgi itself, but the effort it will take to deal with all those people out there whose applications will stop working if run in sub interpreters where daemon thread usage is prevented. I don't have time to be hand holding all these people, educating or helping them to fix their applications, or telling them how some third party package they use needs to be changed. It will be easier, and have less impact on me as the only person who supports mod_wsgi, to discard sub interpreter support and document how people need to move to use of daemon mode and the main interpreter, as has been recommended for a long time anyway but which couldn't be made the default purely because of the history of how mod_wsgi was developed and features added over time.

Now later on if someone decides to eliminate daemon threads for the main interpreter context, or changes the stdlib so everything uses non daemon threads, then at that point I would stop supporting mod_wsgi in those Python versions. I just feel that is going to have a huge impact on user code at that point and create lots of problems, so I don't even want to go there. The impact of dropping daemon threads from the main interpreter will likely have effects way beyond mod_wsgi as well, so right now I can't see how you could even make that decision.
History
Date User Action Args
2020-04-09 22:38:37 grahamd set recipients: + grahamd, vstinner, eric.snow
2020-04-09 22:38:37 grahamd set messageid: <1586471917.87.0.183581981457.issue40234@roundup.psfhosted.org>
2020-04-09 22:38:37 grahamd link issue40234 messages
2020-04-09 22:38:36 grahamd create