Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[subinterpreters] Disallow daemon threads in subinterpreters optionally #84415

Open
ericsnowcurrently opened this issue Apr 8, 2020 · 9 comments
Labels
3.9 only security fixes topic-subinterpreters type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

BPO 40234
Nosy @jcea, @pitrou, @vstinner, @ericsnowcurrently, @aeros
PRs
  • bpo-40234: Revert "bpo-37266: Daemon threads are now denied in subinterpreters (GH-14049)" #19456
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2020-04-08.23:16:52.849>
    labels = ['expert-subinterpreters', 'type-bug', '3.9']
    title = '[subinterpreters] Disallow daemon threads in subinterpreters optionally'
    updated_at = <Date 2020-05-15.01:39:13.472>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2020-05-15.01:39:13.472>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Subinterpreters']
    creation = <Date 2020-04-08.23:16:52.849>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40234
    keywords = ['patch']
    message_count = 9.0
    messages = ['366029', '366079', '366080', '366089', '366129', '366270', '366271', '366277', '368912']
    nosy_count = 6.0
    nosy_names = ['jcea', 'pitrou', 'vstinner', 'grahamd', 'eric.snow', 'aeros']
    pr_nums = ['19456']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue40234'
    versions = ['Python 3.9']

    @ericsnowcurrently
    Copy link
    Member Author

    In bpo-37266 we strictly disallowed creation of daemon threads in subinterpreters. However, this is backward-incompatible for existing users of the subinterpreter C-API (such as mod-wsgi).

    Rather than reverting that change I suggest that we make it opt-in through the interpreter config. That would preserve backward-compatibility. It would also make it so we can disallow daemon threads in subinterpreters created through PEP-554. We could also deprecate use of daemon threads in *all* subinterpreters, with the goal of dropping support after a while.

    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.9 only security fixes type-bug An unexpected behavior, bug, or error labels Apr 8, 2020
    @vstinner
    Copy link
    Member

    vstinner commented Apr 9, 2020

    First of all, I know that mod_wsgi uses subinterpreters, but I didn't know that it uses daemon threads. When I hack Python to better isolate subinterpreters, I keep mod_wsgi in mind ;-)

    My commit 066e5b1 message says: "Daemon threads were never supported in subinterpreters." That's inaccurate. Only the second sentence is correct: "Previously, the subinterpreter finalization crashed with a Pyton fatal error if a daemon thread was still running."

    In Python 3.8, it is possible to use daemon thread but only if they complete before Py_EndInterpreter() is called.

    I'm not comfortable with Python "crashing" (call Py_FatalError() which calls abort() which may generate a coredump) depending if all daemon threads complete or not. I'm trying to make Python finalization more deterministic.

    My concern is this Py_EndInterpreter() check:

        if (tstate != interp->tstate_head || tstate->next != NULL) {
            Py_FatalError("not the last thread");
        }

    This check is as old as subinterpreters, it was added at the same time than the Py_EndInterpreter() function:

    commit 25ce566
    Author: Guido van Rossum <guido@python.org>
    Date: Sat Aug 2 03:10:38 1997 +0000

    The last of the mass checkins for separate (sub)interpreters.
    Everything should now work again.
    
    See the comments for the .h files mass checkin (e.g. pystate.h) for
    more detail.
    

    My concern is that Py_FatalError() exits the whole process, not only the thread calling Py_EndInterpreter(). The caller cannot catch Py_FatalError() call and decide how to handle it :-( That doesn't seem like a good API when Python is embedded in an application.

    Recently, I had to deal with (fix) many bugs related to daemon threads in Python finalization:

    Simply denying daemon threads in subinterpreters is the safest and simplest option. But it's also a backward incompatible change.

    I'm open to allow again daemon threads in subinterpreter in Python 3.9 but emit a DeprecationWarning to announce that this feature is going to be remove "later" (ex: Python 3.11) since it's "broken by design" (in my opinion).

    We could also deprecate use of daemon threads in *all* subinterpreters, with the goal of dropping support after a while.

    It sounds like a good idea to make Python finalization more reliable. But daemon threads are still widely used in the Python standard library. For example, by the multiprocessing module.

    concurrent.futures was only modified recently to stop using daemon threads, after I denied spawned daemon threads in subinterpreters:

    So if someone wants to ban daemon threads, we should start by modifying the stdlib to avoid them, then deprecate the feature, and then we can start talking about removing the feature.

    @vstinner
    Copy link
    Member

    vstinner commented Apr 9, 2020

    Graham Dumpleton started a discussion about bpo-37266 on Twitter:
    https://twitter.com/GrahamDumpleton/status/1246374493267701760
    https://twitter.com/GrahamDumpleton/status/1246373855532216320

    I dislike discussion threads on Twitter. For an unknown reason, I always fail to see all replies. I only see them in Mentions, not in the thread!?

    @grahamd
    Copy link
    Mannequin

    grahamd mannequin commented Apr 9, 2020

    Just to make few things clear. It isn't mod_wsgi itself that relies on daemon threads, it is going to be users WSGI applications (or the things they need) that do.

    As a concrete example of things that would stop working are monitoring systems such as New Relic, DataDog, Elastic APM etc. These all fire off a background thread to handle aggregation of data collected from the application, with that data then being sent off once a minute to the backend servers.

    It isn't just these though. Over the years have see many instances of people using background threads to off load small tasks to be done in process rather than using full blown queuing system such as Celery etc. So I don't believe it is a rare use case. Monitoring systems are a big use case though.

    These would all usually use a daemon thread so they can be started and effectively forgotten, with no need to do anything to shut them down when the process is exiting.

    Some (such as New Relic, which I wrote so know how it works), will register an atexit callback in order to flush data out before a process stops, but it may not actually exit the thread. Even if it does exit the thread, you can't just switch it to use a non daemon thread as that will not work.

    The problem here is that atexit callbacks are only called after the (sub)interpreter shutdown code has waited on non daemon threads. Thus there is no current standard way I know of to notify a non daemon thread to shutdown. The result would be that if these were switched to non daemon thread, the process would hang on shutdown at the point of waiting for non daemon threads.

    So if you are going to eliminate daemon threads (even if only in sub interpreters at this point), you are going to have to introduce a way to register something similar to an atexit callback which would be invoked before waiting on non daemon threads, so an attempt can be made to notify them that they need to shutdown. Use of this mechanism is going to have to be added to any code out there currently using daemon threads if they are going to be forced to use non daemon threads. This includes stuff in the stdlib such as the multiprocessing thread pools. They can't just switch to non daemon threads, they have to add the capability to register and be notified of (sub)interpreter shutdown so they can exit the thread else process hangs will occur.

    Now a few other things about history and usage of mod_wsgi to give context.

    Once upon a time mod_wsgi did try and delete sub interpreters and replace them in the life of a process. This as you can probably imagine now was very buggy because of issues in CPython sub interpreter support. As a result mod_wsgi discarded that ability and so a sub interpreter always persisted and was used for the life of the process. That way problems with clean up of sub interpreters wasn't a big issue.

    During cleanup of (sub)interpreters on process shutdown, although crashes could sometimes occur (usually quite rare), what usually happened was that a Python exception would occur. The reason for this would be in cleaning up a (sub)interpreter, sys.modules was cleared up with everything appearing to be set to None. You would therefore get a Python exception because some code trying to access a class instance found the instance replaced by None and so it failed. Even this was rare and not a big deal.

    Now although a crash or Python exception could in rare cases occur, for mod_wsgi it didn't really matter since we were talking about sub process of the Apache master process, and the master process didn't care. If Apache was stopping anyway, it just stopped normally. If Apache was doing a restart and child process were told to stop because of that, or if a maximum request threshold was reach and so process was being recycled, then Apache was going to replace the process anyway, so everything just carried on normally and a new process started in its place.

    In the case where a process lockup managed to occur on process shutdown, for example if non daemon thread were used explicitly, then process shutdown timeouts applied by mod_wsgi on daemon processes would kick in and the process would be force killed anyway. So all up it was quite resilient and kept working. If embedded mode of mod_wsgi was used, it would though lock up the Apache process indefinitely if something used non daemon threads explicitly.

    On the issue of non daemon threads, usually these would never arise. This is because usually people don't explicitly say a thread is non daemon. Where nothing is done to say that, a thread actually inherits the mode of the thread it was created in. Since all request handler threads in mod_wsgi are actually externally created threads which call into Python, they get assigned the DummyThread object to track them. These are treated as non daemon threads. As a result any new threads created which don't explicitly say they are non daemon, get marked as daemon threads anyway. The consequence of this is that they never get waited upon on shutdown and everything works.

    Anyway, going forward, if use of daemon threads is blocked in sub interpreters to satisfy the new envisioned use case for them, of using them to run sub tasks out of another interpreter, then first thing would be that mod_wsgi would deprecate use of sub interpreters, disabling them by default and requiring people to explicitly enable ability to use them. This would be just an interim measure in a transition period.

    In a followup version mod_wsgi would then discard support for sub interpreters, as well as likely also disable embedded mode (except for specific case of Apache access control hooks, would mean Windows support would still be dropped though). Thus people would be forced to use daemon mode of mod_wsgi where separate processes are used to the Apache child processes to run a WSGI application. This is actually was mod_wsgi-express effectively enforces already. If separation were need for separate applications, or if a single application needed to be split, separate groups of daemon processes would be used with requests redirected to the appropriate instance in a daemon process group.

    This use of daemon processes, recommendation to not use sub interpreters and use the main interpreter context has existed for many years due to many third party packages not working in sub interpreters anyway due to simplified GIL API restrictions. So this isn't new, it just hasn't been required and enforced (except in mod_wsgi-express). So these steps just force people in the direction which has been recommended for a long time.

    As to the question of why disable/discard sub interpreter support in mod_wsgi, that comes down to support burden. This is not support burden in mod_wsgi, but the effort it will take to deal with all those people out there whose applications will stop working if run in sub interpreters were daemon thread usage prevented. I don't have time to be hand holding all these people and educate or help them to fix their applications or tell them how some third party package they use needs to be changed. Will be easier and less impact on me as the only person who supports mod_wsgi to discard sub interpreter support and document how people need to move to use of daemon mode and main interpreter as has been recommended for a long time anyway but which couldn't be made the default purely because of history of how mod_wsgi was developed and features added over time.

    Now later on if someone decides to eliminate daemon threads for the main interpreter context, or changes stdlib so everything uses non daemon threads, then at that point I would stop supporting mod_wsgi in those Python versions. I just feel that is going to have a huge impact on user code at that point and create lots of problems so don't even want to go there. The impact of dropping daemon threads from the main interpreter will likely have affects way beyond mod_wsgi as well, so right now I can't see how you could even make that decision.

    @vstinner
    Copy link
    Member

    Well, this large problem sounds very complex and is made of multiple small issues. Let's start with something simple: revert my commit 066e5b1.

    When I wrote it, I expected that nobody was spawning daemon threads in subinterpreters. I was wrong.

    Once PR 19456 will be merged (allow again to spawn daemon threads in subinterpreters), we can start to investigate every single issue listed by Graham, like atexit callbacks which are called too late.

    @vstinner
    Copy link
    Member

    New changeset 14d5331 by Victor Stinner in branch 'master':
    bpo-40234: Revert "bpo-37266: Daemon threads are now denied in subinterpreters (GH-14049)" (GH-19456)
    14d5331

    @aeros
    Copy link
    Contributor

    aeros commented Apr 12, 2020

    So if you are going to eliminate daemon threads (even if only in sub interpreters at this point), you are going to have to introduce a way to register something similar to an atexit callback which would be invoked before waiting on non daemon threads, so an attempt can be made to notify them that they need to shutdown.

    I'm not 100% certain if it fully covers the above use cases, but I recently added a fairly minimal internal threading._register_atexit() function that works similar to atexit: it schedules the callbacks to be invoked just before all non-daemon threads are joined, rather than upon interpreter finalization. The primary use case was to remove the reliance of daemon threads in concurrent.futures, for ThreadPoolExecutor and ProcessPoolExecutor. Their management threads relied fully upon daemon threads and atexit.register(), but the above provided an alternative means to accomplish the same goal; and also has the benefit of a more predictable shutdown process. See #19149 for details.

    Should we consider making threading._register_atexit() public, and if we did, would it provide a reasonable transition for users that were previously relying upon daemon threads if they were to be deprecated in subinterpreters? If it were to be made public, there may also be a need to be able to unregister the threading atexit callbacks, similar to atexit.unregister() (that should be rather straightforward though).

    (I added Antoine Pitrou to the nosy list, to see if he has an opinion on any of this as the maintainer of the threading module.)

    @vstinner
    Copy link
    Member

    Victor (me!):

    Well, this large problem sounds very complex and is made of multiple small issues.

    Kyle:

    (...) Should we consider making threading._register_atexit() public?

    It seems like there is an use case for calling callbacks before Python calls threading._shutdown() in Py_Finalize(). We should either change when atexit callbacks are called, or add something new.

    I suggest to open a separated issue for that.

    @vstinner vstinner added topic-subinterpreters and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels May 15, 2020
    @vstinner vstinner changed the title Disallow daemon threads in subinterpreters optionally. [subinterpreters] Disallow daemon threads in subinterpreters optionally May 15, 2020
    @vstinner vstinner added topic-subinterpreters and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels May 15, 2020
    @vstinner vstinner changed the title Disallow daemon threads in subinterpreters optionally. [subinterpreters] Disallow daemon threads in subinterpreters optionally May 15, 2020
    @vstinner
    Copy link
    Member

    Issue title: "[subinterpreters] Disallow daemon threads in subinterpreters optionally"

    This issue is basically addressed by bpo-40453: daemon threads continue to be allowed by default when using Py_NewInterpreter(), but it's possible to opt-in for "isolated" subinterpreters where threads are denied.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes topic-subinterpreters type-bug An unexpected behavior, bug, or error
    Projects
    Status: Todo
    Development

    No branches or pull requests

    3 participants