Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear state of threads earlier in Python shutdown #63665

Closed
vstinner opened this issue Oct 31, 2013 · 26 comments
Closed

Clear state of threads earlier in Python shutdown #63665

vstinner opened this issue Oct 31, 2013 · 26 comments
Labels
3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@vstinner
Copy link
Member

BPO 19466
Nosy @pitrou, @vstinner, @serhiy-storchaka, @vajrasky
PRs
  • bpo-19466: Py_Finalize() clears daemon threads earlier #18848
  • bpo-39877: take_gil() now exits the thread if finalizing #18854
  • Files
  • finalize_threads.patch
  • finalize_threads-2.patch
  • finalize_threads-3.patch
  • fix_typo_19466.patch
  • asyncio_gc.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-03-09.22:44:45.664>
    created_at = <Date 2013-10-31.23:39:46.372>
    labels = ['interpreter-core', '3.9']
    title = 'Clear state of threads earlier in Python shutdown'
    updated_at = <Date 2020-03-09.22:44:45.658>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-03-09.22:44:45.658>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-03-09.22:44:45.664>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2013-10-31.23:39:46.372>
    creator = 'vstinner'
    dependencies = []
    files = ['32443', '32500', '32581', '32586', '48961']
    hgrepos = []
    issue_num = 19466
    keywords = ['patch']
    message_count = 26.0
    messages = ['201860', '202155', '202156', '202164', '202165', '202166', '202169', '202667', '202699', '202704', '202705', '202912', '206009', '206028', '206032', '210993', '210998', '211011', '211072', '211149', '211154', '213823', '363651', '363654', '363779', '363781']
    nosy_count = 7.0
    nosy_names = ['pitrou', 'vstinner', 'Arfrever', 'neologix', 'python-dev', 'serhiy.storchaka', 'vajrasky']
    pr_nums = ['18848', '18854']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue19466'
    versions = ['Python 3.9']

    @vstinner
    Copy link
    Member Author

    Each Python thread holds references to objects, in its current frame for example. At Python shutdown, clearing threads state happens very late: the import machinery is already dead, types are finalized, etc. If a thread has an object with a destructor, bad things may happen.

    For example, when open files are destroyed, a warning is emitted. The problem is that you cannot emits warnings because the warnings module has been unloaded, and so you miss warnings. See the issue bpo-19442 for this specific case.

    It is possible to clear the Python threads earlier since Python 3.2, because Python threads will now exit cleanly when they try to acquire the GIL: see PyEval_RestoreThread(). The value of the tstate pointer is used in PyEval_RestoreThread(), but the pointer is not dereferenced (the content of a Python state is not read). So it is even possible to free the memory of the threads state (not only to clear the threads state).

    Attached patch implements destroy the all threads except the current thread, and clear the state of the current thread.

    The patch calls also the garbage collector before flushing stdout and stderr, and disable signal handlers just before PyImport_Cleanup(). The main idea is to reorder functions like this:

    • clear state of all threads to release strong references -> may call destructores
    • force a garbage collector to release more references -> may call destructores
    • flush stdout and stderr -> write pending warnings and any other buffered messages
    • disable signal handler -> only at the end so previous steps can still be interrupted by CTRL+c

    The side effect of the patch is that the destructor of destroyed objects are now called, especially for daemon threads. If you try the warn_shutdown.py script attached to bpo-19442, you now get the warning with the patch! As a result, test_threading.test_4_daemon_threads() is also modified by my patch to ignore warnings :-)

    If I understood correctly, the patch only has an impact on daemon threads, the behaviour of classic threads is unchanged.

    @serhiy-storchaka
    Copy link
    Member

    Is it possible to write a test for this behavior?

    @pitrou
    Copy link
    Member

    pitrou commented Nov 4, 2013

    I'm afraid clearing thread states is a bit too brutal. What if some destructor relies on contents of the thread states (e.g. thread locals)?

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2013

    2013/11/4 Antoine Pitrou <report@bugs.python.org>:

    I'm afraid clearing thread states is a bit too brutal. What if some destructor relies on contents of the thread states (e.g. thread locals)?

    When Py_Finalize() is called, only one Python thread hold the GIL.
    After _Py_Finalizing=tstate is set, no other thread can hold the GIL.
    If another Python tries to lock the GIL, it is "killed" by
    PyEval_RestoreThread().

    Is that correct? If yes, in which thread would the destructor be
    called? Would it read Python thread locals without holding the GIL?

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2013

    2013/11/4 Serhiy Storchaka <report@bugs.python.org>:

    Is it possible to write a test for this behavior?

    It is possible to test it manually using warn_shutdown.py attached to
    bpo-19442. The warnings module may be used to test the change.

    @pitrou
    Copy link
    Member

    pitrou commented Nov 4, 2013

    I am referring to the following part of your patch:

    + /* Clear the state of the current thread */
    + PyThreadState_Clear(tstate);

    You are clearing the thread state of the currently executing thread, which doesn't sound right.

    @vstinner
    Copy link
    Member Author

    vstinner commented Nov 4, 2013

    You are clearing the thread state of the currently executing
    thread, which doesn't sound right.

    Oh, I didn't realize that PyThreadState_Clear() clears also Python thread locals.

    Here is a new patch without PyThreadState_Clear(tstate) and with two unit tests:

    • ensure that Python thread locals are not destroyed before destructors are called
    • ensure that object destructors are called before Python thread states are destroyed

    @vstinner
    Copy link
    Member Author

    neologix made a review on rietveld:
    http://bugs.python.org/review/19466/#ps9818

    There was an issue in finalize_threads-2.patch: the test didn't pass in release mode. I fixed it by adding -Wd to the command line option.

    I also replaced threading.Lock with threading.Event in the unit test to synchronize the two threads.

    => finalize_threads-3.patch

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 12, 2013

    New changeset c2a13acd5e2b by Victor Stinner in branch 'default':
    Close bpo-19466: Clear the frames of daemon threads earlier during the Python
    http://hg.python.org/cpython/rev/c2a13acd5e2b

    @python-dev python-dev mannequin closed this as completed Nov 12, 2013
    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Nov 12, 2013

    Typo:

    s/immediatly/immediately.

    I attached the patch.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 12, 2013

    New changeset 10a8e676b87b by Victor Stinner in branch 'default':
    Issue bpo-19466: Fix typo. Patch written by Vajrasky Kok.
    http://hg.python.org/cpython/rev/10a8e676b87b

    @vstinner
    Copy link
    Member Author

    The fix raised a new issue: bpo-19565.

    I also saw a crash on Windows which is probably related:

    http://buildbot.python.org/all/builders/x86 Windows Server 2003 [SB] 3.x/builds/1717/steps/test/logs/stdio

    ======================================================================
    FAIL: test_4_daemon_threads (test.test_threading.ThreadJoinOnShutdown)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\test_threading.py", line 783, in test_4_daemon_threads
        rc, out, err = assert_python_ok('-c', script)
      File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\script_helper.py", line 69, in assert_python_ok
        return _assert_python(True, *args, **env_vars)
      File "E:\Data\buildslave\cpython\3.x.snakebite-win2k3r2sp2-x86\build\lib\test\script_helper.py", line 55, in _assert_python
        "stderr follows:\n%s" % (rc, err.decode('ascii', 'ignore')))
    AssertionError: Process return code is 3221225477, stderr follows:

    @vstinner vstinner reopened this Nov 15, 2013
    @vstinner
    Copy link
    Member Author

    I'm unable to reproduce the test_4_daemon_threads failure, can anyone try on Windows?

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Dec 13, 2013

    Hum...
    Correct me if I'm wrong, but destroying the thread state of daemon threads
    while they're running is really a bad idea in fact: for example, if
    warnings are now emitted for unclosed file objects, this means that the
    file object, and all associated buffers, are destroyed.
    But even though the daemon thread doesn't hold the GIL, he might still be
    using this object.

    For example, if we have:
    daemon thread:
    readinto:
    release_gil
    read(fd, fileobj->buffer, fileobj->size)
    acquire_gil

    and the main thread exits, then fileobj will be deallocated.

    It's actually fairly easy to write a short crasher launching a deamon
    thread reading from e.g. /dev/zero in loop, with the main thread exiting
    right in the middle.

    @vstinner
    Copy link
    Member Author

    "It's actually fairly easy to write a short crasher launching a deamon
    thread reading from e.g. /dev/zero in loop, with the main thread exiting
    right in the middle."

    I'm unable to write such crasher, can you please give an example? Are you able to crash Python on Linux or on Python 3.3?

    Does the threading test makes sense?

    @vstinner
    Copy link
    Member Author

    I didn't see test_threading failing anymore recently, so I'm closing the issue. Thanks.

    @neologix
    Copy link
    Mannequin

    neologix mannequin commented Feb 11, 2014

    I didn't see test_threading failing anymore recently, so I'm closing the
    issue. Thanks.

    Hm...

    Could someone review my message http://bugs.python.org/msg206028?

    Because AFAICT, this will lead to random crashes with daemon threads.

    @vstinner
    Copy link
    Member Author

    Could someone review my message http://bugs.python.org/msg206028?

    I replied with questions and you didn't reply :-)

    @vstinner
    Copy link
    Member Author

    A new bug was found because of changes of this issue: bpo-20526. IMO it is safer to revert changes of this issue in Python 3.4, it's too late to analyze such tricky bug. See bpo-20526 for the discussion.

    @vstinner vstinner reopened this Feb 12, 2014
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 13, 2014

    New changeset 1166b3321012 by Victor Stinner in branch 'default':
    Issue bpo-20526, bpo-19466: Revert changes of issue bpo-19466 which introduces a
    http://hg.python.org/cpython/rev/1166b3321012

    @vstinner
    Copy link
    Member Author

    I reverted the changes because it introduced regressions.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 17, 2014

    New changeset 9ce58a73b6b5 by Victor Stinner in branch '3.4':
    Issue bpo-20526, bpo-19466: Revert changes of issue bpo-19466 which introduces a
    http://hg.python.org/cpython/rev/9ce58a73b6b5

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 8, 2020

    The change has been reverted in 2014. I reopen the issue since using daemon thraeds became safer in Python 3.9. As soon as a daemon thread attempts to take the GIL, it does exit immediately.

    With the commit eb4e2ae (bpo-39877), it should be even more reliable.

    Let me retry to fix this issue again. I reopen the issue and I wrote a new PR.

    @vstinner vstinner reopened this Mar 8, 2020
    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 8, 2020

    asyncio_gc.py: script to reproduce bpo-20526 (copied from there), but ported to Python 3.9 (replace asyncio.async with asyncio.ensure_future).

    Sadly, Python with PR 18848 does crash when running asyncio_gc.py if I press CTRL+c twice:

    • A thread does crash in _PyObject_GC_TRACK_impl() called indirectly by select_epoll_poll_impl(), in a Python thread
    • while main thread is exiting Python: Py_FinalizeEx() is calling _PyImport_Cleanup() which blocks on take_gil()

    It seems like the main thread does *not* hold the hold, the thread which crashed does.

    It should not happen: _PyRuntimeState_SetFinalizing(runtime, tstate) has been called and so no other thread should be able to take the GIL anymore, but exit immediately.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 9, 2020

    New changeset 9ad58ac by Victor Stinner in branch 'master':
    bpo-19466: Py_Finalize() clears daemon threads earlier (GH-18848)
    9ad58ac

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 9, 2020

    I merged more changes in bpo-39877 which made possible to finally fix this issue.

    @vstinner vstinner added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.9 only security fixes labels Mar 9, 2020
    @vstinner vstinner closed this as completed Mar 9, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants