New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handling pending calls during runtime finalization may cause problems. #81308
Comments
In Python/lifecycle.c (Py_FinalizeEx) we call _Py_FinishPendingCalls(), right after we stop all non-daemon Python threads but before we've actually started finalizing the runtime state. That call looks for any remaining pending calls (for the main interpreter) and runs them. There's some evidence of a bug there. In bpo-33608 I moved the pending calls to per-interpreter state. We saw failures (sometimes sporadic) on a few buildbots (e.g. FreeBSD) during runtime finalization. However, nearly all of the buildbots were fine, so it may be a question of architecture or slow hardware. See bpo-33608 for details on the failures. There are a number of possibilities, but it's been tricky reproducing the problem in order to investigate. Here are some theories:
Here are some other points to consider:
|
Also, someone did manage to investigate and identify a likely cause: |
In the master branch of Python, trip_signal() calls _PyEval_AddPendingCall(tstate) and tstate is get using _PyRuntimeState_GetThreadState(runtime). trip_signal() can be called while the GIL is not held: tstate is NULL in this case. For example, it's the case when calling signal.raise_signal(). Problem: _PyEval_AddPendingCall() uses tstate if pending->finishing is non-zero. if (pending->finishing) {
...
_PyErr_Fetch(tstate, &exc, &val, &tb);
_PyErr_SetString(tstate, PyExc_SystemError,
"Py_AddPendingCall: cannot add pending calls "
"(Python shutting down)");
...
} pending->finishing was addd in bpo-33608 by the change: commit 842a2f0
I found this issue while trying to make pending calls per interpreter in bpo-39984: move pending from _PyRuntimeState to PyInterpreterState. |
That was a crash in multiprocessing tests. PyEval_RestoreThread(tstate) was called with a freed PyThreadState: tstate->interp = 0xdbdbdbdbdbdbdbdb for example. I recently fixed PyEval_RestoreThread() in bpo-39877 to exit properly daemon theads without dereferencing tstate which points to freed memory. To reproduce the bug, it helped me to add a sleep at Python exit. Fixed value (ex: 1 second) or better: random sleep (between 1 ms and 1 sec). |
Since this change, Py_AddPendingCall() now requires the thread to have a Python thread state: it is used if pending->finishing is non-zero. The function documentation says: "This function doesn’t need a current thread state to run, and it doesn’t need the global interpreter lock." https://docs.python.org/dev/c-api/init.html#c.Py_AddPendingCall The current implementation seems to contradict the documentation. We may update the documentation to replace "doesn't need" with "requires" (a Python thread state). Or the implementation can use PyGILState_Ensure() and PyGILState_Release() to create a temporary Python thread state if the thread doesn't have one. Removing pending->finishing would only partially fix the issue. With PR 19061, tstate is required to access "ceval" structure. trip_signal() called by the C signal handler has a similar issue: it only requires non-NULL tstate if pending->finishing is non-zero. |
trip_signal() has another problem. If pending->finishing is non-zero, _PyEval_AddPendingCall() uses the C API whereas the current thread may not hold the GIL. That's forbidden by the C API. The more I think about it, the more I think that pending->finishing is fragile and should be removed. I understood that pending->finishing was added to workaround crashes during Python finalization. I fixed multiple crashes related to daemon threads during Python finalization in bpo-39877. Maybe the bugs that I fixed were the ones which pending->finishing was supposed to work around. |
My notes on Python finalization: https://pythondev.readthedocs.io/finalization.html |
See also bpo-40082: trip_signal() gets NULL tstate on Windows on CTRL+C. |
I removed _pending_calls.finishing for multiple reasons:
I rewrote trip_signal() to only use the PyInterpreterState ("interp") and avoid PyThreadState ("tstate") in PR 19441 to fix bpo-40082. trip_signal() should read and set atomtic variables: don't modify globals without a lock. _PyEval_AddPendingCall() is not fully async-signal safe yet :-/ Using a lock is unsafe in a signal handler. |
I made many changes in Python internals since bpo-33608 was reported. I am not aware of any recent issue with pending calls and so close the issue. If you get new crashes/hangs with pending calls during Python finalization, please open a new issue. |
It can require the GIL, even though it isn't supposed to. This is fixed in 3.9 and was not an issue in 3.7. The symptom is "Fatal Python error: Python memory allocator called without holding the GIL" See python/cpython#81308 Unfortunately, the only workaround I see requires deep magic and duplicating CPython internals.
Bumps [greenlet](https://github.com/python-greenlet/greenlet) from 1.1.3.post0 to 2.0.1. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/python-greenlet/greenlet/blob/master/CHANGES.rst">greenlet's changelog</a>.</em></p> <blockquote> <h1>2.0.1 (2022-11-07)</h1> <ul> <li>Python 3.11: Fix a memory leak. See <code>issue 328 <https://github.com/python-greenlet/greenlet/issues/328></code>_ and <code>gevent issue 1924 <https://github.com/gevent/gevent/issues/1924></code>_.</li> </ul> <h1>2.0.0.post0 (2022-11-03)</h1> <ul> <li>Add <code>Programming Language :: Python :: 3.11</code> to the PyPI classifier metadata.</li> </ul> <h1>2.0.0 (2022-10-31)</h1> <ul> <li>Nothing changed yet.</li> </ul> <h1>2.0.0rc5 (2022-10-31)</h1> <ul> <li>Linux: Fix another group of rare crashes that could occur when shutting down an interpeter running multiple threads. See <code>issue 325 <https://github.com/python-greenlet/greenlet/issues/325></code>_.</li> </ul> <h1>2.0.0rc4 (2022-10-30)</h1> <ul> <li>Linux: Fix a rare crash that could occur when shutting down an interpreter running multiple threads, when some of those threads are in greenlets making calls to functions that release the GIL.</li> </ul> <h1>2.0.0rc3 (2022-10-29)</h1> <ul> <li>Python 2: Fix a crash that could occur when raising an old-style instance object.</li> </ul> <h1>2.0.0rc2 (2022-10-28)</h1> <ul> <li>Workaround <code>a CPython 3.8 bug <https://github.com/python/cpython/issues/81308></code>_ that could cause the interpreter to crash during an early phase of shutdown with the message "Fatal Python error: Python memory allocator called without</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/python-greenlet/greenlet/commit/69f6483c41776fb9095c3fd7b02d3e4ede289c78"><code>69f6483</code></a> Preparing release 2.0.1</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/df2154d45b8cfa751de1fc9e6a95e0b09a53becf"><code>df2154d</code></a> Merge pull request <a href="https://github-redirect.dependabot.com/python-greenlet/greenlet/issues/329">#329</a> from python-greenlet/issue328</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/aa6f2515b2fefced69be933952bd1f3319d0de58"><code>aa6f251</code></a> Python 3.11: Fix a memory leak switching to greenlets.</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/7389976dc748677dd4fce535b91ef07e4387c1d6"><code>7389976</code></a> Back to development: 2.0.1</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/0216c2ae008cb185292b136e34ba92ae3f40da6b"><code>0216c2a</code></a> Preparing release 2.0.0.post0</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/7e1704ccbf2ade357416b63d95d645120df1f4dc"><code>7e1704c</code></a> .github/workflows/tests.yml: Move to final release of 3.11</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/f18bff43df80c5190760dc2aedfa436a1fae7bed"><code>f18bff4</code></a> setup.py: Add Python 3.11 to the PyPI classifier metadata.</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/fc50ac173d0011492c19c87eb619017fdab47fd9"><code>fc50ac1</code></a> Back to development: 2.0.1</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/f9c29d90b1fd33e3ade1272f46e17dfe17d651e6"><code>f9c29d9</code></a> Preparing release 2.0.0</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/4ba1df63dcd40c8056f21117a91bf56d0be8a6eb"><code>4ba1df6</code></a> Back to development: 2.0.0rc6</li> <li>Additional commits viewable in <a href="https://github.com/python-greenlet/greenlet/compare/1.1.3.post0...2.0.1">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=greenlet&package-manager=pip&previous-version=1.1.3.post0&new-version=2.0.1)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details>
2.0.1 (2022-11-07) ================== - Python 3.11: Fix a memory leak. See `issue 328 <https://github.com/python-greenlet/greenlet/issues/328>`_ and `gevent issue 1924 <https://github.com/gevent/gevent/issues/1924>`_. 2.0.0.post0 (2022-11-03) ======================== - Add ``Programming Language :: Python :: 3.11`` to the PyPI classifier metadata. 2.0.0 (2022-10-31) ================== - Nothing changed yet. 2.0.0rc5 (2022-10-31) ===================== - Linux: Fix another group of rare crashes that could occur when shutting down an interpeter running multiple threads. See `issue 325 <https://github.com/python-greenlet/greenlet/issues/325>`_. 2.0.0rc4 (2022-10-30) ===================== - Linux: Fix a rare crash that could occur when shutting down an interpreter running multiple threads, when some of those threads are in greenlets making calls to functions that release the GIL. 2.0.0rc3 (2022-10-29) ===================== - Python 2: Fix a crash that could occur when raising an old-style instance object. 2.0.0rc2 (2022-10-28) ===================== - Workaround `a CPython 3.8 bug <https://github.com/python/cpython/issues/81308>`_ that could cause the interpreter to crash during an early phase of shutdown with the message "Fatal Python error: Python memory allocator called without holding the GI." This only impacted CPython 3.8a3 through CPython 3.9a5; the fix is only applied to CPython 3.8 releases (please don't use an early alpha release of CPython 3.9). 2.0.0rc1 (2022-10-27) ===================== - Deal gracefully with greenlet switches that occur while deferred deallocation of objects is happening using CPython's "trash can" mechanism. Previously, if a large nested container held items that switched greenlets during delayed deallocation, and that second greenlet also invoked the trash can, CPython's internal state could become corrupt. This was visible as an assertion error in debug builds. Now, the relevant internal state is saved and restored during greenlet switches. See also `gevent issue 1909 <https://github.com/gevent/gevent/issues/1909>`_. - Rename the C API function ``PyGreenlet_GET_PARENT`` to ``PyGreenlet_GetParent`` for consistency. The old name remains available as a deprecated alias. 2.0.0a2 (2022-03-24) ==================== - Fix a crash on older versions of the Windows C runtime when an unhandled C++ exception was thrown inside a greenlet by another native extension. This is a bug in that extension, and the interpreter will still abort, but at least it does so deliberately. Thanks to Kirill Smelkov. See `PR 286 <https://github.com/python-greenlet/greenlet/pull/286>`_. - Musllinux wheels for aarch64 are now built, tested, and uploaded to PyPI. Thanks to Alexander Piskun. - This version of greenlet is known to compile and pass tests on CPython 3.11.0a6. Earlier 3.11 releases will not work; later releases may or may not work. See `PR 294 <https://github.com/python-greenlet/greenlet/pull/294>`_. Special thanks to Victor Stinner, Brandt Bucher and the CPython developers.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: