Title: fix for bpo-36402 (threading._shutdown() race condition) causes reference leak
Type: resource usage Stage: patch review
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: anselm.kruis, koobs, vstinner
Priority: normal Keywords: patch

Created on 2019-08-07 16:22 by anselm.kruis, last changed 2019-08-12 15:24 by vstinner.

File name Uploaded Description Edit
threading-leak-test-case.diff anselm.kruis, 2019-08-07 16:22
Pull Requests
URL Status Linked Edit
PR 15175 open anselm.kruis, 2019-08-08 08:23
PR 15228 open vstinner, 2019-08-12 15:24
Messages (2)
msg349173 - Author: Anselm Kruis (anselm.kruis) Date: 2019-08-07 16:22
Starting with commit 468e5fec (bpo-36402: Fix threading._shutdown() race condition (GH-13948)), the following trivial test case leaks one reference and one memory block.

import threading
import unittest

class MiscTestCase(unittest.TestCase):
    def test_without_join(self):
        # Test that a thread without join does not leak references.
        # Use a debug build and run "python -m test -R: test_threading"
        threading.Thread().start()

Attached is a patch that adds this test case to Lib/test/. After you apply this patch, "python -m test -R: test_threading" leaks one (additional) reference. This leak is also present in Python 3.7.4 and 3.8.

I'm not sure whether it is correct not to join a thread, but it worked flawlessly and did not leak in previous releases.
I haven't analysed the root cause yet.
msg349225 - Author: Anselm Kruis (anselm.kruis) Date: 2019-08-08 08:47
The root cause for the reference leak is the global set threading._shutdown_locks. It contains Thread._tstate_lock locks of non-daemon threads. If a non-daemon thread terminates and no other thread joins the terminated thread, the _tstate_lock remains in threading._shutdown_locks forever.

I could imagine that a long-running server could accumulate many locks in threading._shutdown_locks over time, so the leak should be fixed.

There are probably several ways to deal with this issue. A straightforward approach is to discard the lock from within the `tstate->on_delete` hook, i.e. the function "void release_sentinel(void *)" in _threadmodule.c. Pull request GH-15175 implements this idea. Eventually I should add another CPython-specific test case to the PR.
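
To illustrate the mechanism described in this message, here is a minimal toy model. It is a sketch only and not CPython's implementation: `shutdown_locks`, `ToyThread`, `finish()` and `leaked_locks()` are hypothetical names made up for this example. `shutdown_locks` stands in for threading._shutdown_locks, and `finish()` stands in for the point at which the thread state is deleted. The model assumes that joining a finished thread is what removes its lock from the set.

```python
import threading

# Hypothetical stand-in for threading._shutdown_locks (not the real set).
shutdown_locks = set()


class ToyThread:
    """Toy model of a non-daemon thread's bookkeeping; not CPython's Thread."""

    def __init__(self, discard_on_exit):
        self.tstate_lock = threading.Lock()
        self.discard_on_exit = discard_on_exit

    def start(self):
        # The lock is held while the "thread" runs and is registered
        # so that interpreter shutdown could wait for it.
        self.tstate_lock.acquire()
        shutdown_locks.add(self.tstate_lock)

    def finish(self):
        # Models thread termination: the lock is released.
        self.tstate_lock.release()
        if self.discard_on_exit:
            # The idea proposed above: drop the lock at termination.
            shutdown_locks.discard(self.tstate_lock)

    def join(self):
        # Assumption of this model: joining removes the lock from the set.
        shutdown_locks.discard(self.tstate_lock)


def leaked_locks(count, discard_on_exit):
    """Finish `count` toy threads without joining them; return the set size."""
    shutdown_locks.clear()
    for _ in range(count):
        thread = ToyThread(discard_on_exit)
        thread.start()
        thread.finish()  # join() is never called
    return len(shutdown_locks)
```

Calling `leaked_locks(5, False)` leaves every lock in the set because nothing ever joins the toy threads, while `leaked_locks(5, True)` leaves the set empty because each lock is discarded when its thread finishes.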
Date User Action Args
2019-08-12 15:24:20 vstinner set pull_requests: + pull_request14952
2019-08-09 12:31:44 koobs set nosy: + koobs
2019-08-08 08:48:42 anselm.kruis set nosy: + vstinner
2019-08-08 08:47:16 anselm.kruis set messages: + msg349225
2019-08-08 08:23:10 anselm.kruis set stage: patch review; pull_requests: + pull_request14906
2019-08-07 16:22:52 anselm.kruis create