classification
Title: fix for bpo-36402 (threading._shutdown() race condition) causes reference leak
Type: resource usage Stage: resolved
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arkady M, ViktorPegy, Zhipeng Xie, anselm.kruis, jaraco, koobs, krypticus, mark.dickinson, martin.panter, miss-islington, pablogsal, pitrou, python-dev, shihai1991, vstinner
Priority: normal Keywords: 3.7regression, patch

Created on 2019-08-07 16:22 by anselm.kruis, last changed 2021-07-30 07:30 by python-dev. This issue is now closed.

Files
File name Uploaded Description Edit
threading-leak-test-case.diff anselm.kruis, 2019-08-07 16:22
Pull Requests
URL Status Linked Edit
PR 15175 closed anselm.kruis, 2019-08-08 08:23
PR 15228 merged vstinner, 2019-08-12 15:24
PR 15336 closed miss-islington, 2019-08-19 22:37
PR 15337 closed miss-islington, 2019-08-19 22:37
PR 15338 merged vstinner, 2019-08-19 23:28
PR 25148 closed python-dev, 2021-04-02 10:20
PR 25226 closed shihai1991, 2021-04-06 17:38
PR 26103 merged pitrou, 2021-05-13 17:28
PR 26138 merged miss-islington, 2021-05-14 19:37
PR 26142 merged pitrou, 2021-05-15 09:10
PR 27473 closed python-dev, 2021-07-30 07:30
Messages (23)
msg349173 - (view) Author: Anselm Kruis (anselm.kruis) * Date: 2019-08-07 16:22
Starting with commit 468e5fec (bpo-36402: Fix threading._shutdown() race condition (GH-13948)) the following trivial test case leaks one reference and one memory block.

class MiscTestCase(unittest.TestCase):
    def test_without_join(self):
        # Test that a thread without join does not leak references.
        # Use a debug build and run "python -m test -R: test_threading"
        threading.Thread().start()

Attached is a patch, that adds this test case to Lib/test/test_threading.py. After you apply this patch "python -m test -R: test_threading" leaks one (additional) reference. This leak is also present in Python 3.7.4 and 3.8.

I'm not sure, if it correct not to join a thread, but it did work flawlessly and didn't leak in previous releases.
I didn't analyse the root cause yet.
msg349225 - (view) Author: Anselm Kruis (anselm.kruis) * Date: 2019-08-08 08:47
The root cause for the reference leak is the global set threading._shutdown_locks. It contains Thread._tstate_lock locks of non-daemon threads. If a non-daemon thread terminates and no other thread joins the terminated thread, the _tstate_lock remains in threading._shutdown_locks forever.

I could imagine that a long running server could accumulate many locks in threading._shutdown_locks over time. Therefore the leak should be fixed.

There are probably several ways to deal with this issue. A straight forward approach is to discard the lock from within `tstate->on_delete` hook, that is function "void release_sentinel(void *)" in _threadmodule.c. Pull request (GH-15175) implements this idea. Eventually I should add another C-Python specific test-case to the PR.
msg349973 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-19 22:37
New changeset d3dcc92778807ae8f7ebe85178f36a29711cd478 by Victor Stinner in branch 'master':
bpo-37788: Fix a reference leak if a thread is not joined (GH-15228)
https://github.com/python/cpython/commit/d3dcc92778807ae8f7ebe85178f36a29711cd478
msg349977 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-19 23:29
> https://github.com/python/cpython/commit/d3dcc92778807ae8f7ebe85178f36a29711cd478

This change introduced a regression :-(
https://github.com/python/cpython/pull/15228#issuecomment-522792133

I created PR 15338 to revert it
msg349980 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-08-19 23:47
New changeset d11c2c607768fa549b1aed7899edc061b2ebf19f by Victor Stinner in branch 'master':
Revert "bpo-37788: Fix a reference leak if a thread is not joined (GH-15228)" (GH-15338)
https://github.com/python/cpython/commit/d11c2c607768fa549b1aed7899edc061b2ebf19f
msg354460 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-11 15:36
Copy of wmanley's comment:
https://github.com/python/cpython/pull/13948#issuecomment-541076882
"""
This caused a regression for people overriding Thread.join to implement custom thread interruption. See https://stackoverflow.com/questions/50486083/ending-non-daemon-threads-when-shutting-down-an-interactive-python-session
"""
msg354463 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-10-11 15:42
Hummm....is this really a regression? The docs say:

> No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.

So if someone is overriding join() they are out of contract
msg358641 - (view) Author: Adam (krypticus) Date: 2019-12-18 19:43
I ran into this bug as well, and opened an issue for it (before I saw this issue): https://bugs.python.org/issue39074

Was there a conclusion on the best way to fix this? It seems like the previous __del__ implementation would correct the resource leakage by removing the _tstate_lock from _shutdown_locks.
msg359477 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-06 22:51
I marked bpo-39074 as a duplicate of this issue.
msg359479 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-01-06 22:54
I marked bpo-39201 as a duplicate of this issue.
msg371866 - (view) Author: Arkady (Arkady M) Date: 2020-06-19 09:58
I have reproduced a similar memory leak in the multiprocessing

This is the trace:

913 memory blocks: 33232 Bytes:  File "/usr/lib/python3.7/threading.py", line 890;    self._bootstrap_inner();  File "/usr/lib/python3.7/threading.py", line 914;    self._set_tstate_lock();  File "/usr/lib/python3.7/threading.py", line 904;    self._tstate_lock = _set_sentinel();
msg380391 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2020-11-05 01:33
I marked bpo-42263 as a duplicate of this issue.

This issue is implicated in preventing the desired fix for bpo-37193, where a thread wishes to remove the handle to itself after performing its duty. By removing its own handle, it can never be joined, and thus obviates the most straightforward way to directly remove the handle for a thread within that thread.
msg388606 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-13 11:27
bpo-37193 has been fixed.
msg388609 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2021-03-13 12:00
@Victor: How does the fix for #37193 solve this issue? I think I'm missing something.
msg388611 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2021-03-13 12:13
Re-opening; I believe this issue is still valid.

Here's output on a debug build on current master (commit 7591d9455eb37525c832da3d65e1a7b3e6dbf613) on my machine:

lovelace:cpython mdickinson$ ./python.exe
Python 3.10.0a6+ (remotes/upstream/master:7591d9455e, Mar 13 2021, 12:04:31) [Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, threading, time
>>> for _ in range(10):
...     threading.Thread().start()
...     time.sleep(0.1)
...     print(sys.gettotalrefcount())
... 
109901
109905
109907
109909
109911
109913
109915
109917
109919
109921
msg388618 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-13 13:34
Oh, I'm confused. This issue is about the threading module, whereas bpo-37193 is unrelated: it's about the socketserver module.
msg389671 - (view) Author: Zhipeng Xie (Zhipeng Xie) * Date: 2021-03-29 08:02
ping

I also encountered this problem. Is there a fix now?
msg389696 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2021-03-29 12:50
No. The issue remains open.
msg390118 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2021-04-03 08:04
Anselm's pull request PR 15175 looked hopeful to me. I've been using those changes in our builds at work for a while.
msg390817 - (view) Author: hai shi (shihai1991) * (Python triager) Date: 2021-04-12 04:50
I created PR 25226. It's a another way to solve this problem.
Compared to PR 15175, `_set_sentinel()` don't need to receive any params :)
msg393709 - (view) Author: miss-islington (miss-islington) Date: 2021-05-15 09:24
New changeset 71dca6ea73aaf215fafa094512e8c748248c16b0 by Miss Islington (bot) in branch '3.10':
[3.10] bpo-37788: Fix reference leak when Thread is never joined (GH-26103) (GH-26138)
https://github.com/python/cpython/commit/71dca6ea73aaf215fafa094512e8c748248c16b0
msg393710 - (view) Author: miss-islington (miss-islington) Date: 2021-05-15 09:51
New changeset b30b25b26663fb6070b8ed86fe3a20dcb557d05d by Antoine Pitrou in branch '3.9':
[3.9] bpo-37788: Fix reference leak when Thread is never joined (GH-26103) (GH-26142)
https://github.com/python/cpython/commit/b30b25b26663fb6070b8ed86fe3a20dcb557d05d
msg394307 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-25 09:57
Fix in the main branch:

commit c10c2ec7a0e06975e8010c56c9c3270f8ea322ec
Author: Antoine Pitrou <antoine@python.org>
Date:   Fri May 14 21:37:20 2021 +0200

    bpo-37788: Fix reference leak when Thread is never joined (GH-26103)
    
    
    
    When a Thread is not joined after it has stopped, its lock may remain in the _shutdown_locks set until interpreter shutdown.  If many threads are created this way, the _shutdown_locks set could therefore grow endlessly.  To avoid such a situation, purge expired locks each time a new one is added or removed.
History
Date User Action Args
2021-07-30 07:30:27python-devsetpull_requests: + pull_request25992
2021-05-25 09:57:59vstinnersetmessages: + msg394307
2021-05-15 09:53:03pitrousetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-05-15 09:51:24miss-islingtonsetmessages: + msg393710
2021-05-15 09:24:47miss-islingtonsetmessages: + msg393709
2021-05-15 09:10:11pitrousetpull_requests: + pull_request24777
2021-05-15 09:05:00pitrousetversions: - Python 3.8
2021-05-14 19:37:29miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request24772
2021-05-13 17:28:48pitrousetnosy: + pitrou
pull_requests: + pull_request24745
2021-05-13 17:06:08pitrousetversions: + Python 3.11
2021-04-12 04:50:21shihai1991setmessages: + msg390817
2021-04-06 17:38:50shihai1991setnosy: + shihai1991
pull_requests: + pull_request23963
2021-04-03 08:04:50martin.pantersetmessages: + msg390118
2021-04-02 10:20:04python-devsetnosy: + python-dev

pull_requests: + pull_request23895
stage: needs patch -> patch review
2021-03-29 12:50:07jaracosetstage: needs patch
messages: + msg389696
versions: - Python 3.7
2021-03-29 08:02:29Zhipeng Xiesetnosy: + Zhipeng Xie
messages: + msg389671
2021-03-13 13:34:43vstinnersetmessages: + msg388618
2021-03-13 12:13:34mark.dickinsonsetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg388611

stage: resolved -> (no value)
2021-03-13 12:00:29mark.dickinsonsetmessages: + msg388609
2021-03-13 11:27:21vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg388606

stage: patch review -> resolved
2021-03-09 13:03:16mark.dickinsonsetnosy: + mark.dickinson
2021-03-09 13:03:02mark.dickinsonlinkissue43375 superseder
2021-01-28 03:16:24martin.pantersetkeywords: + 3.7regression
2021-01-28 02:56:08martin.panterlinkissue43050 superseder
2020-11-05 01:33:33jaracosetversions: + Python 3.10
2020-11-05 01:33:21jaracosetnosy: + jaraco
messages: + msg380391
2020-11-05 01:28:49jaracolinkissue42263 superseder
2020-06-23 08:58:36ViktorPegysetnosy: + ViktorPegy
2020-06-19 09:58:22Arkady Msetnosy: + Arkady M
messages: + msg371866
2020-01-06 22:54:04vstinnersetmessages: + msg359479
2020-01-06 22:53:50vstinnerlinkissue39201 superseder
2020-01-06 22:51:31vstinnersetmessages: + msg359477
2020-01-06 22:51:08vstinnerlinkissue39074 superseder
2019-12-18 21:55:24martin.pantersetnosy: + martin.panter
2019-12-18 19:43:46krypticussetnosy: + krypticus
messages: + msg358641
2019-10-11 15:42:52pablogsalsetnosy: + pablogsal
messages: + msg354463
2019-10-11 15:36:12vstinnersetmessages: + msg354460
2019-08-19 23:47:10vstinnersetmessages: + msg349980
2019-08-19 23:30:53vstinnerlinkissue37889 superseder
2019-08-19 23:29:05vstinnersetmessages: + msg349977
2019-08-19 23:28:01vstinnersetpull_requests: + pull_request15055
2019-08-19 22:37:45miss-islingtonsetpull_requests: + pull_request15054
2019-08-19 22:37:31miss-islingtonsetpull_requests: + pull_request15053
2019-08-19 22:37:20vstinnersetmessages: + msg349973
2019-08-12 15:24:20vstinnersetpull_requests: + pull_request14952
2019-08-09 12:31:44koobssetnosy: + koobs
2019-08-08 08:48:42anselm.kruissetnosy: + vstinner
2019-08-08 08:47:16anselm.kruissetmessages: + msg349225
2019-08-08 08:23:10anselm.kruissetstage: patch review
pull_requests: + pull_request14906
2019-08-07 16:22:52anselm.kruiscreate