classification
Title: threading.enumerate(): reentrant call during a GC collection hangs
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: miss-islington, vstinner
Priority: normal Keywords: patch

Created on 2021-06-14 21:26 by vstinner, last changed 2021-06-16 09:44 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
rec_threading.py vstinner, 2021-06-14 21:26
Pull Requests
URL Status Linked Edit
PR 26727 merged vstinner, 2021-06-14 21:29
PR 26737 merged miss-islington, 2021-06-15 14:14
PR 26738 merged miss-islington, 2021-06-15 14:14
PR 26741 merged vstinner, 2021-06-15 16:37
Messages (8)
msg395851 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-14 21:26
The threading.enumerate() code is simple:
---
# Active thread administration
_active_limbo_lock = _allocate_lock()
_active = {}    # maps thread id to Thread object
_limbo = {}

def enumerate():
    with _active_limbo_lock:
        return list(_active.values()) + list(_limbo.values())
---

values(), list() and list+list operations can call trigger a GC collection depending on the GC thresholds, created Python objects, etc.

During a GC collection, a Python object destructor (__del__) can again call threading.enumerate().

Problem: _active_limbo_lock is not re-entrant lock and so the second call hangs.

In practice, this issue was seen in OpenStack which uses eventlet, when a destructor uses the logging module. The logging module uses threading.current_thread(), but this function is monkey-patched by the eventlet module.

Extract of the eventlet function:
---
def current_thread():
    ...
    found = [th for th in __patched_enumerate() if th.ident == g_id]
    ...
---
https://github.com/eventlet/eventlet/blob/master/eventlet/green/threading.py

Attached PR makes _active_limbo_lock re-entrant to avoid this issue. I'm not sure if it's ok or not, I would appreciate to get a review and your feedback ;-)

Attached file triggers the threading.enumerate() hang on re-entrant call.
msg395852 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-14 21:53
There are different ways to fix this issue:

* (A) Rewrite threading.enumerate() in C with code which cannot trigger a GC collection
* (B) Disable temporarily the GC
* (C) Use a reentrant lock (PR 26727)

(A) The problem is that functions other than threading.enumerate() 
also rely on this lock, like threading.active_count(). I would prefer to not have to rewrite "half" of threading.py in C. Using a RLock is less intrusive.

(B) This is a simple and reliable option. But gc.disable() is process-wide: it affects all Python threads, and so it might have surprising side effects. Some code might rely on the current exact GC behavior. I would prefer to not disable the GC temporarily.
msg395880 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-15 14:14
New changeset 243fd01047ddce1a7eb0f99a49732d123e942c63 by Victor Stinner in branch 'main':
bpo-44422: Fix threading.enumerate() reentrant call (GH-26727)
https://github.com/python/cpython/commit/243fd01047ddce1a7eb0f99a49732d123e942c63
msg395881 - (view) Author: miss-islington (miss-islington) Date: 2021-06-15 14:34
New changeset c3b776f83b4c765da6d7b8854e844e07bd33c375 by Miss Islington (bot) in branch '3.10':
bpo-44422: Fix threading.enumerate() reentrant call (GH-26727)
https://github.com/python/cpython/commit/c3b776f83b4c765da6d7b8854e844e07bd33c375
msg395883 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-15 16:30
New changeset 8fe57aacc7bf9d9af84803b69dbb1d66597934c6 by Miss Islington (bot) in branch '3.9':
bpo-44422: Fix threading.enumerate() reentrant call (GH-26727) (GH-26738)
https://github.com/python/cpython/commit/8fe57aacc7bf9d9af84803b69dbb1d66597934c6
msg395884 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-15 16:31
Ok, the deadlock on reentrant call to threading.enumerate() is now fixed in 3.9, 3.10 and main (future 3.11) branches. I close the issue. Thanks for the reviews, as usual :-)
msg395885 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-15 16:43
Oh, I reopen the issue for PR 26741.
msg395914 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-06-16 09:41
New changeset 0729694246174a5c2f0ae197f2e0dbea61b90c9f by Victor Stinner in branch 'main':
bpo-44422: threading.Thread reuses the _delete() method (GH-26741)
https://github.com/python/cpython/commit/0729694246174a5c2f0ae197f2e0dbea61b90c9f
History
Date User Action Args
2021-06-16 09:44:01vstinnersetstatus: open -> closed
resolution: fixed
2021-06-16 09:41:39vstinnersetmessages: + msg395914
2021-06-15 16:43:07vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg395885
2021-06-15 16:37:34vstinnersetpull_requests: + pull_request25327
2021-06-15 16:31:27vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg395884

stage: patch review -> resolved
2021-06-15 16:30:35vstinnersetmessages: + msg395883
2021-06-15 14:34:50miss-islingtonsetmessages: + msg395881
2021-06-15 14:14:39miss-islingtonsetpull_requests: + pull_request25324
2021-06-15 14:14:33miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request25323
2021-06-15 14:14:31vstinnersetmessages: + msg395880
2021-06-14 21:53:33vstinnersetmessages: + msg395852
2021-06-14 21:29:15vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request25317
2021-06-14 21:26:28vstinnercreate