Issue 1089632: _DummyThread() objects not freed from threading._active map

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/41357

classification

Title:	_DummyThread() objects not freed from threading._active map
Type:		Stage:
Components:	Interpreter Core	Versions:	Python 2.3

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	brett.cannon, saravanand, tim.peters
Priority:	normal	Keywords:

Created on 2004-12-22 10:07 by saravanand, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg23793 - (view)	Author: saravanand (saravanand)	Date: 2004-12-22 10:07
Problem Background:
===============
I have a long-running Python server module which accepts calls from several Python clients over a socket interface and forwards each call to a C++ component. This C++ component delivers its responses back to the Python server via a callback, in a separate thread created by the C++ module. In the Python callback implementation, the responses are sent to the client in a synchronised manner using the Python primitive threading.Semaphore. This synchronisation is required because the C++ component can deliver parallel responses in different C++ threads.

The Python server creates the semaphore object per client when the client request arrives (in a Python thread). The same object is acquired and released in the C++ callback thread(s). We observed that a Windows Event is created whenever the acquire method is executed in the Python callback implementation in the context of a C++ thread, but the Event is not freed by the Python interpreter even after the C++ thread terminates. Because of this, Windows Event handles are leaked in the Python server.

Problem Description:
==============
When we checked the Python module threading.py, we found that every time a non-Python thread (in our case, a thread created by C++) enters Python and accesses a primitive in the threading module (e.g. Semaphore, RLock), Python looks for an entry for this thread in the _active map, using the thread ID as the key. Since no entry exists for such C++-created threads, a _DummyThread object is created and added to the _active map for this C++ thread. For every _DummyThread object that is created, a corresponding Windows Event is also created.

Since this entry is never removed from the _active map, even after the C++ thread terminates (as far as we could make out from the code in threading.py), a Windows Event is allocated for every "unique" C++ thread that enters Python. This manifests as a continuous increase in the handle count of my Python server (as seen in Windows PerfMon/Task Manager).

Is there a way to avoid this caching in the Python interpreter? Why can't Python remove this entry from the map when the C++ thread terminates? Or, if Python can't get to know about the thread termination, should it not implement some kind of garbage collection for the entries in this map (especially the entries for _DummyThread objects)? Does this require a correction in the Python module threading.py, or is this caching behaviour by design?
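The registration step described above can be observed with a small sketch. It assumes a modern Python 3 interpreter (so `threading.current_thread()` instead of the 2.3-era `currentThread()`), uses `_thread.start_new_thread` as a stand-in for a thread created by C++, and peeks at the private `threading._active` map. The helper name `observe_dummy_thread` is made up for illustration. It only checks what happens while the foreign thread is still running; whether the entry survives after the thread exits depends on the Python version.

```python
import _thread
import threading


def observe_dummy_thread():
    """Run a thread that bypasses threading.Thread and report what
    the threading module registers for it."""
    result = {}
    done = threading.Event()

    def foreign():
        # First contact with the threading module from this thread.
        current = threading.current_thread()
        result["type_name"] = type(current).__name__
        # _active is a private dict keyed by thread identifier.
        result["registered"] = (
            threading._active.get(threading.get_ident()) is current
        )
        done.set()

    # Stand-in for a thread that was not started via threading.Thread.
    _thread.start_new_thread(foreign, ())
    done.wait(5)
    return result


print(observe_dummy_thread())
```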
msg23794 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2004-12-24 02:35
Logged In: YES user_id=357491

Yes, it is by design. If you read the source you will notice that the comment mentions that the _DummyThread object is flagged as a daemon thread and thus should not be expected to be killed. The comment also mentions that they are not garbage collected. As stated in the docs, dummy threads are of limited functionality.

You could cheat and remove the entries yourself from threading._active, but that might not be future-safe. I would just make sure that all threads are created through the threading or thread module, even if it means creating a minimal wrapper in Python for your C++ code to call through in order to execute your C++ threads.

If you want the docs to be more specific, please feel free to submit a patch for the docs. Or, if you can come up with a good way for the dummy threads to clean up after themselves, then you can also submit that. But since the source code specifies that this is expected and the docs say that dummy threads are of limited functionality, I am closing as "won't fix".
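For illustration only, the "cheat" mentioned above might look roughly like the following in a modern Python 3. This is a sketch, not a recommended practice: `threading._active`, `threading._active_limbo_lock` and `threading._DummyThread` are private names that may change, and the helper names `forget_current_dummy_thread` and `demo` are hypothetical. The foreign thread is again simulated with `_thread.start_new_thread`.

```python
import _thread
import threading


def forget_current_dummy_thread():
    """Drop the calling thread's _DummyThread entry, if it has one.

    Relies on private attributes of the threading module.
    """
    ident = threading.get_ident()
    with threading._active_limbo_lock:
        entry = threading._active.get(ident)
        if isinstance(entry, threading._DummyThread):
            del threading._active[ident]
            return True
    return False


def demo():
    outcome = {}
    done = threading.Event()

    def foreign():
        threading.current_thread()  # registers a _DummyThread
        outcome["removed"] = forget_current_dummy_thread()
        outcome["still_listed"] = threading.get_ident() in threading._active
        done.set()

    _thread.start_new_thread(foreign, ())
    done.wait(5)
    return outcome


print(demo())
```

The callback would have to call such a helper as its last step before returning to the C++ side, which is part of why this approach is fragile.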
msg23795 - (view)	Author: saravanand (saravanand)	Date: 2005-01-05 10:29
Logged In: YES user_id=1181691 asdfas
msg23796 - (view)	Author: saravanand (saravanand)	Date: 2005-01-05 10:54
Logged In: YES user_id=1181691

I tried the following workaround, which is working (it causes no handle leaks). The workaround is to replace threading.Semaphore with the Windows extension module APIs win32event.CreateMutex(), win32event.WaitForSingleObject() and win32event.ReleaseMutex(). After this change, there are no handle leaks.

So, my questions are:

1) Is this workaround OK, or are there any other issues regarding the win32api usage?

2) You suggested creating minimal Python wrappers for the C++ code and calling C++ from Python (instead of C++ thread callbacks). So I would like to know, in general, whether it is a bad idea for C++ threads to call back into Python. If yes, what are the issues (apart from the handle leak mentioned before)? If no, I would like to live with the above workaround.

Thanks in advance
msg23797 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2005-01-06 07:19
Logged In: YES user_id=357491

What does Semaphore have to do with _DummyThread and currentThread? And as for win32api, that is not part of the Python stdlib and thus I have no way of commenting on that.

As for C++ threads calling into Python code, I would not say that is bad specifically. You were just trying to get thread info for non-Python threads, and that was leading to _DummyThread instances being created as designed, and you just didn't want that. Calling Python code is fine; just don't expect it to know about your C++ threads as much as you seem to want it to.

And please leave this bug closed. Feel free to submit a patch to change the semantics, but that does not affect this bug report.
msg23798 - (view)	Author: Tim Peters (tim.peters) *	Date: 2005-01-06 17:16
Logged In: YES user_id=31435

Presumably, by changing threading.Semaphore to stop using any code from threading.py, threading.currentThread() never gets called, and so a _DummyThread is never created.

I expect the reason a _DummyThread causes Event leaks is just that Thread.__init__ always ends up allocating a Python lock (Thread.__block), which allocates a Windows Event under the covers. It could be that Thread.__block is never actually used for dummy threads, in which case we could avoid allocating it (or could get rid of it right away in _DummyThread.__init__). The dummy thread would still clog the _active dict, but wouldn't leak Events then.

It's certainly true that Python has no way to know when a thread it didn't start goes away.
msg23799 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2005-01-07 03:36
Logged In: YES user_id=357491

Ah, I didn't notice that Semaphore uses a Condition lock, which uses an RLock, which calls currentThread and gets a lock from thread.allocate_lock, which probably uses an Event on Windows. I also noticed that if __debug__ is set then the _note method uses it as well.

It looks like Thread.__block can't be used for a _DummyThread, since the only places self.__block is used are Thread.__stop (which is called in Thread.__bootstrap, which is called by Thread.start, which will raise an AssertionError since Thread.__started will be set to True thanks to _DummyThread.__init__) and Thread.join, which is overridden in _DummyThread. So it looks like deleting self.__block should be safe in _DummyThread.__init__. It probably wouldn't hurt to delete self.__stderr while we are at it, since it never gets used either and thus is basically a ref leak.

Sound good to you, Tim?
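The idea proposed above can be illustrated with a toy model. This is not the threading.py source: `ToyThread` and `ToyDummyThread` are hypothetical stand-ins, and a plain `threading.Condition` plays the role of the lock that costs an OS-level resource. The point is the name mangling: a double-underscore attribute set inside `ToyThread` is stored as `_ToyThread__block`, so a subclass has to spell out the mangled name to delete it.

```python
import threading


class ToyThread:
    def __init__(self):
        # Stand-in for Thread.__block: every instance pays for a lock.
        self.__block = threading.Condition(threading.Lock())
        self.__started = False


class ToyDummyThread(ToyThread):
    def __init__(self):
        ToyThread.__init__(self)
        # The dummy never uses the lock, so release it right away.
        # The attribute lives under its mangled name.
        del self._ToyThread__block
        self._ToyThread__started = True


regular = ToyThread()
dummy = ToyDummyThread()
print(hasattr(regular, "_ToyThread__block"))
print(hasattr(dummy, "_ToyThread__block"))
```

In this toy version the dummy instance still exists and would still occupy a slot in any registry that holds it; only the lock it would otherwise pin is gone.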
msg23800 - (view)	Author: Tim Peters (tim.peters) *	Date: 2005-01-07 04:15
Logged In: YES user_id=31435 Right, thread.allocate_lock() does allocate an Event on Windows. On other platforms it allocates other kinds of limited resource (one way or another, it has to grab some kind of locking object from the operating system, and the supply of those is typically finite). Looking it over, I agree Thread.__block won't be used by a dummy thread. I also <wink> note that the use of "assert False" in _DummyThread.join() is plain bizarre. It's a user error to try to join a dummy thread, not an internal invariant we believe cannot occur. Guido misused (IMO) `assert` in several places in this module, but its use in join() is way over the edge. Anyway, ya, I think it would be good to change this. I don't really understand the point to nuking Thread.__stderr -- typically, sys.stderr is a single object over the life of a program, and doesn't go away until the program ends. That is, I doubt the reference is keeping anything alive that would have gone away otherwise.
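As a small illustration of the point about `assert` in join(): an assert statement is compiled away when Python runs with -O, so a check for a user error is usually better written as an explicit exception. The class below is a hypothetical toy, not the threading module's code, and the choice of RuntimeError is an assumption made for the sketch.

```python
class ToyDummyThread:
    """Toy stand-in for a thread object that can never be joined."""

    def join(self, timeout=None):
        # Joining is a caller mistake, not a broken internal invariant,
        # so raise unconditionally instead of using `assert False`.
        raise RuntimeError("cannot join a dummy thread")


try:
    ToyDummyThread().join()
except RuntimeError as exc:
    print("refused:", exc)
```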
msg23801 - (view)	Author: Brett Cannon (brett.cannon) *	Date: 2005-01-08 02:47
Logged In: YES user_id=357491

OK, rev. 1.46 for 2.5 has the fix (now this bug is closed). I can backport to 2.4 and 2.3, but it technically changes the interface and it is not a huge issue, so I am not going to bother to backport unless people really feel it is necessary.

As for the assert use, yeah, that is odd. =) But that whole module could use a rewrite. Probably a good thing to do for Python 3.

The reason I suggested the self.__stderr deletion is that there is a chance that someone set sys.stderr to a file or some other object and not to sys.__stderr__. I thought it would be a more thorough fix, but it is not a big thing to me; I just happened to notice it.

History
Date	User	Action	Args
2022-04-11 14:56:08	admin	set	github: 41357
2004-12-22 10:07:52	saravanand	create