
classification
Title: _DummyThread() objects not freed from threading._active map
Type: Stage:
Components: Interpreter Core Versions: Python 2.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: brett.cannon, saravanand, tim.peters
Priority: normal Keywords:

Created on 2004-12-22 10:07 by saravanand, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg23793 - (view) Author: saravanand (saravanand) Date: 2004-12-22 10:07
Problem Background:
===============

I have a long-running Python server module which
accepts calls from several Python clients over a socket
interface and forwards each call to a C++ component.
This C++ component gives the responses back to the Python
server in a separate thread (created by the C++ module) via
a callback.

In the Python callback implementation, the responses
are sent to the client in a synchronised manner using the
Python primitive threading.Semaphore.  This synchronisation
is required because the C++ component can deliver parallel
responses in different C++ threads.

Here, the Python Server creates the semaphore object 
per client when the client request arrives (in Python 
thread).  This same object is acquired & released in the 
C++ callback thread(s).

Here we observed that a Windows Event is created
whenever the acquire method is executed in the
Python callback implementation in the context of a C++
thread. But that event is not freed by the Python
interpreter even after the termination of the C++
thread.  Because of this, Windows Event handles are
leaked in the Python server.

Problem Description:
==============
When we checked the Python module threading.py, we
found that every time a non-Python thread (in our case
a C++ created thread) enters Python and accesses a
primitive in the threading module (e.g. Semaphore, RLock),
Python looks for an entry for this thread in the _active
map using the thread ID as the key. Since no entry exists
for such C++ created threads, a _DummyThread object
is created and added to the _active map for this C++
thread.
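
The lookup described above can be reproduced with a minimal sketch. It assumes Python 3 spellings (threading.current_thread() rather than currentThread()) and uses a thread started through the low-level _thread module as a stand-in for the C++ created thread. The names foreign_entry, result and done are illustrative only, and threading._active and threading._DummyThread are private details inspected purely for demonstration.

```python
import _thread
import threading

result = {}
done = threading.Event()

def foreign_entry():
    # Started via _thread, so threading has no Thread object for this thread.
    ident = threading.get_ident()
    result["known_before"] = ident in threading._active
    # With no registered entry, current_thread() falls back to a _DummyThread.
    current = threading.current_thread()
    result["is_dummy"] = isinstance(current, threading._DummyThread)
    # The dummy object registers itself in the private _active map.
    result["known_after"] = ident in threading._active
    done.set()

_thread.start_new_thread(foreign_entry, ())
finished = done.wait(timeout=5)
print(finished, result)
```

Whether the entry outlives the foreign thread depends on the Python version, so the sketch only checks the state while the thread is still running.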

For every _DummyThread object that is created, there is 
a corresponding Windows Event also getting created.

Since this entry is never removed from the _active map
even after the termination of the C++ thread (as far as we
could make out from the code in threading.py), for
every "unique" C++ thread that enters Python, a
Windows Event is allocated, and this manifests as a
continuous increase in the handle count in my Python
server (as seen in Windows PerfMon/Task Manager).

Is there a way to avoid this caching in the Python
interpreter? Why can't Python remove this entry from
the map when the C++ thread terminates? Or if Python
can't get to know about the thread termination, should
it not implement some kind of garbage collection for the
entries in this map (especially entries for the
_DummyThread objects)?

Does this require a correction in the Python
module threading.py?

Or is this caching behaviour by design?
msg23794 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-12-24 02:35
Yes, it is by design.  If you read the source you will notice that the 
comment mentions that the _DummyThread object is flagged as a 
daemon thread and thus should not be expected to be killed.  The 
comment also mentions how they are not garbage collected.  As stated in 
the docs, dummy threads are of limited functionality.

You could cheat and remove the entries yourself from threading._active, 
but that might not be future-safe.  I would just make sure that all 
threads are created through the threading or thread module, even if it 
means creating a minimal wrapper in Python for your C++ code to call 
through that to execute your C++ threads.
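
A minimal sketch of the wrapper idea, assuming Python 3 and a purely hypothetical callback_work function standing in for the code the C++ component would trigger: a thread created through threading.Thread is registered while it runs and dropped from the bookkeeping once it finishes. threading._active is a private detail and is only inspected here for illustration.

```python
import threading

def callback_work(results):
    # Hypothetical stand-in for the work a C++ callback would trigger.
    results.append(threading.current_thread().name)

results = []
worker = threading.Thread(
    target=callback_work, args=(results,), name="wrapped-worker"
)
worker.start()
ident = worker.ident
worker.join()

# After join() returns, threading has removed its entry for this thread.
still_registered = ident in threading._active
print(results, still_registered)
```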

If you want the docs to be more specific please feel free to submit a 
patch for the docs.  Or if you can come up with a good way for the 
dummy threads to clean up after themselves then you can also submit 
that.

But since the source code specifies that this is expected and the docs say
that dummy threads are of limited functionality, I am closing as "won't
fix".
msg23795 - (view) Author: saravanand (saravanand) Date: 2005-01-05 10:29
asdfas
msg23796 - (view) Author: saravanand (saravanand) Date: 2005-01-05 10:54
I tried the following workaround which is working (causes no 
handle leaks)

The workaround is to replace threading.Semaphore with the Windows
extension module APIs win32event.CreateMutex(),
win32event.WaitForSingleObject() and win32event.ReleaseMutex().

After this change, there are no handle leaks. So, my 
questions are:
1) Is this workaround OK or are there any other issues 
regarding the win32api usage ?
2) You suggested creating minimal Python wrappers for the C++
code and calling C++ from Python (instead of C++ thread
callbacks). So I would like to know, in general, whether it is a
bad idea for C++ threads to call back into Python. If yes, what
are the issues (apart from the handle leak mentioned before)?
If no, I would like to live with the above workaround.

Thanks in advance
msg23797 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2005-01-06 07:19
What does Semaphore have to do with _DummyThread and 
currentThread?

And as for win32api, that is not part of the Python stdlib and thus I have 
no way of commenting on that.

For C++ threads to call into Python code, I would not say that is bad 
specifically.  You were just trying to get thread info for non-Python 
threads and that was leading to _DummyThread instances to be created 
as designed and you just didn't want that.  Calling Python code is fine, 
just don't expect it to know about your C++ threads as much as you 
seem to want it to.

And please leave this bug closed.  Feel free to submit a patch to change
the semantics, but that does not affect this bug report.
msg23798 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-01-06 17:16
Presumably, by changing threading.Semaphore to stop using 
any code from threading.py, then threading.currentThread() 
never gets called and so a _DummyThread is never created 
then.

I expect the reason a _DummyThread causes Event leaks is 
just that Thread.__init__ always ends up allocating a Python 
lock (Thread.__block), which allocates a Windows Event 
under the covers.

It *could* be that Thread.__block is never actually used for
dummy threads, in which case we could avoid allocating it in
that case (or could get rid of it right away in
_DummyThread.__init__).  The dummy thread would still clog
the _active dict, but wouldn't leak Events then.
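
A toy model of that idea, not the real threading classes: ToyThread and ToyDummyThread are hypothetical names, and threading.Lock() stands in for whatever object Thread.__block actually holds. The point it illustrates is that a double-underscore attribute set in the base class is name-mangled, so the subclass has to delete it under the mangled name.

```python
import threading

class ToyThread:
    """Hypothetical stand-in for Thread; not the real class."""

    def __init__(self):
        # Stored as _ToyThread__block because of private name mangling.
        self.__block = threading.Lock()

class ToyDummyThread(ToyThread):
    """Hypothetical stand-in for _DummyThread."""

    def __init__(self):
        ToyThread.__init__(self)
        # The lock is never used by a dummy thread, so drop the reference
        # and let the lock object be reclaimed.
        del self._ToyThread__block

regular = ToyThread()
dummy = ToyDummyThread()
print(hasattr(regular, "_ToyThread__block"), hasattr(dummy, "_ToyThread__block"))
```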

It's certainly true that Python has no way to know when a 
thread it didn't start goes away.
msg23799 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2005-01-07 03:36
Ah, I didn't notice that Semaphore uses a Condition lock, which uses an
RLock, which calls currentThread and gets a lock from
thread.allocate_lock, which probably uses an Event on Windows.  I also
noticed that if __debug__ is set then the _note method uses it as well.

It looks like Thread.__block can't be used for a _DummyThread, since
the only places self.__block is used are Thread.__stop (which is called
in Thread.__bootstrap, which is called by Thread.start, which will raise an
AssertionError since Thread.__started will be set to True thanks to
_DummyThread.__init__) and Thread.join, which is overridden in
_DummyThread.  So it looks like deleting the key should be safe in
_DummyThread.__init__.

Probably wouldn't hurt to delete self.__stderr while we are at it since it 
never gets used either and thus is basically a ref leak.  Sound good to 
you, Tim?
msg23800 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2005-01-07 04:15
Right, thread.allocate_lock() does allocate an Event on 
Windows.  On other platforms it allocates other kinds of 
limited resource (one way or another, it has to grab some 
kind of locking object from the operating system, and the 
supply of those is typically finite).

Looking it over, I agree Thread.__block won't be used by a 
dummy thread.  I also <wink> note that the use of "assert 
False" in _DummyThread.join() is plain bizarre.  It's a user 
error to try to join a dummy thread, not an internal invariant 
we believe *cannot* occur.  Guido misused (IMO) `assert` in 
several places in this module, but its use in join() is way over 
the edge.
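
To make the distinction concrete, here is a small sketch with a hypothetical ToyDummyThread class (not the real _DummyThread): an assert disappears when Python runs with -O, while an explicit exception always reports the caller's mistake. RuntimeError is an assumed choice of exception for illustration.

```python
class ToyDummyThread:
    """Hypothetical stand-in for _DummyThread; not the real class."""

    def join_with_assert(self, timeout=None):
        # Removed entirely under "python -O", so the misuse passes silently.
        assert False, "cannot join a dummy thread"

    def join_with_raise(self, timeout=None):
        # Reports the user error regardless of optimization flags.
        raise RuntimeError("cannot join a dummy thread")

try:
    ToyDummyThread().join_with_raise()
except RuntimeError as exc:
    message = str(exc)
print(message)
```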

Anyway, ya, I think it would be good to change this.  I don't 
really understand the point to nuking Thread.__stderr -- 
typically, sys.stderr is a single object over the life of a 
program, and doesn't go away until the program ends.  That 
is, I doubt the reference is keeping anything alive that would 
have gone away otherwise.
msg23801 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2005-01-08 02:47
OK, rev. 1.46 for 2.5 has the fix (now this bug is closed).  I can backport 
to 2.4 and 2.3, but it technically changes the interface and it is not a 
huge issue so I am not going to bother to backport unless people really 
feel it is necessary.

As for the assert use, yeah, that is odd.  =)  But that whole module could 
use a rewrite.  Probably a good thing to do for Python 3.

The reason I suggested the self.__stderr deletion is because there is a 
chance that someone set sys.stderr to a file or some other object and not 
to sys._stderr.  Thought it would be a more thorough fix, but not a big 
thing to me; just happened to notice it.
History
Date User Action Args
2022-04-11 14:56:08 admin set github: 41357
2004-12-22 10:07:52 saravanand create