classification
Title: Fix for 754449: hiding exc in threading during interp shutdn
Type: Stage:
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brett.cannon Nosy List: brett.cannon, tim.peters
Priority: normal Keywords: patch

Created on 2004-05-16 18:43 by brett.cannon, last changed 2004-07-03 20:00 by brett.cannon. This issue is now closed.

Files
File name Uploaded Description Edit
exc_threading_fix2.diff brett.cannon, 2004-07-03 01:11 diff'ed on 2004-07-02: fixes stderr assignment typo
Messages (9)
msg45979 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-05-16 18:43
Decided to start a fresh tracker item for this patch since I already 
attempted to close bug 754449 once before and I want to make 
sure the two fixes stay separate.

Someone emailed me a program whose testing suite could reliably 
reproduce the problem as reported in the bug report (for a quck 
refresher, basically if someone leaves a threading.Thread instance 
running while the interpreter is being shutdown and an exception is 
raised in the thread then that exception will cause  *another* 
exception in the subsequent attempt to report the exception from 
the thread because all globals have been set to None at that point).  
The previous fix that I checked in turned out not to be thorough 
enough (still needed, though).  I am currently waiting for the 
person who emailed me the triggering code to let me know if I can 
publish how to cause the error.

Attached is a patch to remove all instances of accessing globals 
(just as if I was writing a __del__ method for threading.Thread) by 
storing copies of sys.stderr, sys.exc_info, and sys.tracebacklimit.  
All of this is so that a decent approximation of a traceback print-
out can occur without having to call the 'traceback' module since its 
functions depend a lot on each other.

I mainly need someone else who is more familiar with dealing with 
tracebacks to make sure that I have not done something stupid and 
to make sure this is the best way to handle things.
msg45980 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-05-17 04:37
Logged In: YES 
user_id=357491

Just posted how to reproduce the bug using the in-dev version of mnet as 
a post to the bug tracker item.
msg45981 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-07-01 04:33
Logged In: YES 
user_id=357491

Uploading a new version of the patch to store one of the variables in the 
class instead of the instance since it it never expected to change.  Also 
changed the comment.

Still waiting for someone else to take a look at this code before I check it 
in.
msg45982 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-07-02 03:04
Logged In: YES 
user_id=31435

Brett, it looks like self.__stderr is never given a value -- the 
__init__ code here binds a *local* vrbl __stderr to 
_sys.stderr.  If so, that tells me the bulk of the patch hasn't 
actually been exercised (else it would have raised 
AttributeError trying to print to self.__stderr).

To simplify it more, I think it would be fine to ignore 
sys.tracebacklimit.  A Google search shows that virtually the 
only use of sys.tracebacklimit in real life is the use made of it 
here:  passively accessing its value to simulate what it would 
have done if anyone had ever bothered to set it (of which I 
couldn't find even one example).
msg45983 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-07-03 01:11
Logged In: YES 
user_id=357491

You're right on the self.__stderr assignment snafu; typo from when I did 
the last change to move exc_info to the class instead of the instance.  I 
hate running the Mnet test suite to trigger the bug since it is long and is 
a major resource hog.  Fixed the bug, though, and ran the testing suite; 
looks good.  Here is a sample output:

Exception in thread The Asyncore Poller (most likely raised during 
interpreter shutdown):
Traceback (most recent call last):
  File "/Users/drifty/Code/compiled_CVS/lib/python2.4/threading.py", line 
442, in __bootstrap
  File "/Users/drifty/Code/compiled_CVS/lib/python2.4/threading.py", line 
422, in run
  File "/Users/drifty/Code/Python/mnet/repos/mnet/pyutil/Asyncore.py", 
line 133, in _asyncoreloop
  File "/Users/drifty/Code/Python/mnet/repos/mnet/pyutil/debugprint.py", 
line 59, in write
exceptions.TypeError: 'NoneType' object is not callable

As you will notice the output isn't exactly standard (extra warning about 
how exception could have come about and the addition of "exceptions" 
for the listing of what the exception was), but nothing glaring.  Is it even 
worth trying to ditch the appending of "exception"?  It would be another 
thing to store in the class (types.ClassType if I follow traceback.py).

I also took your advice and ripped out the traceback use.  So now the 
class stores a reference to _sys.exc_info and the instance stores a ref to 
_sys.stderr .
msg45984 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-07-03 02:27
Logged In: YES 
user_id=31435

IMO, this is such a huge improvement over the status quo 
that it's not worth worrying about niggling format differences 
between this output and "the usual" traceback output.  The 
*information* is the important thing, and this patch is the 
difference between no info and all the info there is.  Check it 
in!
msg45985 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-07-03 03:56
Logged In: YES 
user_id=357491

In as rev. 1.42  (and two commits to Misc/NEWS; pain to explain this 
thing clearly).

Do you think this should be backported?  It should apply cleanly, but it 
does add a class and instance attribute, so the answer doesn't seem 
obvious to me.  I say "yes", though.
msg45986 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2004-07-03 04:07
Logged In: YES 
user_id=31435

I never care about old versions of Python, and 2.3 is ancient 
to me now <wink>.  But the lack of sane error messages 
when threads are dying is certainly a serious bug, and this is 
certainly a bugfix -- it's a fine candidate for backporting.
msg45987 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2004-07-03 20:00
Logged In: YES 
user_id=357491

Rev. 1.38.6.1 has the patch along with a backport of rev. 1.41 from HEAD 
(my first attempt at this that fell short).

Thanks for the help on this, Tim.
History
Date User Action Args
2004-05-16 18:43:48brett.cannoncreate