Issue465673
Created on 2001-09-27 14:36 by anonymous, last changed 2022-04-10 16:04 by admin. This issue is now closed.
Messages (12) | |||
---|---|---|---|
msg6718 - (view) | Author: Nobody/Anonymous (nobody) | Date: 2001-09-27 14:36 | |
I've been playing around with Python and threads, and I've noticed some odd and often unstable behavior. In particular, on my Solaris 8 box I can get Python 1.5.2, 1.6, 2.0, or 2.1 to core dump every time with the following sequence. I've also seen this happen on Solaris 6 (all UltraSPARC based):
1. Enter the following code into the interactive interpreter:
--
import threading
def loopingfunc():
    while 1:
        pass
threading.Thread(target=loopingfunc).start()
--
2. Send a SIGINT signal (usually Ctrl-C, your terminal settings may vary). "KeyboardInterrupt" is displayed and so far everything looks fine.
3. Now simply press the <Enter> key to enter a blank line in the interpreter.
For my Solaris 8 box with the GNU readline 2.2 module present, this always ends up in a core dump. It may take a while, since at this point the readline signal handler is being re-entered recursively until the stack overflows.
I've described this problem in the past on Usenet, but didn't get much response. For a more complete discussion of the problem and a possible solution, see http://groups.google.com/groups?hl=en&threadm=98osml%24sul%241%40newshost.mot.com&rnum=1&prev=/groups%3Fas_ugroup%3Dcomp.lang.python%26as_uauthors%3DJason%2520Lowe (If the URL doesn't work, search groups.google.com for posts by "Jason Lowe" in comp.lang.python and view the entire thread of the result.)
Upon investigation, it looks like the problem is caused by an interaction between pthreads and signals. The SIGINT signal is delivered to the thread that is performing the spin loop, NOT the thread that is in the readline module. Because the readline module uses setjmp()/longjmp() for its signal handling, the longjmp() ends up being executed by the wrong thread, with dire results. Pthreads and signals don't mix very well, so one has to be very careful to make sure everything works properly. A typical solution is to ensure signals are only delivered to one thread by masking all signals in all other threads.
I believe this is also the root cause of bug #219772 (Interactive InterPreter+ Thread -> core dump at exit).
I was able to solve the problem by modifying Python/thread_pthread.h's PyThread_start_new_thread() to block all signals with pthread_sigmask() while the new thread is being created, so that the new thread inherits the blocked mask. This causes all threads created by Python except the initial thread to have all signals masked, which forces signals to be delivered to the main thread. I don't believe anyone is depending on the current behavior that signals will be delivered to an indeterminate thread, so this change seems safe. However, I haven't run many other Python applications that deal with threads and signals.
I propose that on platforms that implement Python threads with pthreads, the code masks all signals in all threads except the initial, main thread. This will resolve the problem of signals being delivered to threads indeterminately. I think I can dig up my initial code deltas if desired, or I can always recreate them. It's just a few lines to mask signals in the calling thread before thread creation, then restore them afterwards. (As a result, only the main thread keeps its signals unmasked.)
A side question from this is whether the thread module (or posix module?) should expose the pthread_sigmask() functionality to Python threads on a platform that uses pthreads. This would allow developers to manipulate the signal masks of the Python threads so that a particular signal can be routed to a particular thread. (They would mask this signal in all threads except the desired one.) |
|||
msg6719 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-09-27 14:40 | |
Logged In: YES user_id=56897 Ack. SourceForge wants to log me out every few minutes, so I wasn't logged in when I submitted this. Sorry 'bout that. |
|||
msg6720 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2001-09-28 16:08 | |
Logged In: YES user_id=6380 I don't have Solaris access, and I can't get this to break on Linux. But I agree with your suggestion that posix threads should block signals. Are you capable of coming up with a patch that does that, in a way that is independent of the specific platform (as long as it has PTHREADS)? You may have to open a new issue in the patch manager, since SF doesn't allow after-the-fact attachments to anonymous entries. (Maybe SF logs you out whenever you quit your browser? That's what it does for me. :-) |
|||
msg6721 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-10-03 14:32 | |
Logged In: YES user_id=56897 I'm working on a patch now. Unfortunately, I only have access to Solaris and Linux right now, but I'll test the patch on those. I might be able to scrounge up an HPUX machine as well. I'll post more info as I get it. Unfortunately, it appears I have to poll this issue for updates, so I might not respond right away to comments. The 'monitor' feature doesn't seem to work for me, among many other SourceForge things. If I wait about 3 minutes, SourceForge wants me to log back in if I click anything and I never seem to get any email notifications (but my email address listed for my account is correct). Weird. |
|||
msg6722 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2001-10-03 15:58 | |
Logged In: YES user_id=6380 To get email (to your @users.sf.net account), click on the Monitor button that appears at the top of the bug entry when you're logged in to SF. |
|||
msg6723 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-10-05 15:38 | |
Logged In: YES user_id=56897 I've checked the Monitor button while logged in, but I still do not receive email notification of updates. When I clicked it again, it said I was no longer monitoring, so I clicked it once more to switch back into monitoring mode. Apparently SF knows I'm monitoring it, but it still doesn't send me email. Mail to my @users.sf.net account does work, so I'm at a loss to explain why a) my login cookie doesn't stick around very long at all and b) I never get monitor email from SF.
Re: the patch, I have something that works well on Solaris. I'll try it on Linux today, but I don't have access to an HP-UX system. I'm a little concerned about the impact on HP-UX (pre 11.0 and post 11.0) and AIX, and I don't have access to those machines to check out those concerns. Hopefully I'll have the patch posted by today. |
|||
msg6724 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-10-05 17:09 | |
Logged In: YES user_id=56897 I've submitted the patch for pthread signal masking. My biggest concerns are the guesses I made for DCE threads and whether they will work for AIX which might need to use sigthreadmask(). Regarding reproducing this on Linux, I was able to get Linux to crash if I held down Ctrl-C (with fairly fast key repeat). After starting the spinning thread, Python would crash on Linux under a storm of SIGINTs within 30 seconds or so. Without the spinning thread, I couldn't get it to crash. With the patch applied, the spinning thread running during the storm of SIGINTs wouldn't crash it. So that implies the signal masking is doing something good even in the Linux case. Re: my SF problems, I submitted a few support requests. Hopefully something gets fixed. |
|||
msg6725 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-10-09 18:40 | |
Logged In: YES user_id=56897 Patch is #468347 [mask signals for non-main pthreads] |
|||
msg6726 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2001-10-12 21:52 | |
Logged In: YES user_id=6380 Since I've now applied your patch, I presume this is fixed, and I'm closing the bug report. Let me know if there are still problems. |
|||
msg6727 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-10-16 16:59 | |
Logged In: YES user_id=56897 I'll grab the 2.2b1 release when it is available and test it on the Solaris and Linux configurations we have. |
|||
msg6728 - (view) | Author: Jason Lowe (jasonlowe) | Date: 2001-10-24 14:41 | |
Logged In: YES user_id=56897 I've verified Python 2.2b1 fixes the thread-signal interaction on Solaris 6, Solaris 8, and RedHat Linux 7.1. Thanks for the quick patch application! |
|||
msg6729 - (view) | Author: Guido van Rossum (gvanrossum) * | Date: 2001-10-24 14:46 | |
Logged In: YES user_id=6380 Thanks for the followup! |
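The masking approach described in msg6718 (block signals in the creating thread, start the new thread, then restore the creator's mask) can be sketched from Python itself. This is only an illustration of the idea, not the C patch that was applied to Python/thread_pthread.h. It assumes a POSIX platform and Python 3.3 or later, where signal.pthread_sigmask() is available; the helper names start_thread_with_signals_masked and current_mask, and the default choice of SIGINT and SIGTERM, are hypothetical.

```python
import signal
import threading


def current_mask():
    """Return the calling thread's signal mask without changing it."""
    # SIG_BLOCK with an empty set blocks nothing and returns the old mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, [])


def start_thread_with_signals_masked(target, args=(), signals=None):
    """Start a thread that has `signals` blocked (hypothetical helper).

    A new thread inherits the signal mask of the thread that creates it,
    so the signals are blocked in the caller first and the caller's
    original mask is restored once the new thread has been started.
    """
    if signals is None:
        # Assumption for this sketch: only SIGINT and SIGTERM are masked.
        signals = {signal.SIGINT, signal.SIGTERM}
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signals)
    try:
        thread = threading.Thread(target=target, args=args)
        thread.start()
    finally:
        # Only the creating thread gets its previous mask back.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread


if __name__ == "__main__":
    seen = []
    worker = start_thread_with_signals_masked(
        lambda: seen.append(current_mask()))
    worker.join()
    print("SIGINT blocked in worker:", signal.SIGINT in seen[0])
    print("SIGINT blocked in creator:", signal.SIGINT in current_mask())
```

With the mask arranged this way, a process-directed SIGINT can only be delivered to a thread that has not blocked it, which in this sketch is the creating thread.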
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-10 16:04:28 | admin | set | github: 35243 |
2001-09-27 14:36:14 | anonymous | create |