Created on 2010-09-19 23:57 by pitrou, last changed 2010-09-20 22:53 by pitrou. This issue is now closed.
|msg116893||Author: Antoine Pitrou (pitrou) *||Date: 2010-09-19 23:57|
test_finalize_with_trace (in test_threading) sometimes fails because of failing to destroy the GIL (in _PyEval_FiniThreads()). This can be reproduced quite reliably by launching several copies in parallel:

$ ./python -m test.regrtest -j10 -F test_threading
[...]
test test_threading failed -- Traceback (most recent call last):
  File "/home/antoine/py3k/__svn__/Lib/test/test_threading.py", line 334, in test_finalize_with_trace
    "Unexpected error: " + ascii(stderr))
AssertionError: Unexpected error: b'Fatal Python error: pthread_mutex_destroy(gil_mutex) failed\n'

What happens is that pthread_mutex_destroy() fails with EBUSY. According to the POSIX man page:

“[EBUSY] The implementation has detected an attempt to destroy the object referenced by mutex while it is locked or referenced (for example, while being used in a pthread_cond_timedwait() or pthread_cond_wait()) by another thread.”

After a bit of tracing, it becomes clear that Py_Finalize() calls _PyEval_FiniThreads() while another thread is taking the GIL (take_gil()). Unfortunately, this is not a situation we can avoid, since we rely on process exit to kill lingering threads: arbitrary CPython code may still be running in parallel while we are finalizing interpreter structures.

Therefore, it is likely that _PyEval_FiniThreads() should avoid destroying the mutex at all. Indeed, if we destroy the mutex, it is possible that a lingering thread tries to retake the GIL after waking up from a system call (Py_END_ALLOW_THREADS), and fails because of another fatal error ("Fatal Python error: pthread_mutex_lock(gil_mutex) failed").
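The EBUSY behaviour described above can be observed in isolation with a small sketch. This is not CPython code: demo_destroy_while_locked() is a hypothetical helper written for illustration, and it holds the mutex from the same thread rather than from a lingering one. Implementations are allowed, but not required, to detect a destroy on a locked mutex, so the helper may return either EBUSY or 0 depending on the C library.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not part of CPython): tries to destroy a mutex
 * that is still locked and returns what pthread_mutex_destroy() reports.
 * Returns -1 if the mutex could not be set up at all. */
int demo_destroy_while_locked(void)
{
    pthread_mutex_t m;
    int rc;

    if (pthread_mutex_init(&m, NULL) != 0)
        return -1;
    if (pthread_mutex_lock(&m) != 0) {
        pthread_mutex_destroy(&m);
        return -1;
    }

    /* The mutex is still held here, which mirrors the situation where
     * another thread is inside take_gil() during finalization. */
    rc = pthread_mutex_destroy(&m);

    if (rc != 0) {
        /* The destroy was refused (typically EBUSY): release the lock
         * and destroy the mutex properly. */
        pthread_mutex_unlock(&m);
        pthread_mutex_destroy(&m);
    }
    /* If rc == 0 the library did not detect the problem and the mutex
     * is already gone, so it must not be touched again. */
    return rc;
}
```

Calling demo_destroy_while_locked() from a program linked with -pthread and comparing the result against EBUSY shows whether the local C library performs this check.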
|msg116913||Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *||Date: 2010-09-20 06:40|
Looks similar to issue1856.
|msg116925||Author: Antoine Pitrou (pitrou) *||Date: 2010-09-20 11:32|
It is similar, but the issue is slightly different. Fixing issue1856 with the proposed resolution (the "stay_forever" flag) won't solve this, because the GIL mutex will still refuse to be destroyed if other threads reference it at the same time.
|msg116926||Author: Antoine Pitrou (pitrou) *||Date: 2010-09-20 11:42|
Moving the _PyEval_FiniThreads() call to Py_Initialize() solves the issue:

diff -r 9e49082da463 Python/pythonrun.c
--- a/Python/pythonrun.c Mon Sep 20 12:46:56 2010 +0200
+++ b/Python/pythonrun.c Mon Sep 20 13:41:47 2010 +0200
@@ -217,8 +217,15 @@ Py_InitializeEx(int install_sigs)
         Py_FatalError("Py_Initialize: can't make first thread");
     (void) PyThreadState_Swap(tstate);
-    /* auto-thread-state API, if available */
 #ifdef WITH_THREAD
+    /* We can't call _PyEval_FiniThreads() in Py_Finalize because
+       destroying the GIL might fail when it is being referenced from
+       another running thread (see issue #9901).
+       Instead we destroy the previously created GIL here, which ensures
+       that we can call Py_Initialize / Py_Finalize multiple times. */
+    _PyEval_FiniThreads();
+
+    /* Auto-thread-state API */
     _PyGILState_Init(interp, tstate);
 #endif /* WITH_THREAD */
@@ -514,10 +521,6 @@ Py_Finalize(void)
     PyGrammar_RemoveAccelerators(&_PyParser_Grammar);
-#ifdef WITH_THREAD
-    _PyEval_FiniThreads();
-#endif
-
 #ifdef Py_TRACE_REFS
     /* Display addresses (& refcnts) of all objects still alive.
      * An address can be used to find the repr of the object, printed
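The idea behind the patch (leave the lock alone at finalization and destroy the previous one at the next initialization) can be sketched outside CPython as follows. All names here (demo_lock, demo_initialize(), demo_finalize(), demo_destroy_count) are hypothetical and only illustrate the pattern. The sketch also assumes, unlike CPython, that initialization itself creates the lock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GIL and its bookkeeping. */
static pthread_mutex_t demo_lock;
static bool demo_lock_created = false;
static int demo_destroy_count = 0;   /* for illustration only */

/* Destroy the lock left over from a previous init/finalize cycle,
 * if there is one. Returns 0 on success or a pthread error code. */
static int demo_fini_lock(void)
{
    int rc;

    if (!demo_lock_created)
        return 0;
    rc = pthread_mutex_destroy(&demo_lock);
    if (rc == 0) {
        demo_lock_created = false;
        demo_destroy_count++;
    }
    return rc;
}

/* Initialization first cleans up the old lock, then creates a new one. */
int demo_initialize(void)
{
    int rc = demo_fini_lock();

    if (rc != 0)
        return rc;
    rc = pthread_mutex_init(&demo_lock, NULL);
    if (rc == 0)
        demo_lock_created = true;
    return rc;
}

/* Finalization deliberately does not destroy the lock: lingering
 * threads may still hold it or be waiting on it. */
void demo_finalize(void)
{
}
```

With this layout, the lock from the final cycle is never destroyed explicitly and is simply reclaimed at process exit, which matches the reasoning in the patch comment about relying on process exit for lingering threads.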
|msg116968||Author: Antoine Pitrou (pitrou) *||Date: 2010-09-20 20:14|
Committed in r84927.
|2010-09-20 22:53:51||pitrou||set||status: pending -> closed|
|2010-09-20 20:14:07||pitrou||set||status: open -> pending; messages: + msg116968; stage: needs patch -> committed/rejected|
|2010-09-20 11:42:07||pitrou||set||messages: + msg116926|
|2010-09-20 11:32:58||pitrou||set||messages: + msg116925|
|2010-09-20 06:40:52||amaury.forgeotdarc||set||messages: + msg116913|