classification
Title: GIL destruction can fail
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.2
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: amaury.forgeotdarc, pitrou
Priority: normal
Keywords:

Created on 2010-09-19 23:57 by pitrou, last changed 2010-09-20 22:53 by pitrou. This issue is now closed.

Messages (5)
msg116893 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-19 23:57
test_finalize_with_trace (in test_threading) sometimes fails because the GIL cannot be destroyed (in _PyEval_FiniThreads()). This can be reproduced quite reliably by launching several copies in parallel:

$ ./python -m test.regrtest -j10 -F test_threading
[...]
test test_threading failed -- Traceback (most recent call last):
  File "/home/antoine/py3k/__svn__/Lib/test/test_threading.py", line 334, in test_finalize_with_trace
    "Unexpected error: " + ascii(stderr))
AssertionError: Unexpected error: b'Fatal Python error: pthread_mutex_destroy(gil_mutex) failed\n'


What happens is that pthread_mutex_destroy() fails with EBUSY. According to the POSIX man page:

“[EBUSY]
    The implementation has detected an attempt to destroy the object referenced by mutex while it is locked or referenced (for example, while being used in a pthread_cond_timedwait() or pthread_cond_wait()) by another thread.”
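
For illustration, here is a minimal sketch of the condition quoted above, reduced to a single thread. The helper name destroy_while_locked() is hypothetical and not part of CPython. POSIX leaves destroying a locked mutex undefined, so the result depends on the platform: some implementations report EBUSY, others may simply return 0.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: try to destroy a mutex that is still locked and
 * return whatever pthread_mutex_destroy() reports.
 * Returns -1 if the mutex could not be set up at all. */
int destroy_while_locked(void)
{
    pthread_mutex_t m;
    int rc;

    if (pthread_mutex_init(&m, NULL) != 0)
        return -1;
    if (pthread_mutex_lock(&m) != 0)
        return -1;

    /* The mutex is held here, so an implementation that checks for
     * this may refuse with EBUSY instead of destroying it. */
    rc = pthread_mutex_destroy(&m);

    if (rc == EBUSY) {
        /* The mutex is still valid: release it and destroy it properly. */
        pthread_mutex_unlock(&m);
        pthread_mutex_destroy(&m);
    }
    return rc;
}
```

In the failing test the mutex is referenced by a different thread (inside take_gil()) rather than by the caller, but the error path is the same: the destroy call returns a non-zero code and the interpreter turns it into a fatal error.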


After a bit of tracing, it becomes clear that Py_Finalize() calls _PyEval_FiniThreads() while another thread is taking the GIL (take_gil()). Unfortunately, this is not a situation we can avoid, since we rely on process exit to kill lingering threads: arbitrary CPython code may still be running in parallel while we are finalizing interpreter structures.

Therefore, _PyEval_FiniThreads() should probably not destroy the mutex at all. Indeed, if we do destroy it, a lingering thread may try to retake the GIL after waking up from a system call (Py_END_ALLOW_THREADS) and fail with another fatal error ("Fatal Python error: pthread_mutex_lock(gil_mutex) failed").
msg116913 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-09-20 06:40
looks similar to issue1856
msg116925 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-20 11:32
It is similar, but the issue is slightly different. Fixing issue1856 with the proposed resolution (the "stay_forever" flag) won't solve this, because the GIL mutex will still refuse to be destroyed if other threads reference it at the same time.
msg116926 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-20 11:42
Moving the _PyEval_FiniThreads() call from Py_Finalize() to Py_Initialize() (more precisely, Py_InitializeEx()) solves the issue:

diff -r 9e49082da463 Python/pythonrun.c
--- a/Python/pythonrun.c        Mon Sep 20 12:46:56 2010 +0200
+++ b/Python/pythonrun.c        Mon Sep 20 13:41:47 2010 +0200
@@ -217,8 +217,15 @@ Py_InitializeEx(int install_sigs)
         Py_FatalError("Py_Initialize: can't make first thread");
     (void) PyThreadState_Swap(tstate);
 
-    /* auto-thread-state API, if available */
 #ifdef WITH_THREAD
+    /* We can't call _PyEval_FiniThreads() in Py_Finalize because
+       destroying the GIL might fail when it is being referenced from
+       another running thread (see issue #9901).
+       Instead we destroy the previously created GIL here, which ensures
+       that we can call Py_Initialize / Py_Finalize multiple times. */
+    _PyEval_FiniThreads();
+
+    /* Auto-thread-state API */
     _PyGILState_Init(interp, tstate);
 #endif /* WITH_THREAD */
 
@@ -514,10 +521,6 @@ Py_Finalize(void)
 
     PyGrammar_RemoveAccelerators(&_PyParser_Grammar);
 
-#ifdef WITH_THREAD
-    _PyEval_FiniThreads();
-#endif
-
 #ifdef Py_TRACE_REFS
     /* Display addresses (& refcnts) of all objects still alive.
      * An address can be used to find the repr of the object, printed
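
The idea behind the patch can be sketched outside of CPython as follows. This is a simplified illustration, not the interpreter's code: every demo_* name is hypothetical, and the lock is created directly in the initialization function, which is an assumption made to keep the example short. Finalization leaves the lock alone, and the leftover lock from the previous cycle is destroyed at the next initialization.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_lock;   /* hypothetical stand-in for gil_mutex */
static int demo_lock_created = 0;   /* nonzero while demo_lock is initialized */

/* Destroy the lock if a previous init/finalize cycle left one behind. */
static void demo_destroy_lock(void)
{
    if (!demo_lock_created)
        return;
    int rc = pthread_mutex_destroy(&demo_lock);
    assert(rc == 0);
    demo_lock_created = 0;
}

/* Initialization: clean up the previous lock first, then create a new one. */
void demo_initialize(void)
{
    demo_destroy_lock();
    int rc = pthread_mutex_init(&demo_lock, NULL);
    assert(rc == 0);
    demo_lock_created = 1;
}

/* Finalization: deliberately does NOT destroy the lock, because another
 * thread may still be waiting on it or about to acquire it. */
void demo_finalize(void)
{
}

/* Report whether the lock currently exists. */
int demo_lock_is_alive(void)
{
    return demo_lock_created;
}

/* Acquire and release the lock once; returns 0 on success. */
int demo_lock_roundtrip(void)
{
    int rc = pthread_mutex_lock(&demo_lock);
    if (rc != 0)
        return rc;
    return pthread_mutex_unlock(&demo_lock);
}
```

With this arrangement, a thread that wakes up after demo_finalize() can still lock and unlock the mutex, and calling demo_initialize() / demo_finalize() repeatedly does not accumulate locks, since each initialization disposes of the previous one.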
msg116968 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-20 20:14
Committed in r84927.
History
Date                 User                Action  Args
2010-09-20 22:53:51  pitrou              set     status: pending -> closed
2010-09-20 20:14:07  pitrou              set     status: open -> pending; resolution: fixed; messages: + msg116968; stage: needs patch -> resolved
2010-09-20 11:42:07  pitrou              set     messages: + msg116926
2010-09-20 11:32:58  pitrou              set     messages: + msg116925
2010-09-20 06:40:52  amaury.forgeotdarc  set     messages: + msg116913
2010-09-19 23:57:40  pitrou              create