classification
Title: Deadlock during the "import" in the fork()'ed child process if fork() happened while import_lock was held
Type: behavior Stage:
Components: Interpreter Core Versions: Python 2.6, Python 2.5, Python 2.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: abaron, astrand, brett.cannon, gregory.p.smith, hdn, kosuha, loewis, michaeltsai, ronaldoussoren (9)
Priority: Keywords

Created on 2009-06-29 22:52 by hdn, last changed 2009-06-30 06:11 by hdn.

Messages (3)
msg89892 - Author: Dmitriy Khramtsov (hdn) Date: 2009-06-29 22:52
Greetings,

Python 2.4 and 2.5 contain a deadlock: on Linux, fork() can happen while
another thread holds import_lock, and the lock is not reset in the
child.


The proof-of-concept code is:

--BEGIN (import_lock.py)--

#!/usr/bin/python2.4

import os
import time
import threading

class SecondThread(threading.Thread):
  def run(self):
    # Give the main thread time to hold import_lock and start importing.
    time.sleep(1)

    # Fork the process while holding import_lock in the main thread.
    pid = os.fork()

    if pid == 0:  # Child process
      print "child begin"

      # The import lock is still held by the main thread, which is not part
      # of the child process.  The import lock will never be released in the
      # child process.  Effectively, any import is a deadlock from now on.
      import types

      # This statement will never be executed.
      print "child end"

def main():
  second_thread = SecondThread()
  second_thread.start()

  # Take the import_lock and then release global interpreter lock in the
  # import_lock_helper module by calling any blocking operation.
  import import_lock_helper

  second_thread.join()

main()

--END (import_lock.py)--


--BEGIN (import_lock_helper.py)--

#!/usr/bin/python2.4

import time

# Release the global interpreter lock by calling any blocking operation.
time.sleep(10)

--END (import_lock_helper.py)--


The stack of the child Python interpreter at the time of the deadlock:

(gdb) bt
#0  0xffffe410 in __kernel_vsyscall ()
#1  0xf7f81700 in sem_wait@GLIBC_2.0 () from /usr/grte/v1/lib/libpthread.so.0
#2  0x081ab500 in ?? ()
#3  0x080e1855 in PyThread_acquire_lock (lock=0x0, waitflag=1) at ../../Python/thread_pthread.h:313
#4  0x080d1f3b in lock_import () at ../../Python/import.c:247
#5  0x080d52a4 in PyImport_ImportModuleEx (name=0xf7e0f8f4 "types", globals=0xf7def824, locals=0x8123cb8, fromlist=0x8123cb8) at ../../Python/import.c:1976
#6  0x080af2d0 in builtin___import__ (self=0x0, args=0xf7db7cd4) at ../../Python/bltinmodule.c:45
#7  0x08058d77 in PyObject_Call (func=0x0, arg=0xf7db7cd4, kw=0x0) at ../../Objects/abstract.c:1795
#8  0x080b30ec in PyEval_CallObjectWithKeywords (func=0xf7ddfd6c, arg=0xf7db7cd4, kw=0x0) at ../../Python/ceval.c:3435
#9  0x080b5ca6 in PyEval_EvalFrame (f=0x8167a04) at ../../Python/ceval.c:2020
#10 0x080b942c in PyEval_EvalFrame (f=0x81ab57c) at ../../Python/ceval.c:3651
. . . .

(gdb) pystack
import_lock.py (26): run
/usr/lib/python2.4/threading.py (443): __bootstrap


The code directly responsible for import locking (Python/import.c):

--BEGIN--
static PyThread_type_lock import_lock = 0;
static long import_lock_thread = -1;
static int import_lock_level = 0;

static void
lock_import(void)
{
       long me = PyThread_get_thread_ident();
       if (me == -1)
               return; /* Too bad */
       if (import_lock == NULL) {
               import_lock = PyThread_allocate_lock();
               if (import_lock == NULL)
                       return;  /* Nothing much we can do. */
       }
       if (import_lock_thread == me) {
               import_lock_level++;
               return;
       }
       if (import_lock_thread != -1 ||
           !PyThread_acquire_lock(import_lock, 0))
       {
               PyThreadState *tstate = PyEval_SaveThread();
               PyThread_acquire_lock(import_lock, 1);
               PyEval_RestoreThread(tstate);
       }
       import_lock_thread = me;
       import_lock_level = 1;
}

static int
unlock_import(void)
{
       long me = PyThread_get_thread_ident();
       if (me == -1 || import_lock == NULL)
               return 0; /* Too bad */
       if (import_lock_thread != me)
               return -1;
       import_lock_level--;
       if (import_lock_level == 0) {
               import_lock_thread = -1;
               PyThread_release_lock(import_lock);
       }
       return 1;
}

/* This function is called from PyOS_AfterFork to ensure that newly
  created child processes do not share locks with the parent. */

void
_PyImport_ReInitLock(void)
{
#ifdef _AIX
       if (import_lock != NULL)
               import_lock = PyThread_allocate_lock();
#endif
}
--END--


A possible solution is to reset import_lock in _PyImport_ReInitLock()
not only on AIX but also on Linux, and perhaps on other platforms (does
anyone know why the _AIX-only guard is there?).

--CUT HERE--
void
_PyImport_ReInitLock(void)
{
       if (import_lock != NULL)
               import_lock = PyThread_allocate_lock();
}
--CUT HERE--

The proof-of-concept example above runs without deadlocking on a Python
interpreter rebuilt with the _PyImport_ReInitLock() modification above.
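
For readers who want to see the mechanism without forking, here is a
minimal Python 3 sketch that models the bookkeeping done by lock_import()
and unlock_import(). The names ImportLockModel and simulate_child are
hypothetical and are not part of CPython. A thread that exits while
holding the lock stands in for the parent's main thread, which does not
exist in the child. Replacing only the lock object, as the modified
_PyImport_ReInitLock() does, should be enough for the next acquire to
succeed.

```python
import threading


class ImportLockModel:
    """Toy model of import_lock, import_lock_thread and import_lock_level."""

    def __init__(self):
        self.lock = threading.Lock()
        self.owner = None  # models import_lock_thread (-1 in the C code)
        self.level = 0     # models import_lock_level

    def acquire(self, timeout=-1):
        me = threading.get_ident()
        if self.owner == me:
            # Re-entrant acquire by the owning thread.
            self.level += 1
            return True
        # The C code blocks forever here; the timeout lets the model report
        # the deadlock instead of hanging.
        if not self.lock.acquire(timeout=timeout):
            return False
        self.owner = me
        self.level = 1
        return True

    def release(self):
        if self.owner != threading.get_ident():
            return -1
        self.level -= 1
        if self.level == 0:
            self.owner = None
            self.lock.release()
        return 1

    def reinit(self):
        # Models the unconditional _PyImport_ReInitLock(): only the lock
        # object is replaced; owner and level are left untouched.
        self.lock = threading.Lock()


def simulate_child():
    model = ImportLockModel()
    # The "parent main thread" takes the lock and then disappears.
    holder = threading.Thread(target=model.acquire)
    holder.start()
    holder.join()
    blocked = model.acquire(timeout=0.1)  # stale lock: cannot be acquired
    model.reinit()
    fixed = model.acquire(timeout=0.1)    # fresh lock: acquired
    return blocked, fixed
```

Calling simulate_child() returns a pair of booleans: whether the acquire
succeeded before the reset, and whether it succeeded after it.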


This bug can also be worked around in Python code by acquiring
import_lock before fork() and releasing it right after fork() in both
the parent and the child.

The workaround code is:

--BEGIN (workaround_fork_import_bug.py)--

import imp
import os

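def __fork():
  # Hold the import lock across fork() so that no other thread can own it
  # at the moment the child process is created.
  imp.acquire_lock()
  try:
    return _os_fork()
  finally:
    # Runs in both the parent and the child.
    imp.release_lock()

# Install the wrapper only once, even if this module is executed again.
try:
  _os_fork
except NameError:
  _os_fork = os.fork
  os.fork = __fork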

--END (workaround_fork_import_bug.py)--
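
The same acquire/fork/release pattern can be sketched for an arbitrary
lock. The helper below is a hypothetical Python 3 illustration, not an
existing API: call_with_lock_held is an invented name, and fork_func is
whatever callable performs the fork (os.fork in the workaround above).
It assumes a plain threading.Lock, which has no owner and can therefore
be released by the child as well.

```python
import threading


def call_with_lock_held(lock, fork_func):
    """Call fork_func() while holding lock, then release the lock.

    When fork_func forks, the finally clause runs in both the parent and
    the child, so neither process is left with a held lock.
    """
    lock.acquire()
    try:
        return fork_func()
    finally:
        lock.release()
```

Because the forking thread itself owns the lock at the time of the
fork, no other thread can be holding it when the child is created.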


This workaround could also be implemented in C inside the Python
interpreter, which would be another possible solution for this bug.


Thanks,
Dmitriy


$ uname -srvmpio
Linux 2.6.24-gg24-generic #1 SMP Wed Apr 22 21:48:06 PDT 2009 x86_64
unknown unknown GNU/Linux

P.S. The problem described above probably causes (some of) the effects
described in http://bugs.python.org/issue1590864.
msg89907 - Author: Martin v. Löwis (loewis) Date: 2009-06-30 05:40
Does the problem also exist in Python 2.6? We will definitely not fix it
anymore for 2.4 and 2.5.
msg89909 - Author: Dmitriy Khramtsov (hdn) Date: 2009-06-30 06:11
> Does the problem also exist in Python 2.6? We will definitely not fix it
> anymore for 2.4 and 2.5.

Yep.  Exactly the same problem in Python 2.6.

This problem probably exists in all newer versions as well, but I
didn't explicitly test for that.
History
Date                 User             Action  Args
2009-06-30 06:11:54  hdn              set     messages: + msg89909; versions: + Python 2.6
2009-06-30 05:40:35  loewis           set     nosy: + loewis; messages: + msg89907
2009-06-29 23:02:10  gregory.p.smith  set     nosy: + gregory.p.smith
2009-06-29 22:52:13  hdn              create