classification
Title: Poll call in multiprocessing/forking.py is not thread safe. Results in "OSError: [Errno 10] No child processes" exceptions.
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.6, Python 2.5
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Gary.Yee, brian.curtin
Priority: normal Keywords:

Created on 2011-04-20 17:17 by Gary.Yee, last changed 2011-04-20 18:26 by brian.curtin. This issue is now closed.

Files
File name Uploaded Description
foo.py Gary.Yee, 2011-04-20 17:17 Test script
Messages (4)
msg134165 - Author: Gary Yee (Gary.Yee) Date: 2011-04-20 17:17
Background: 

I'm using multiprocessing not to run jobs in parallel, but to run functions in a different process space so they can run as a different user.  I am thus using multiprocessing in a multithreaded (Linux) application.

Problem:

In multiprocessing/forking.py the poll() function is not thread safe.  If multiple threads call poll(), you can get two back-to-back calls to os.waitpid() on the same PID: the first call reaps the child, so the second fails with the error below.  This happens frequently when multiprocessing's _cleanup() function is called.

Traceback (most recent call last):
  File "/opt/scyld/foo.py", line 178, in call
    pool = Pool(processes=1)
  File "/opt/scyld/python/2.6.5/lib/python2.6/multiprocessing/__init__.py", line 227, in Pool
    return Pool(processes, initializer, initargs)
  File "/opt/scyld/python/2.6.5/lib/python2.6/multiprocessing/pool.py", line 104, in __init__
    w.start()
  File "/opt/scyld/python/2.6.5/lib/python2.6/multiprocessing/process.py", line 99, in start
    _cleanup()
  File "/opt/scyld/python/2.6.5/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
    if p._popen.poll() is not None:
  File "/opt/scyld/python/2.6.5/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
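
The underlying failure can be reproduced without threads or multiprocessing. The following is a minimal sketch, assuming a POSIX system with the default SIGCHLD handling; the function name reap_twice is made up for illustration. It forks a child, reaps it, and then calls os.waitpid() a second time on the same PID, which is what the second thread ends up doing in the race described above.

```python
import errno
import os


def reap_twice():
    """Fork a child, reap it, then try to reap it a second time.

    Returns the errno of the second os.waitpid() call, or None if that
    call unexpectedly succeeds.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, skipping cleanup handlers

    os.waitpid(pid, 0)  # first call reaps the child
    try:
        os.waitpid(pid, 0)  # second call: the child no longer exists
    except OSError as e:
        return e.errno
    return None


if __name__ == "__main__":
    # ECHILD is the symbolic name for "No child processes".
    print(reap_twice() == errno.ECHILD)
```

On Linux errno.ECHILD has the value 10, which matches the "[Errno 10]" in the traceback.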


Suggested Fix:

Wrap the os.waitpid() call in a try/except block that catches OSError with errno 10 (ECHILD) and, in that event, returns the returncode currently available.  The one potential problem this introduces arises if someone calls os.waitpid() on that PID without going through forking.py: self.returncode will then never be set to a non-None value.  If you're using the multiprocessing module to create processes, however, you should also be using it to clean up after them.

I've attached a test file.
msg134166 - Author: Gary Yee (Gary.Yee) Date: 2011-04-20 17:21
Here's how I changed poll() in multiprocessing/forking.py:

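        def poll(self, flag=os.WNOHANG):
            if self.returncode is None:
                try:
                    pid, sts = os.waitpid(self.pid, flag)
                except OSError as e:
                    # ECHILD (errno 10 on Linux): the child was already
                    # reaped, e.g. by another thread calling poll().
                    # Needs "import errno" at the top of the module.
                    if e.errno == errno.ECHILD:
                        return self.returncode
                    else:
                        raise
                if pid == self.pid:
                    if os.WIFSIGNALED(sts):
                        # Killed by a signal: report the negated signal number.
                        self.returncode = -os.WTERMSIG(sts)
                    else:
                        assert os.WIFEXITED(sts)
                        self.returncode = os.WEXITSTATUS(sts)
            return self.returncode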
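
To try the change outside of multiprocessing, the same logic can be put into a small standalone class. This is only a sketch under stated assumptions: ChildHandle is a hypothetical stand-in for the Popen wrapper in forking.py, not the real class, and it assumes a POSIX system. It also shows the caveat mentioned in the suggested fix, namely that returncode stays None if the child is reaped behind the wrapper's back.

```python
import errno
import os


class ChildHandle(object):
    """Hypothetical stand-in for the Popen wrapper in forking.py."""

    def __init__(self, exit_code=0):
        self.returncode = None
        self.pid = os.fork()
        if self.pid == 0:
            os._exit(exit_code)  # child: exit at once with the given status

    def poll(self, flag=os.WNOHANG):
        if self.returncode is None:
            try:
                pid, sts = os.waitpid(self.pid, flag)
            except OSError as e:
                if e.errno == errno.ECHILD:
                    # Already reaped elsewhere; the exit status is lost,
                    # so report whatever is recorded (possibly None).
                    return self.returncode
                raise
            if pid == self.pid:
                if os.WIFSIGNALED(sts):
                    self.returncode = -os.WTERMSIG(sts)
                else:
                    self.returncode = os.WEXITSTATUS(sts)
        return self.returncode


if __name__ == "__main__":
    normal = ChildHandle(exit_code=3)
    print(normal.poll(flag=0))   # blocking wait through the wrapper

    stolen = ChildHandle(exit_code=3)
    os.waitpid(stolen.pid, 0)    # reaped without going through poll()
    print(stolen.poll())         # no exception, but no exit status either
```

The second case no longer raises, but the exit status is gone for good, so callers that loop until poll() returns a non-None value would need some other way to notice that the child has exited.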
msg134168 - Author: Gary Yee (Gary.Yee) Date: 2011-04-20 17:50
It looks like this has been addressed in the svn trunk as part of issue #1731717.  I was using version 2.6.5.  Any chance that this gets backported?
msg134170 - Author: Brian Curtin (brian.curtin) * (Python committer) Date: 2011-04-20 17:53
2.6 is only receiving security fixes at the moment, so it won't make it into there.
History
Date                 User          Action  Args
2011-04-20 18:26:23  brian.curtin  set     status: open -> closed; stage: resolved; resolution: out of date; versions: - Python 3.4
2011-04-20 17:53:39  brian.curtin  set     nosy: + brian.curtin; messages: + msg134170
2011-04-20 17:50:55  Gary.Yee      set     messages: + msg134168
2011-04-20 17:21:47  Gary.Yee      set     messages: + msg134166; versions: + Python 2.6, Python 2.5, Python 3.4, - Python 3.1, Python 2.7, Python 3.2, Python 3.3
2011-04-20 17:19:08  brian.curtin  set     type: crash -> behavior; versions: - Python 2.6, Python 2.5, Python 3.4
2011-04-20 17:17:35  Gary.Yee      create