classification
Title: subprocess is not thread-safe
Type: crash
Stage:
Components: Library (Lib)
Versions: Python 2.5

process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: race condition in subprocess module (View: 1731717)
Assigned To:
Nosy List: Chris.Gerhard, gvanrossum, kjd@duda.org
Priority: normal
Keywords:

Created on 2007-10-04 21:31 by kjd@duda.org, last changed 2010-11-08 16:26 by Chris.Gerhard. This issue is now closed.

Messages (2)
msg56230 - Author: Kenneth Duda (kjd@duda.org) Date: 2007-10-04 21:31
The following test program crashes:

========================================
import threading, sys, subprocess

# subprocess._cleanup = lambda: None

def doit():
   for i in xrange(0, 1000):
      p = subprocess.Popen( "true" )
      p.wait()

t = threading.Thread( target=doit )
t.start()
doit()

==============================

It crashes because when one thread calls subprocess.Popen(), subprocess
calls its _cleanup() function, which might reap the subprocess started
in another thread!  The other thread might be inside
subprocess.Popen.wait(), just about to call waitpid(), which then fails
with ECHILD and kills that thread.
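
The underlying failure can be reproduced in isolation, without threads: once a child has been reaped, a second waitpid() on the same pid fails with ECHILD. The following is a minimal sketch, assuming a POSIX system and a Python 3 interpreter (the report itself targets Python 2.5). The helper name reap_twice is made up for illustration and is not part of subprocess.

```python
import errno
import os


def reap_twice():
    """Reap one child twice and return the errno of the second attempt."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping interpreter cleanup.
        os._exit(0)
    # First reap succeeds; this stands in for _cleanup() running in
    # the "other" thread.
    os.waitpid(pid, 0)
    try:
        # Second reap stands in for the wait() call that lost the race.
        os.waitpid(pid, 0)
    except OSError as exc:
        return exc.errno
    return None


if __name__ == "__main__":
    print(reap_twice() == errno.ECHILD)
```

In the threaded test program the two waitpid() calls come from different threads, so whether the second one fails depends on timing.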

If you uncomment the commented line, then the program runs with no problems.

I imagine the purpose of _cleanup is to protect users from themselves,
i.e., protect a user who calls subprocess.Popen() a lot without ever
calling wait().  I suggest either:

  (1) eliminating this _cleanup() mechanism completely; people who do
not wait() deserve the zombies they get;
  (2) synchronizing _cleanup() with wait() through a lock; or,
  (3) having wait() simply retry if it gets ECHILD.  On the retry, it
will discover that returncode is set, and return normally.
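
Option (3) could look roughly like the sketch below. It is not the actual subprocess code: ChildHandle is a hypothetical stand-in for Popen that tracks only pid and returncode, and the waitpid function is injectable so the logic can be exercised without real children. It assumes that whichever thread reaped the child also records returncode, and it stores the raw wait status for brevity.

```python
import errno
import os
import time


class ChildHandle:
    """Hypothetical stand-in for Popen; tracks only pid and returncode."""

    def __init__(self, pid, waitpid=os.waitpid):
        self.pid = pid
        self.returncode = None
        self._waitpid = waitpid  # injectable for testing

    def wait(self):
        while self.returncode is None:
            try:
                _, status = self._waitpid(self.pid, 0)
            except OSError as exc:
                if exc.errno != errno.ECHILD:
                    raise
                # Another thread reaped the child first. Assumption: that
                # thread sets returncode, so yield and re-check it.
                time.sleep(0)
                continue
            # Raw wait status, not decoded into an exit code.
            self.returncode = status
        return self.returncode
```

If nothing ever sets returncode after the ECHILD, this loop spins forever, so a real fix would need a bounded retry or the lock from option (2).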

-Ken
msg56246 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-10-05 18:32
This is a duplicate of bug #1731717.

I asked Donovan Baarda, who told me:

"""
Last time I looked this had been fixed, admittedly in a bit of an ugly
way, on the svn head. I offered to do a patch to make it a bit cleaner,
but as it isn't really broken anymore it was a bit of a low priority and
I haven't done it.

This bug seems to have a good repeatable test-case that we can probably
use in the unittests to show that it's now fixed...
"""
History
Date                 User           Action  Args
2010-11-08 16:26:07  Chris.Gerhard  set     nosy: + Chris.Gerhard
2007-10-05 18:32:05  gvanrossum     set     status: open -> closed
                                            resolution: duplicate
                                            superseder: race condition in subprocess module
                                            messages: + msg56246
                                            nosy: + gvanrossum
2007-10-04 21:31:57  kjd@duda.org   create