This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: subprocesses experience mysterious delay in receiving stdin EOF
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: yaaang
Priority: normal Keywords:

Created on 2011-02-16 23:04 by yaaang, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (2)
msg128699 - (view) Author: Yang Zhang (yaaang) Date: 2011-02-16 23:04
I'm seeing this issue on all the multiple Ubuntu 10.04 boxes I've tried.

I reduced a problem I was seeing in my application down into the following test case. In this code, a parent process concurrently spawns 2 (you can spawn more) subprocesses that read a big message from the parent over stdin, sleep for 5 seconds, and write something back. However, there's unexpected waiting happening somewhere, causing the code to complete in 10 seconds instead of the expected 5.

If you set verbose=True, you can see that the straggling subprocess is receiving most of the messages, then waiting for the last chunk of 3 chars---it's not detecting that the pipe has been closed. Furthermore, if I simply don't do anything with the second process (doreturn=True), the first process will never see the EOF.

Further down is some example output. When trying this out, you may need to increase the message size to see the behavior occur.

from subprocess import *
from threading import *
from time import *
from traceback import *
import sys
verbose = False
doreturn = False
msg = (20*4096+3)*'a'
def elapsed(): return '%7.3f' % (time() - start)
if sys.argv[1:]:
  start = float(sys.argv[2])
  if verbose:
    for chunk in iter(lambda: sys.stdin.read(4096), ''):
      print >> sys.stderr, '..', time(), sys.argv[1], 'read', len(chunk)
  else:
    sys.stdin.read()
  print >> sys.stderr, elapsed(), '..', sys.argv[1], 'done reading'
  sleep(5)
  print msg
else:
  start = time()
  def go(i):
    print elapsed(), i, 'starting'
    p = Popen(['python','stuckproc.py',str(i), str(start)], stdin=PIPE, stdout=PIPE)
    if doreturn and i == 1: return
    print elapsed(), i, 'writing'
    p.stdin.write(msg)
    print elapsed(), i, 'closing'
    p.stdin.close()
    print elapsed(), i, 'reading'
    p.stdout.read()
    print elapsed(), i, 'done'
  ts = [Thread(target=go, args=(i,)) for i in xrange(2)]
  for t in ts: t.start()
  for t in ts: t.join()

Example output:

  0.001 0 starting
  0.003 1 starting
  0.005 0 writing
  0.016 1 writing
  0.093 0 closing
  0.093 0 reading
  0.094 1 closing
  0.094 1 reading
  0.098 .. 1 done reading
  5.103 1 done
  5.108 .. 0 done reading
 10.113 0 done
msg128929 - (view) Author: Yang Zhang (yaaang) Date: 2011-02-21 01:42
After way too much time, I figured it out, after a quote from this post jumped out at me:

  See the "I/O on Pipes and FIFOs" section of pipe(7) ("man 7 pipe")

  "If all file descriptors referring to the write end of a pipe have
  been closed, then an attempt to read(2) from the pipe will see end-of-
  file (read(2) will return 0)."

I should've known this, but it never occurred to me - had nothing to do with Python in particular. What was happening was: the subprocesses were getting forked with open (writer) file descriptors to each others' pipes. As long as there are open writer file descriptors to a pipe, readers won't see EOF.

E.g.:

  p1=Popen(..., stdin=PIPE, ...)
    # creates a pipe the parent process can write to
  p2=Popen(...)
    # inherits the writer FD - as long as p2 exists, p1 won't see EOF

Turns out there's a close_fds parameter to Popen, so the solution is to pass close_fds=True. All simple and obvious in hindsight, but still managed to cost at least a couple eyeballs good chunks of time.
History
Date User Action Args
2022-04-11 14:57:13adminsetgithub: 55435
2011-02-21 01:42:21yaaangsetstatus: open -> closed

messages: + msg128929
resolution: not a bug
2011-02-16 23:04:10yaaangcreate