You can't time out a process tree that includes a never-ending process *and* whose stderr is redirected to a pipe:
cat >test.sh<<EOF
#!/bin/sh
cat /dev/random > /dev/null # never-ending
EOF
chmod +x test.sh
python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)"
This hangs forever. The timeout kicks in and the direct child is killed, but the `cat` grandchild survives and keeps the stderr pipe open, so Python waits forever on stderr, which never produces data or reaches end-of-file. See https://github.com/python/cpython/blob/v3.6.1/Lib/subprocess.py#L407-L410. The `sh` process is killed but is listed as a zombie process, and the `cat` process has migrated to parent id 1:
^Z
bg
jobs -lr
[2]- 21906 Running bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)" &
pstree 21906
-+= 21906 mjpieters bin/python -c import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)
\--- 21907 mjpieters (sh)
ps -j | grep 'cat /dev/random'
mjpieters 24706 1 24704 0 1 R s003 0:26.54 cat /dev/random
mjpieters 24897 99591 24896 0 2 R+ s003 0:00.00 grep cat /dev/random
Killing Python at that point leaves the `cat` process running indefinitely.
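The blocking read can be shown in a bounded form that does not hang. This is only a minimal sketch, assuming a POSIX `sh` on the PATH; the helper name `time_communicate_with_lingering_grandchild` is made up for illustration. The shell exits immediately, but `communicate()` keeps waiting until the background `sleep`, which inherited the stderr pipe, exits too:

```python
import subprocess
import time


def time_communicate_with_lingering_grandchild(linger_seconds=2):
    """Hypothetical helper: measure how long communicate() blocks when a
    grandchild process keeps the stderr pipe open after the child exits."""
    start = time.monotonic()
    # sh starts `sleep` in the background and exits at once; the background
    # sleep inherits the write end of the stderr pipe.
    process = subprocess.Popen(
        ["sh", "-c", "sleep {} &".format(linger_seconds)],
        stderr=subprocess.PIPE,
    )
    # Reading stderr only finishes at end-of-file, that is, once every
    # process holding the write end of the pipe has closed it.
    process.communicate()
    return time.monotonic() - start
```

With a grandchild that never exits, as in the `cat /dev/random` case, the same read never returns.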
Replace the `cat /dev/random > /dev/null` line with `sleep 10`, and the `subprocess.run()` call returns only after 10+ seconds, well past the 3-second timeout:
cat >test.sh<<EOF
#!/bin/sh
sleep 10
EOF
chmod +x test.sh
time bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 1326, in _execute_child
    raise child_exception_type(errno_num, err_msg)
OSError: [Errno 8] Exec format error
real 0m12.326s
user 0m0.041s
sys 0m0.018s
When you redirect stdin instead, `process.communicate()` does return, but the `cat` subprocess runs on indefinitely nonetheless; only the `sh` process was killed.
Is this something `subprocess.run()` should handle better (perhaps by adding a second timeout poll and a `terminate()`)? Or should the documentation be updated to warn about this behaviour instead (with suitable advice on how to write a subprocess that can be killed properly)?
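One possible workaround on the caller's side, until this is resolved, is to put the child in its own session and kill the whole process group on timeout. The following is only a sketch under those assumptions, not a proposal for how `subprocess.run()` itself should be changed. `run_with_group_timeout` is a hypothetical helper name, and it relies on `os.killpg()`, so it is POSIX-only:

```python
import os
import signal
import subprocess


def run_with_group_timeout(args, timeout, **kwargs):
    """Hypothetical helper: like subprocess.run(..., timeout=...), but on
    timeout it kills the child's whole process group, not just the child."""
    # start_new_session=True makes the child a session and process group
    # leader, so its process group id equals its pid.
    with subprocess.Popen(args, start_new_session=True, **kwargs) as process:
        try:
            stdout, stderr = process.communicate(timeout=timeout)
        except subprocess.TimeoutExpired:
            try:
                # Kill the child and any descendants still in its group, so
                # nothing is left holding the stdout/stderr pipes open.
                os.killpg(process.pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the group is already gone
            stdout, stderr = process.communicate()
            raise subprocess.TimeoutExpired(
                process.args, timeout, output=stdout, stderr=stderr
            )
    return subprocess.CompletedProcess(
        process.args, process.returncode, stdout, stderr
    )
```

For example, `run_with_group_timeout(['./test.sh'], timeout=3, stderr=subprocess.PIPE)` should raise `TimeoutExpired` shortly after 3 seconds. A descendant that moves itself into a different session or process group would still escape this.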