classification
Title: subprocess.run with stderr connected to a pipe won't timeout when killing a never-ending shell commanad
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder: subprocess.run timeout does not function if shell=True and capture_output=True
View: 37424
Assigned To: Nosy List: haizaar, martin.panter, mjpieters
Priority: normal Keywords:

Created on 2017-04-24 10:31 by mjpieters, last changed 2019-07-01 00:07 by gregory.p.smith. This issue is now closed.

Messages (3)
msg292217 - (view) Author: Martijn Pieters (mjpieters) * Date: 2017-04-24 10:31
You can't time out a process tree that includes a never-ending process, *and* which redirects stderr:

cat >test.sh<<EOF
#!/bin/sh
cat /dev/random > /dev/null  # never-ending
EOF
chmod +x test.sh
python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)"

This hangs forever; the timeout kicks in, but then the kill on the child process fails and Python forever tries to read stderr, which won't produce data. See https://github.com/python/cpython/blob/v3.6.1/Lib/subprocess.py#L407-L410. The `sh` process is killed, but listed as a zombie process and the `cat` process has migrated to parent id 1:

^Z
bg
jobs -lr
[2]- 21906 Running                 bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)" &
pstree 21906
-+= 21906 mjpieters bin/python -c import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)
 \--- 21907 mjpieters (sh)
ps -j | grep 'cat /dev/random'
mjpieters 24706     1 24704      0    1 R    s003    0:26.54 cat /dev/random
mjpieters 24897 99591 24896      0    2 R+   s003    0:00.00 grep cat /dev/random

Killing Python at that point leaves the `cat` process running indefinitely.

Replace the `cat /dev/random > /dev/null` line with `sleep 10`, and the `subprocess.run()` call returns after 10+ seconds:

cat >test.sh<<EOF
sleep 10
EOF
chmod +x test.sh
time bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 403, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 1326, in _execute_child
    raise child_exception_type(errno_num, err_msg)
OSError: [Errno 8] Exec format error

real	0m12.326s
user	0m0.041s
sys	0m0.018s

When you redirect stdin instead, `process.communicate()` does return, but the `cat` subprocess runs on indefinitely nonetheless; only the `sh` process was killed.

Is this something subprocess.run should handle better (perhaps by adding in a second timeout poll and a terminate())? Or should the documentation be updated to warn about this behaviour instead (with suitable advice on how to write a subprocess that can be killed properly).
msg292220 - (view) Author: Martijn Pieters (mjpieters) * Date: 2017-04-24 11:01
Apologies, I copied the wrong sleep 10 demo. The correct demo is:

cat >test.sh<<EOF
> #!/bin/sh
> sleep 10
> EOF
time bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)"
Traceback (most recent call last):
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 405, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 836, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 1497, in _communicate
    self._check_timeout(endtime, orig_timeout)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 864, in _check_timeout
    raise TimeoutExpired(self.args, orig_timeout)
subprocess.TimeoutExpired: Command '['./test.sh']' timed out after 3 seconds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 410, in run
    stderr=stderr)
subprocess.TimeoutExpired: Command '['./test.sh']' timed out after 3 seconds

real	0m10.054s
user	0m0.033s
sys	0m0.015s
msg292222 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-04-24 11:20
This is similar to the problem described in Issue 26534, which proposes “kill_group” and “killpg” APIs as a solution.

(FYI you should put a shebang at the start of the shell script, or call it as “sh -c test.sh”, to fix the “Exec format error”.)
History
Date User Action Args
2019-07-01 00:07:00gregory.p.smithsetstatus: open -> closed
superseder: subprocess.run timeout does not function if shell=True and capture_output=True
resolution: duplicate
stage: resolved
2019-06-27 13:12:00haizaarsetnosy: + haizaar
2019-06-27 08:37:07martin.panterlinkissue37424 superseder
2018-07-22 08:38:36martin.panterlinkissue31935 superseder
2017-04-24 11:20:28martin.pantersetnosy: + martin.panter
messages: + msg292222
2017-04-24 11:01:01mjpieterssetmessages: + msg292220
2017-04-24 10:31:14mjpieterscreate