subprocess.run timeout does not function if shell=True and capture_output=True #81605
Comments
Consider the following:

    subprocess.run('sleep 10', shell=True, timeout=.1, capture_output=True)

It should raise after 0.1 seconds, but it does not: it waits 10 seconds until sleep finishes and only then raises "subprocess.TimeoutExpired: Command 'sleep 10' timed out after 0.1 seconds". Removing 'capture_output=True', or converting the command string to a list (and removing shell=True), makes it work.

I'm using Python 3.7.3 on Ubuntu 16.04. It also reproduces on the official Docker Python 3.7.3 image (alpine3.8). |
On macOS this exits immediately once the 0.1-second timeout expires, but the problem is reproducible on Ubuntu with master. |
Same thing going on as in bpo-30154. The shell is probably spawning the “sleep” command as a child process (grandchild of Python), and waiting for it to exit. When Python times out, it will kill the shell process, leaving the grandchild as an orphan. The “sleep” process will still be running and probably holds the “stdout” and/or “stderr” pipes open, and Python will wait indefinitely to be sure it has captured all the output to those pipes. Also see bpo-26534, which proposes APIs to kill a process group rather than the single child process. |
Thanks for looking at it. My original code had "tar" running, which is a child of the shell as well... I assume running exec in the shell may help somewhat, but it is obviously not a cure. I'm all for killing the process group. "Run something and get its output" should be simple enough without requiring a programmer to know all POSIX process semantics. |
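The exec idea mentioned above can be sketched as follows. This is only an illustration for POSIX systems, and `run_with_exec` is a hypothetical helper name, not something from the report. With `exec`, the shell process is replaced by `sleep`, so the kill on timeout hits `sleep` directly and nothing is left holding the captured pipes. It does not help when the shell has to start more than one command.

```python
import subprocess
import time


def run_with_exec(timeout=0.1):
    """Run `sleep 10` through the shell, but let the shell exec it.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.monotonic()
    try:
        # "exec" replaces the shell with sleep, so there is no grandchild.
        subprocess.run("exec sleep 10", shell=True, timeout=timeout,
                       capture_output=True)
    except subprocess.TimeoutExpired:
        pass  # expected: the command runs far longer than the timeout
    return time.monotonic() - start


elapsed = run_with_exec()
print(f"returned after {elapsed:.2f}s")
```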
bpo-30154 that I've marked as a duplicate demonstrates this problem without using shell=True. The solution I proposed handles that via the additional small timeout on the cleanup side, but still has the caveat that the grandchild processes keep running unless the caller used start_new_session=True. See the PR. We cannot reasonably determine when start_new_session=True should be a default behavior. And I worry that doing it when it should not be will cause unexpected new problems with existing code. |
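For callers who can opt in to start_new_session=True themselves, killing the whole process group on timeout might look like the sketch below. It is POSIX-only, `run_kill_group` is a hypothetical helper (not part of subprocess and not the code from the PR), and it assumes no descendant moves itself into a different process group.

```python
import os
import signal
import subprocess
import time


def run_kill_group(cmd, timeout):
    """POSIX-only sketch: run a shell command, kill its process group on timeout."""
    proc = subprocess.Popen(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        start_new_session=True,  # child leads a new session and process group
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            # The child called setsid(), so its process group ID equals its PID.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group is already gone
        # With the group killed, the pipes reach EOF and this returns promptly
        # (assuming no descendant left the group).
        out, err = proc.communicate()
        raise subprocess.TimeoutExpired(cmd, timeout, output=out, stderr=err)
    return subprocess.CompletedProcess(cmd, proc.returncode, out, err)


# "sleep 10; echo done" forces the shell to run sleep as a separate child.
start = time.monotonic()
try:
    run_kill_group("sleep 10; echo done", timeout=0.2)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
elapsed = time.monotonic() - start
print(f"timed_out={timed_out}, elapsed={elapsed:.2f}s")
```

The trade-off raised in the comment still applies: a new session detaches the child from the caller's controlling terminal and job control, which is why it is left as an explicit choice here.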
On Windows, the following pattern _can_ hang:

    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        return proc.communicate(timeout=1.0)
    except TimeoutExpired:
        proc.kill()
        return proc.communicate()  # <== HERE

Even if the first child process is killed, communicate() waits until the pipe is closed. If the child process spawned a third process before being killed, the second communicate() call hangs until the third process exits or closes its stdout. I'm not sure if subprocess.run() should do anything about this case, but it is at least for your information.

I'm fighting against this issue in bpo-37531. IMHO it's an issue of the Windows implementation of Popen.communicate(): it's implemented with a blocking call to stdout.read() run in a thread. The thread cannot be interrupted in any way and will only complete once stdout is closed. Again, if the direct child process spawned other processes, stdout is only closed in the main parent process once all child processes have exited or at least closed their stdout.

Maybe another issue should be opened to avoid blocking with the following pattern on Windows:

    proc.kill()
    proc.communicate() |
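One way a caller might avoid blocking forever in the kill() then communicate() pattern above is to bound the second communicate() with a short grace timeout and accept losing late output. This is a sketch of a caller-side workaround, not the fix applied in CPython; `communicate_bounded` and its `grace` parameter are hypothetical names, and the demo command assumes a POSIX shell.

```python
import subprocess
import time


def communicate_bounded(cmd, timeout, grace=0.5):
    """Return (stdout_or_None, timed_out) without waiting on orphaned pipe holders."""
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    try:
        out, _ = proc.communicate(timeout=timeout)
        return out, False
    except subprocess.TimeoutExpired:
        proc.kill()  # kills only the direct child
        try:
            # Give the pipe a short grace period to reach EOF.
            out, _ = proc.communicate(timeout=grace)
        except subprocess.TimeoutExpired as exc:
            # Another process still holds the pipe open; keep what was read.
            out = exc.output  # may be None or partial data
        proc.wait()  # reap the killed direct child
        return out, True


start = time.monotonic()
out, timed_out = communicate_bounded("sleep 10; echo late", timeout=0.2)
elapsed = time.monotonic() - start
print(f"timed_out={timed_out}, elapsed={elapsed:.2f}s")
```

Any process that still holds the pipe keeps running after the helper returns, so this bounds the wait but does not clean up descendants.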
Thanks. I believe this issue is fixed, but you've identified follow-on issues. Let's follow up on those in their own bugs. |
I created bpo-38207 "subprocess: On Windows, Popen.kill() + Popen.communicate() is blocking until all processes using the pipe close the pipe" to track this issue. |
I'm still seeing hangs with subprocess.run() in Python 3.7.4 |
Using Python 3.7.4, I'm calling subprocess.run() with the following arguments. run() still hangs even though a timeout is being passed in.

    subprocess.run(cmd_list, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=False, timeout=timeout_val, check=True, universal_newlines=True)

cmd_list contains the name of the bash script below, which is

    echo Rescanning system for PCIe devices
    echo "Rescan device"
    sleep 5
    if [
    echo Rescan Done

This script is scanning for NVMe SSDs, so duplicating the issue is not as straightforward as submitting a Python script. The OS is CentOS 7. uname -a shows

I know the kernel is old, but we have a restriction against updating it. |
That's not surprising: the fix was pushed on 2019-09-11. Python 3.7.5 will include the fix and it will be released soon: |