classification
Title: proc communicate not exiting on python subprocess timeout using PIPES
Type: behavior Stage:
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Leonardo Francalanci, eryksun, martin.panter, r.david.murray
Priority: normal Keywords:

Created on 2017-09-13 10:11 by Leonardo Francalanci, last changed 2017-09-18 22:35 by eryksun.

Messages (15)
msg302051 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-13 10:11
the script below (a python process is called, which calls a waitfor cmd with a timeout of 4 seconds) is supposed to end after 4 seconds. But instead proc.communicate stops after the 20 seconds timeout.

Everything works 100% ok if I remove the stdin/stderr/stdout parameters...


if __name__ == "__main__":
  #start a python process which will wait for 4 seconds and then exit (waitfor is set to 200):
  proc_str = ["C:\\Program Files (x86)\\Anaconda3\\Python.exe",
                        "-c", "import 
  subprocess;subprocess.run('cmd /S /C waitfor g /t 200', shell=False, timeout=4)"]
  proc = subprocess.Popen(proc_str,
                            stdin=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            shell=False,
                            universal_newlines=True)
  #this should exit in 4 seconds (when the called process exits), but instead exits after 20 seconds:
  (proc_out, proc_err) = proc.communicate(timeout=20)
msg302059 - (view) Author: Eryk Sun (eryksun) * Date: 2017-09-13 11:24
I don't see a bug here. The inner run() kills cmd.exe after 4 seconds. But the waitfor.exe process waits for 200 seconds and has handles for the pipes. The OS won't close out the pipes as long as there are potential writers, so the outer communicate() waits the full 20 seconds.
msg302062 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-13 12:23
The called "C:\\Program Files (x86)\\Anaconda3\\Python.exe" process exits after 4 seconds. The reason why it ends shouldn't matter, right? I expect that a call to communicate should exit as soon as the called process is not running anymore. I don't hold a pipe with "waitfor", I hold it with python.exe.
msg302067 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-09-13 12:41
I’m not familiar with Windows commands or processes enough, but going by Erik’s comment, this sounds similar to Issue 26534 and Issue 30154. Doesn’t your “waitfor” command inherit the same parent process’s stdout etc pipes?
msg302068 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-13 12:52
thank you for your replies!

I used waitfor because it's the simplest and shortest way to have reproducible code. The issue I'm having is obviously not with waitfor, but with another exe, but this doesn't change the fact that I need a way to exit as soon as the called process exits or the timeout ends. I don't care what the called process does. I know that the process exited, and the docs for proc.communicate say "Wait for process to terminate". Well, the process terminates, but "communicate" doesn't exit. It doesn't say "communicate will hang as long as the pipes are open".
Plus: the "python.exe" process doesn't actually timeout!!! It just exits (with exception, but that's not important), and I would like to get the error from stderr! While I can't get it because "communicate" raises an exception...
msg302071 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-13 12:58
(forgot: Issue 26534 is about shell=True, I use shell=False, right?)
msg302121 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-09-13 21:34
.
Isn’t your use of “cmd” similar enough to shell=True? I.e. isn’t that a “cmd” parent process spawning a “waitfor” child? If your 4 s “subprocess.run” call times out, does it kill the “waitfor” process, or leave it running?

Could the “waitfor” process write to stdout or stderr, and should that be captured by your 20 s “communicate” call?
msg302130 - (view) Author: Eryk Sun (eryksun) * Date: 2017-09-13 23:11
> the process terminates, but "communicate" doesn't exit. It doesn't 
> say "communicate will hang as long as the pipes are open".

I think the problem with leaked handles does warrant a note. It can be a difficult problem to diagnose otherwise.

If communicate() raises subprocess.TimeoutExpired, it can be called again after the caller takes whatever action is necessary to resolve the problem, such as killing a process group as proposed by issue 5115.

> the "python.exe" process doesn't actually timeout!!! It just exits 
> (with exception, but that's not important), and I would like to get
> the error from stderr! While I can't get it because "communicate" 
> raises an exception...

subprocess.run() in the child does time out and raise subprocess.TimeoutExpired. For the parent to read stdout and stderr to EOF, all pipe handles with write access have to close. Calling communicate() again will succeed once waitfor.exe either exits or gets terminated. 

Since python.exe and waitfor.exe are console processes, and waitfor.exe inherits Python's console, you can send it a Ctrl+Break event to make it exit. Make sure to create a new process group in the Popen() call by passing the argument creationflags=subprocess.CREATE_NEW_PROCESS_GROUP. Then to deal with a timeout, call proc.send_signal(signal.CTRL_BREAK_EVENT) and retry communicate(). 

(Yes, it's wrong that send_signal sends Ctrl+Break to a process group instead of explicitly requiring a `group` argument to enable this, as proposed in issue 5115. I guess whoever implemented os.kill for Windows didn't understand that GenerateConsoleCtrlEvent should have been used to implement os.killpg instead.)

> I don't care what the called process does. 

If you have source control of the child process and don't care about the output of grandchild processes, redirect their standard I/O to the NUL device. Also, make the child's standard I/O files non-inheritable, or if you're using C/C++, call CreateProcess with a PROC_THREAD_ATTRIBUTE_HANDLE_LIST to control handle inheritance (this should be available in Python 3.7, pending issue 19764).
msg302146 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-14 06:42
I have a workaround, and I guess this means there's a bug in the current implementation of stdout/stderr=subprocess.PIPE; if I open my own pipes instead of using subprocess.PIPE everything seems to work (provided I close the pipe before reading from it):

(errPipeR, errPipeW) = os.pipe();
(outPipeR, outPipeW) = os.pipe();

proc = subprocess.Popen(proc_str,
                                    stdin=subprocess.PIPE,
                                    stderr=errPipeW,
                                    stdout=outPipeW,shell=False,
                                    universal_newlines=True)

proc.communicate(timeout=20)
os.close(outPipeW)
os.close(errPipeW)

Now I can read from the "R" pipes with 0 problems (getting the error and output streams), and proc.communicate exits as soon as the called process exits (4 seconds).

So, since I can use my own pipes without any problems, I don't see why the regular "subprocess.PIPE" shouldn't be working...
msg302155 - (view) Author: Eryk Sun (eryksun) * Date: 2017-09-14 07:53
> Now I can read from the "R" pipes with 0 problems (getting the error 
> and output streams)

The pipe handles are inherited by waitfor.exe, just as before. You'll find that reading to EOF will block just as it does when using subprocess.PIPE with communicate(). For example:

    stderr = open(errPipeR, 'r')
    err = stderr.read()

The above stderr.read() will return when waitfor.exe either exits or gets terminated.

> and proc.communicate exits as soon as the called process exits 
> (4 seconds).

proc.stdout and proc.stderr are None in this case, so all communicate() can do is write to proc.stdin, if there's input text, and then wait() for the process to exit.
msg302240 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-15 07:32
I'm really sorry, you are 100% correct: it blocks on the pipe (my tests killed the process tree before reading from the pipes).
Still, I think there should be a way to actually read the output also in this case... works for me when I kill the whole process stack.
msg302254 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-15 13:50
can I at least change the call to:

subprocess.run('cmd /S /C waitfor g /t 200', shell=False, timeout=4)

in any way to avoid the problem?
I tried with stdin=subprocess.DEVNULL,stdout=subprocess.DEVNULL,stderr=subprocess.DEVNULL; also with close_fds=True, but nothing changes...
msg302281 - (view) Author: Eryk Sun (eryksun) * Date: 2017-09-15 18:03
> I tried with stdin=subprocess.DEVNULL, stdout=subprocess.DEVNULL, 
> stderr=subprocess.DEVNULL

As I said previously, you also need to make the current standard handles non-inheritable. Pending issue 19764, in 3.7 you'll be able to override stdin, stdout, and stderr with the default close_fds=True. Currently overriding them implicitly sets close_fds=False, in which case you need to manually ensure that the current standard handles (i.e. the pipe handles) can't be inherited. For example:

    os.set_inheritable(0, False)
    os.set_inheritable(1, False)
    os.set_inheritable(2, False)
msg302450 - (view) Author: Leonardo Francalanci (Leonardo Francalanci) Date: 2017-09-18 13:16
That works!
But when I said "also with close_fds=True", I meant that I tried WITHOUT overriding stdin, stdout, and stderr AND setting close_fds=True, but it didn't work. What worked was not overriding stdin/out/err and adding

os.set_inheritable(0, False)
os.set_inheritable(1, False)
os.set_inheritable(2, False)

before the call (no need to set close_fds)
msg302493 - (view) Author: Eryk Sun (eryksun) * Date: 2017-09-18 22:35
> when I said "also with close_fds=True", I meant that I tried 
> WITHOUT overriding stdin, stdout, and stderr AND setting 
> close_fds=True, but it didn't work. 

Console applications (e.g. python.exe) duplicate their standard handles into a child console process when the bInheritHandles argument of CreateProcess is false, i.e. when subprocess.Popen is called with the default close_fds=True. We know it's duplication instead of inheritance because the actual handle values change and also because it doesn't matter whether or not the standard handles are flagged as inheritable in the creating process. In the typical case this ensures that console processes have valid standard handles. 

However, if you make the standard handles non-inheritable in the parent and also inherit handles when creating the child, then the parent's standard handles will be neither inherited nor duplicated. The child will still inherit the standard handle *values* from the parent (i.e. copied from the parent's PEB ProcessParameters). These handle values will be invalid. Either the handle won't be defined, or it will reference an arbitrary, unrelated kernel object.

> What worked was not overriding stdin/out/err and adding
>
> os.set_inheritable(0, False)
> os.set_inheritable(1, False)
> os.set_inheritable(2, False)
>
> before the call (no need to set close_fds)

In this case cmd.exe gets duplicated standard handles, but you've flagged these handles as non-inheritable. When cmd.exe calls CreateProcess to execute waitfor.exe, it uses bInheritHandles=TRUE. Consequently, its non-inheritable standard handles are neither inherited by nor duplicated to waitfor.exe. Let's attach a debugger to the waitfor.exe process and examine the standard handle values in its PEB ProcessParameters:

    0:003> ?? @$peb->ProcessParameters->StandardInput
    void * 0x00000000`00000008
    0:003> !handle 8
    Handle 8
      Type          Event

    0:003> ?? @$peb->ProcessParameters->StandardOutput
    void * 0x00000000`0000000c
    0:003> !handle c
    Handle c
      Type          WaitCompletionPacket

    0:003> ?? @$peb->ProcessParameters->StandardError
    void * 0x00000000`00000010
    0:003> !handle 10
    Handle 10
      Type          IoCompletion

waitfor.exe doesn't care that its standard handle values reference non-File objects, i.e. an Event, a WaitCompletionPacket, and an IoCompletion port. However, what if by coincidence one of its standard handle values is actually a valid File handle that's inherited for some other reason? That would be a right mess. That's why I suggested overriding the standard handles to the NUL device. For example:

    import subprocess

    script = r'''
    import os
    import subprocess

    os.set_inheritable(0, False)
    os.set_inheritable(1, False)
    os.set_inheritable(2, False)

    cmdline = 'cmd.exe /s /c waitfor.exe signal /t 200'
    subprocess.run(cmdline, timeout=4, stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    '''

    args = ['python.exe', '-c', script]
    proc = subprocess.Popen(args, encoding='ansi', stdin=subprocess.PIPE,
                stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    proc_out, proc_err = proc.communicate(timeout=20)
    print('proc_out:', proc_out)
    print('proc_err:', proc_err)
History
Date User Action Args
2017-09-18 22:35:52eryksunsetmessages: + msg302493
2017-09-18 13:16:55Leonardo Francalancisetmessages: + msg302450
2017-09-15 18:03:38eryksunsetmessages: + msg302281
2017-09-15 13:50:12Leonardo Francalancisetmessages: + msg302254
2017-09-15 07:32:50Leonardo Francalancisetmessages: + msg302240
2017-09-14 07:53:03eryksunsetmessages: + msg302155
2017-09-14 06:42:06Leonardo Francalancisetmessages: + msg302146
2017-09-13 23:11:33eryksunsetmessages: + msg302130
versions: + Python 3.7
2017-09-13 21:34:15martin.pantersetmessages: + msg302121
2017-09-13 13:24:20r.david.murraysetnosy: + r.david.murray
2017-09-13 12:58:38Leonardo Francalancisetmessages: + msg302071
2017-09-13 12:52:45Leonardo Francalancisetmessages: + msg302068
2017-09-13 12:41:00martin.pantersetnosy: + martin.panter
messages: + msg302067
2017-09-13 12:23:16Leonardo Francalancisetmessages: + msg302062
2017-09-13 11:24:43eryksunsetnosy: + eryksun
messages: + msg302059
2017-09-13 10:12:15Leonardo Francalancisettype: behavior
components: + Interpreter Core
2017-09-13 10:11:27Leonardo Francalancicreate