proc communicate not exiting on python subprocess timeout using PIPES #75628

LeonardoFrancalanci · 2017-09-13T10:11:28Z

BPO	31447
Nosy	@pfmoore, @tjguk, @bitdancer, @vadmium, @zware, @eryksun, @zooba

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2017-09-13.10:11:27.791>
labels = ['type-bug', '3.8', '3.9', '3.10', 'library', 'docs']
title = 'proc communicate not exiting on python subprocess timeout using PIPES'
updated_at = <Date 2021-03-08.19:51:26.184>
user = 'https://bugs.python.org/LeonardoFrancalanci'

bugs.python.org fields:

activity = <Date 2021-03-08.19:51:26.184>
actor = 'vstinner'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation', 'Library (Lib)']
creation = <Date 2017-09-13.10:11:27.791>
creator = 'Leonardo Francalanci'
dependencies = []
files = []
hgrepos = []
issue_num = 31447
keywords = []
message_count = 17.0
messages = ['302051', '302059', '302062', '302067', '302068', '302071', '302121', '302130', '302146', '302155', '302240', '302254', '302281', '302450', '302493', '361566', '387860']
nosy_count = 9.0
nosy_names = ['paul.moore', 'tim.golden', 'r.david.murray', 'docs@python', 'martin.panter', 'zach.ware', 'eryksun', 'steve.dower', 'Leonardo Francalanci']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue31447'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

LeonardoFrancalanci · 2017-09-13T10:11:28Z

the script below (a python process is called, which calls a waitfor cmd with a timeout of 4 seconds) is supposed to end after 4 seconds. But instead proc.communicate stops after the 20 seconds timeout.

Everything works 100% ok if I remove the stdin/stderr/stdout parameters...

if __name__ == "__main__":
  #start a python process which will wait for 4 seconds and then exit (waitfor is set to 200):
  proc_str = ["C:\\Program Files (x86)\\Anaconda3\\Python.exe",
                        "-c", "import 
  subprocess;subprocess.run('cmd /S /C waitfor g /t 200', shell=False, timeout=4)"]
  proc = subprocess.Popen(proc_str,
                            stdin=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            shell=False,
                            universal_newlines=True)
  #this should exit in 4 seconds (when the called process exits), but instead exits after 20 seconds:
  (proc_out, proc_err) = proc.communicate(timeout=20)

eryksun · 2017-09-13T11:24:44Z

I don't see a bug here. The inner run() kills cmd.exe after 4 seconds. But the waitfor.exe process waits for 200 seconds and has handles for the pipes. The OS won't close out the pipes as long as there are potential writers, so the outer communicate() waits the full 20 seconds.

LeonardoFrancalanci · 2017-09-13T12:23:16Z

The called "C:\\Program Files (x86)\\Anaconda3\\Python.exe" process exits after 4 seconds. The reason why it ends shouldn't matter, right? I expect that a call to communicate should exit as soon as the called process is not running anymore. I don't hold a pipe with "waitfor", I hold it with python.exe.

vadmium · 2017-09-13T12:41:01Z

I’m not familiar with Windows commands or processes enough, but going by Erik’s comment, this sounds similar to bpo-26534 and bpo-30154. Doesn’t your “waitfor” command inherit the same parent process’s stdout etc pipes?

LeonardoFrancalanci · 2017-09-13T12:52:46Z

thank you for your replies!

I used waitfor because it's the simplest and shortest way to have reproducible code. The issue I'm having is obviously not with waitfor, but with another exe, but this doesn't change the fact that I need a way to exit as soon as the called process exits or the timeout ends. I don't care what the called process does. I know that the process exited, and the docs for proc.communicate say "Wait for process to terminate". Well, the process terminates, but "communicate" doesn't exit. It doesn't say "communicate will hang as long as the pipes are open".
Plus: the "python.exe" process doesn't actually timeout!!! It just exits (with exception, but that's not important), and I would like to get the error from stderr! While I can't get it because "communicate" raises an exception...

LeonardoFrancalanci · 2017-09-13T12:58:39Z

(forgot: bpo-26534 is about shell=True, I use shell=False, right?)

vadmium · 2017-09-13T21:34:16Z

.
Isn’t your use of “cmd” similar enough to shell=True? I.e. isn’t that a “cmd” parent process spawning a “waitfor” child? If your 4 s “subprocess.run” call times out, does it kill the “waitfor” process, or leave it running?

Could the “waitfor” process write to stdout or stderr, and should that be captured by your 20 s “communicate” call?

eryksun · 2017-09-13T23:11:34Z

the process terminates, but "communicate" doesn't exit. It doesn't
say "communicate will hang as long as the pipes are open".

I think the problem with leaked handles does warrant a note. It can be a difficult problem to diagnose otherwise.

If communicate() raises subprocess.TimeoutExpired, it can be called again after the caller takes whatever action is necessary to resolve the problem, such as killing a process group as proposed by bpo-5115.

the "python.exe" process doesn't actually timeout!!! It just exits
(with exception, but that's not important), and I would like to get
the error from stderr! While I can't get it because "communicate"
raises an exception...

subprocess.run() in the child does time out and raise subprocess.TimeoutExpired. For the parent to read stdout and stderr to EOF, all pipe handles with write access have to close. Calling communicate() again will succeed once waitfor.exe either exits or gets terminated.

Since python.exe and waitfor.exe are console processes, and waitfor.exe inherits Python's console, you can send it a Ctrl+Break event to make it exit. Make sure to create a new process group in the Popen() call by passing the argument creationflags=subprocess.CREATE_NEW_PROCESS_GROUP. Then to deal with a timeout, call proc.send_signal(signal.CTRL_BREAK_EVENT) and retry communicate().

(Yes, it's wrong that send_signal sends Ctrl+Break to a process group instead of explicitly requiring a group argument to enable this, as proposed in bpo-5115. I guess whoever implemented os.kill for Windows didn't understand that GenerateConsoleCtrlEvent should have been used to implement os.killpg instead.)

I don't care what the called process does.

If you have source control of the child process and don't care about the output of grandchild processes, redirect their standard I/O to the NUL device. Also, make the child's standard I/O files non-inheritable, or if you're using C/C++, call CreateProcess with a PROC_THREAD_ATTRIBUTE_HANDLE_LIST to control handle inheritance (this should be available in Python 3.7, pending bpo-19764).

LeonardoFrancalanci · 2017-09-14T06:42:06Z

I have a workaround, and I guess this means there's a bug in the current implementation of stdout/stderr=subprocess.PIPE; if I open my own pipes instead of using subprocess.PIPE everything seems to work (provided I close the pipe before reading from it):

(errPipeR, errPipeW) = os.pipe();
(outPipeR, outPipeW) = os.pipe();

proc = subprocess.Popen(proc_str,
                                    stdin=subprocess.PIPE,
                                    stderr=errPipeW,
                                    stdout=outPipeW,shell=False,
                                    universal_newlines=True)

proc.communicate(timeout=20)
os.close(outPipeW)
os.close(errPipeW)

Now I can read from the "R" pipes with 0 problems (getting the error and output streams), and proc.communicate exits as soon as the called process exits (4 seconds).

So, since I can use my own pipes without any problems, I don't see why the regular "subprocess.PIPE" shouldn't be working...

eryksun · 2017-09-14T07:53:04Z

Now I can read from the "R" pipes with 0 problems (getting the error
and output streams)

The pipe handles are inherited by waitfor.exe, just as before. You'll find that reading to EOF will block just as it does when using subprocess.PIPE with communicate(). For example:

    stderr = open(errPipeR, 'r')
    err = stderr.read()

The above stderr.read() will return when waitfor.exe either exits or gets terminated.

and proc.communicate exits as soon as the called process exits
(4 seconds).

proc.stdout and proc.stderr are None in this case, so all communicate() can do is write to proc.stdin, if there's input text, and then wait() for the process to exit.

LeonardoFrancalanci · 2017-09-15T07:32:50Z

I'm really sorry, you are 100% correct: it blocks on the pipe (my tests killed the process tree before reading from the pipes).
Still, I think there should be a way to actually read the output also in this case... works for me when I kill the whole process stack.

LeonardoFrancalanci · 2017-09-15T13:50:13Z

can I at least change the call to:

subprocess.run('cmd /S /C waitfor g /t 200', shell=False, timeout=4)

in any way to avoid the problem?
I tried with stdin=subprocess.DEVNULL,stdout=subprocess.DEVNULL,stderr=subprocess.DEVNULL; also with close_fds=True, but nothing changes...

eryksun · 2017-09-15T18:03:39Z

I tried with stdin=subprocess.DEVNULL, stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL

As I said previously, you also need to make the current standard handles non-inheritable. Pending bpo-19764, in 3.7 you'll be able to override stdin, stdout, and stderr with the default close_fds=True. Currently overriding them implicitly sets close_fds=False, in which case you need to manually ensure that the current standard handles (i.e. the pipe handles) can't be inherited. For example:

    os.set_inheritable(0, False)
    os.set_inheritable(1, False)
    os.set_inheritable(2, False)

LeonardoFrancalanci · 2017-09-18T13:16:55Z

That works!
But when I said "also with close_fds=True", I meant that I tried WITHOUT overriding stdin, stdout, and stderr AND setting close_fds=True, but it didn't work. What worked was not overriding stdin/out/err and adding

os.set_inheritable(0, False)
os.set_inheritable(1, False)
os.set_inheritable(2, False)

before the call (no need to set close_fds)

eryksun · 2017-09-18T22:35:52Z

when I said "also with close_fds=True", I meant that I tried
WITHOUT overriding stdin, stdout, and stderr AND setting
close_fds=True, but it didn't work.

Console applications (e.g. python.exe) duplicate their standard handles into a child console process when the bInheritHandles argument of CreateProcess is false, i.e. when subprocess.Popen is called with the default close_fds=True. We know it's duplication instead of inheritance because the actual handle values change and also because it doesn't matter whether or not the standard handles are flagged as inheritable in the creating process. In the typical case this ensures that console processes have valid standard handles.

However, if you make the standard handles non-inheritable in the parent and also inherit handles when creating the child, then the parent's standard handles will be neither inherited nor duplicated. The child will still inherit the standard handle *values* from the parent (i.e. copied from the parent's PEB ProcessParameters). These handle values will be invalid. Either the handle won't be defined, or it will reference an arbitrary, unrelated kernel object.

What worked was not overriding stdin/out/err and adding

os.set_inheritable(0, False)
os.set_inheritable(1, False)
os.set_inheritable(2, False)

before the call (no need to set close_fds)

In this case cmd.exe gets duplicated standard handles, but you've flagged these handles as non-inheritable. When cmd.exe calls CreateProcess to execute waitfor.exe, it uses bInheritHandles=TRUE. Consequently, its non-inheritable standard handles are neither inherited by nor duplicated to waitfor.exe. Let's attach a debugger to the waitfor.exe process and examine the standard handle values in its PEB ProcessParameters:

0:003> ?? @$peb->ProcessParameters->StandardInput
void * 0x00000000`00000008
0:003> !handle 8
Handle 8
  Type          Event

0:003> ?? @$peb->ProcessParameters->StandardOutput
void * 0x00000000`0000000c
0:003> !handle c
Handle c
  Type          WaitCompletionPacket

0:003> ?? @$peb->ProcessParameters->StandardError
void * 0x00000000`00000010
0:003> !handle 10
Handle 10
  Type          IoCompletion

waitfor.exe doesn't care that its standard handle values reference non-File objects, i.e. an Event, a WaitCompletionPacket, and an IoCompletion port. However, what if by coincidence one of its standard handle values is actually a valid File handle that's inherited for some other reason? That would be a right mess. That's why I suggested overriding the standard handles to the NUL device. For example:

    import subprocess

    script = r'''
    import os
    import subprocess

    os.set_inheritable(0, False)
    os.set_inheritable(1, False)
    os.set_inheritable(2, False)

    cmdline = 'cmd.exe /s /c waitfor.exe signal /t 200'
    subprocess.run(cmdline, timeout=4, stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    '''

    args = ['python.exe', '-c', script]
    proc = subprocess.Popen(args, encoding='ansi', stdin=subprocess.PIPE,
                stdout=subprocess.PIPE, stderr=subprocess.PIPE)

proc_out, proc_err = proc.communicate(timeout=20)
print('proc_out:', proc_out)
print('proc_err:', proc_err)

vstinner · 2020-02-07T10:49:12Z

See also bpo-38207: "subprocess: Popen.kill() + Popen.communicate() is blocking until all processes using the pipe close the pipe".

eryksun · 2021-03-01T12:38:35Z

I'm changing this to a documentation issue and removing the Windows component. The documentation doesn't make it clear that communicate() may block indefinitely (in both POSIX and Windows) even after the process has terminated. As currently implemented, this claim and the example need to be corrected.

Depending on one's needs, the Popen() instance can be used as context manager to ensure that communication is finalized (e.g. in Windows, the synchronous reads in worker threads need to be canceled) and that the pipes are closed -- presuming you don't need to read more data after the timeout. If bpo-43346 is resolved as suggested, then the following will work without blocking indefinitely in the second communicate() call:

    proc = subprocess.Popen(...)
    try:
        out, err = proc.communicate(timeout=15)
    except subprocess.TimeoutExpired:
        with proc:
            proc.kill()
        out, err = proc.communicate()

LeonardoFrancalanci mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Sep 13, 2017

eryksun added the 3.7 (EOL) end of life label Sep 13, 2017

eryksun added stdlib Python modules in the Lib dir OS-windows 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.7 (EOL) end of life labels Feb 27, 2021

eryksun added docs Documentation in the Doc dir and removed OS-windows labels Mar 1, 2021

eryksun assigned docspython Mar 1, 2021

ezio-melotti transferred this issue from another repository Apr 10, 2022

ni-sujain mentioned this issue Mar 30, 2023

fix hang problem with python subprocess run while running tests ni/grpc-labview#239

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proc communicate not exiting on python subprocess timeout using PIPES #75628

proc communicate not exiting on python subprocess timeout using PIPES #75628

LeonardoFrancalanci mannequin commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

eryksun commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

vadmium commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

vadmium commented Sep 13, 2017

eryksun commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 14, 2017

eryksun commented Sep 14, 2017

LeonardoFrancalanci mannequin commented Sep 15, 2017

LeonardoFrancalanci mannequin commented Sep 15, 2017

eryksun commented Sep 15, 2017

LeonardoFrancalanci mannequin commented Sep 18, 2017

eryksun commented Sep 18, 2017

vstinner commented Feb 7, 2020

eryksun commented Mar 1, 2021

proc communicate not exiting on python subprocess timeout using PIPES #75628

proc communicate not exiting on python subprocess timeout using PIPES #75628

Comments

LeonardoFrancalanci mannequin commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

eryksun commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

vadmium commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 13, 2017

vadmium commented Sep 13, 2017

eryksun commented Sep 13, 2017

LeonardoFrancalanci mannequin commented Sep 14, 2017

eryksun commented Sep 14, 2017

LeonardoFrancalanci mannequin commented Sep 15, 2017

LeonardoFrancalanci mannequin commented Sep 15, 2017

eryksun commented Sep 15, 2017

LeonardoFrancalanci mannequin commented Sep 18, 2017

eryksun commented Sep 18, 2017

vstinner commented Feb 7, 2020

eryksun commented Mar 1, 2021