Author eryksun
Recipients eryksun, paul.moore, steve.dower, tim.golden, zach.ware
Date 2021-02-28.22:53:59
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1614552841.1.0.31217184811.issue43346@roundup.psfhosted.org>
In-reply-to
Content
subprocess.run() handles TimeoutExpired by terminating the process and waiting on it. In POSIX, the exception object contains the partially read stdout and stderr bytes. For example:

    cmd = 'echo spam; echo eggs >&2; sleep 2'
    try: p = subprocess.run(cmd, shell=True, capture_output=True,
                            text=True, timeout=1)
    except subprocess.TimeoutExpired as e: ex = e
     
    >>> ex.stdout, ex.stderr
    (b'spam\n', b'eggs\n')

In Windows, subprocess.run() has to finish reading output with a second communicate() call, after which it manually sets the exception's stdout and stderr attributes.

The poses the problem that the second communicate() call may block indefinitely, even though the child process has terminated.

The primary issue is that the pipe handles may be inherited by one or more descendant processes (e.g. via shell=True), which are all regarded as potential writers that keep the pipe from closing. Reading from an open pipe that's empty will block until data becomes available. This is generally desirable for efficiency, compared to polling in a loop. But in this case, the downside is that run() in Windows will effectively ignore the given timeout.

Another problem is that _communicate() writes the input to stdin on the calling thread with a single write() call. If the input exceeds the pipe capacity (4 KiB by default -- but a pipesize 'suggested' size could be supported), the write will block until the child process reads the excess data. This could block indefinitely, which will effectively ignore a given timeout. The POSIX implementation, in contrast, correctly handles a timeout in this case.

Another problem is that Popen.__exit__() closes the stdout, stderr, and stdin files without regard to the _communicate() worker threads. This may seem innocuous, but if a worker thread is blocked on synchronous I/O with one of these files, WinAPI CloseHandle() will also block if it's closing the last handle for the file in the current process. (In this case, the kernel I/O manager has a close procedure that waits to acquire the file for the current thread before performing various housekeeping operations, primarily in the filesystem, such as clearing byte-range locks set by the current process.) A blocked close() is easy to demonstrate. For example:

    args = 'python -c "import time; time.sleep(99)"'
    p = subprocess.Popen(args, shell=True, stdout=subprocess.PIPE)
    try: p.communicate(timeout=1)
    except: pass

    p.kill() # terminates the shell process -- not python.exe
    with p: pass # stdout.close() blocks until python.exe exits

I think the Windows implementation of Popen._communicate() needs to be redesigned as follows: 

    * read in chunks, with a size from 1 byte up to the maximum available,
      as determined by _winapi.PeekNamedPipe()
    * write to the child's stdin on a separate thread
    * after communicate() has started, ensure that synchronous I/O in worker
      threads has been canceled via CancelSynchronousIo() before closing 
      the pipes.

The _winapi module would need to wrap OpenThread() and CancelSynchronousIo(), plus define the TERMINATE_THREAD (0x0001) access right.

With the proposed changes, subprocess.run() would no longer special case TimeoutExpired for Windows.
History
Date User Action Args
2021-02-28 22:54:01eryksunsetrecipients: + eryksun, paul.moore, tim.golden, zach.ware, steve.dower
2021-02-28 22:54:01eryksunsetmessageid: <1614552841.1.0.31217184811.issue43346@roundup.psfhosted.org>
2021-02-28 22:54:00eryksunlinkissue43346 messages
2021-02-28 22:53:59eryksuncreate