Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subprocess.run timeout does not function if shell=True and capture_output=True #81605

Closed
haizaar mannequin opened this issue Jun 27, 2019 · 14 comments
Closed

subprocess.run timeout does not function if shell=True and capture_output=True #81605

haizaar mannequin opened this issue Jun 27, 2019 · 14 comments
Assignees
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@haizaar
Copy link
Mannequin

haizaar mannequin commented Jun 27, 2019

BPO 37424
Nosy @Yhg1s, @gpshead, @vstinner, @vadmium, @cappy2112@gmail.com, @miss-islington, @tirkarthi, @haizaar
PRs
  • bpo-37424: Avoid a hang in subprocess.run timeout output capture #14490
  • [3.8] bpo-37424: Avoid a hang in subprocess.run timeout output capture (GH-14490) #15897
  • [3.7] bpo-37424: Avoid a hang in subprocess.run timeout output capture (GH-14490) #15898
  • Superseder
  • bpo-30154: subprocess.run with stderr connected to a pipe won't timeout when killing a never-ending shell commanad
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2019-09-11.14:42:27.911>
    created_at = <Date 2019-06-27.06:59:50.859>
    labels = ['3.7', 'type-bug', 'library']
    title = 'subprocess.run timeout does not function if shell=True and capture_output=True'
    updated_at = <Date 2019-10-01.20:26:06.077>
    user = 'https://github.com/haizaar'

    bugs.python.org fields:

    activity = <Date 2019-10-01.20:26:06.077>
    actor = 'vstinner'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2019-09-11.14:42:27.911>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2019-06-27.06:59:50.859>
    creator = 'haizaar'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37424
    keywords = ['patch']
    message_count = 14.0
    messages = ['346712', '346714', '346718', '346735', '346948', '351775', '351792', '351793', '351813', '351912', '352677', '353705', '353707', '353710']
    nosy_count = 8.0
    nosy_names = ['twouters', 'gregory.p.smith', 'vstinner', 'martin.panter', 'cappy', 'miss-islington', 'xtreak', 'haizaar']
    pr_nums = ['14490', '15897', '15898']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'commit review'
    status = 'closed'
    superseder = '30154'
    type = 'behavior'
    url = 'https://bugs.python.org/issue37424'
    versions = ['Python 3.7']

    @haizaar
    Copy link
    Mannequin Author

    haizaar mannequin commented Jun 27, 2019

    Consider the following:

    subprocess.run('sleep 10', shell=True, timeout=.1, capture_output=True)

    It should raise after 0.1 seconds, but it does not - it waits 10 seconds till sleep finishes and only then raises "subprocess.TimeoutExpired: Command 'sleep 10' timed out after 0.1 seconds"

    Removing 'capture_output=True' or converting command string to list (and removing shell=True) makes it work.

    I'm using Python 3.7.3 on Ubuntu 16.04. Reproduces on official docker Python 3.7.3 image alpine3.8.

    @haizaar haizaar mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 27, 2019
    @tirkarthi
    Copy link
    Member

    On mac this exits immediately with 0.1 seconds as timeout but reproducible on Ubuntu with master.

    @vadmium
    Copy link
    Member

    vadmium commented Jun 27, 2019

    Same thing going on as in bpo-30154. The shell is probably spawning the “sleep” command as a child process (grandchild of Python), and waiting for it to exit. When Python times out, it will kill the shell process, leaving the grandchild as an orphan. The “sleep” process will still be running and probably holds the “stdout” and/or “stderr” pipes open, and Python will wait indefinitely to be sure it has captured all the output to those pipes.

    Also see bpo-26534 proposes APIs to kill a process group rather than the single child process.

    @vadmium vadmium closed this as completed Jun 27, 2019
    @haizaar
    Copy link
    Mannequin Author

    haizaar mannequin commented Jun 27, 2019

    Thanks for looking at it.

    My original code had "tar" running, which is a child of the shell as well... I assume running exec in the shell may help somewhat, but not a cure obviously.

    I'm all for killing the process group. "Run something and get it's about" should be simple enough without requiring a programmer to know all POSIX process semantics.

    @gpshead
    Copy link
    Member

    gpshead commented Jul 1, 2019

    bpo-30154 that I've marked as a duplicate demonstrates this problem without using shell=True. The solution I proposed handles that via the additional small timeout on the cleanup side, but still has the caveat that the grandchild processes keep running unless the caller used start_new_session=True.

    See the PR.

    We cannot reasonably determine when start_new_session=True should be a default behavior. And I worry that doing it when it should not be will cause unexpected new problems with existing code.

    @gpshead gpshead reopened this Jul 4, 2019
    @gpshead gpshead self-assigned this Jul 4, 2019
    @Yhg1s
    Copy link
    Member

    Yhg1s commented Sep 11, 2019

    New changeset 580d278 by T. Wouters (Gregory P. Smith) in branch 'master':
    bpo-37424: Avoid a hang in subprocess.run timeout output capture (GH-14490)
    580d278

    @miss-islington
    Copy link
    Contributor

    New changeset 872c85a by Miss Islington (bot) in branch '3.8':
    bpo-37424: Avoid a hang in subprocess.run timeout output capture (GH-14490)
    872c85a

    @miss-islington
    Copy link
    Contributor

    New changeset 5fe153c by Miss Islington (bot) in branch '3.7':
    bpo-37424: Avoid a hang in subprocess.run timeout output capture (GH-14490)
    5fe153c

    @vstinner
    Copy link
    Member

    On Windows, the following pattern _can_ hang:

    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
      return proc.communicate(timeout=1.0)
    except TimeoutExpired:
      proc.kill()
      return proc.communicate() # <== HERE

    Even if the first child process is killed, communicate() waits until the pipe is closed. If the child process spawned a 3rd process before being killed, the second .communicate() calls hangs until the 3rd process exit or close its stdout.

    I'm not sure if subprocess.run() should do anything about this case, but it was at least for your information.

    I'm fighting against this issue in bpo-37531.

    IMHO it's an issue of the Windows implementation of Popen.communicate(): it's implemented with a blocking call to stdout.read() run in a thread. The thread cannot be interrupted in any way and will until complete once stdout is closed.

    Again, if the direct child process spawned other processes, stdout is only closed in the main parent process once all child processes exit or at least closed their stdout.

    Maybe another issue should be opened to avoid blocking with the following pattern on Windows:

    proc.kill()
    proc.communicate()

    @gpshead
    Copy link
    Member

    gpshead commented Sep 11, 2019

    Thanks. I believe this issue is fixed but you've identified follow-on issues. lets follow up on those in their own bugs.

    @gpshead gpshead closed this as completed Sep 11, 2019
    @vstinner
    Copy link
    Member

    On Windows, the following pattern _can_ hang: (...)

    I created bpo-38207 "subprocess: On Windows, Popen.kill() + Popen.communicate() is blocking until all processes using the pipe close the pipe" to track this issue.

    @cappy2112gmailcom
    Copy link
    Mannequin

    cappy2112gmailcom mannequin commented Oct 1, 2019

    I'm still seeing hangs with subprocess.run() in Python 3.7.4
    Unfortunately, it involves talking to an NVME SSD on Linux, so I cannot
    easily submit code to duplicate it.

    @cappy2112gmailcom
    Copy link
    Mannequin

    cappy2112gmailcom mannequin commented Oct 1, 2019

    Using Python 3.7.4, I'm calling subprocess.run() with the following arguments. .run() still hangs even though a timeout is being passed in.

    subprocess.run(cmd_list,                                            stdout=subprocess.PIPE,                                           stderr=subprocess.STDOUT,                                           shell=False,                                           timeout=timeout_val,                                           check=True,                                           universal_newlines=True)

    cmd_list contains the name of the bash script below, which is
    ./rescan.sh

    ------------------------------------------------------------------
    #!/usr/bin/bash

    echo Rescanning system for PCIe devices

    echo "Rescan device"
    echo 1 > /sys/bus/pci/rescan

    sleep 5

    if [ lspci | grep -ic "Non-Volatile memory controller" -gt 0 ]
    then
    echo "Device Detected after Rescan"
    else
    echo "Device NOT detected after Rescan"
    exit 1
    fi

    echo Rescan Done

    This script is scanning for NVME SSDs, so duplicating the issue is not as straightforward as submitting a python script.

    The OS is CentOS 7.

    uname -a shows
    3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

    I know the Kernel is old, but we have a restriction against updating it.

    @vstinner
    Copy link
    Member

    vstinner commented Oct 1, 2019

    I'm still seeing hangs with subprocess.run() in Python 3.7.4

    That's not surprising, the fix has been pushed at 2019-09-11. Python 3.7.5 will include the fix and it will be released soon:
    https://www.python.org/dev/peps/pep-0537/

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants