classification
Title: slow subprocess.Popen(..., close_fds=True)
Type: performance Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: wont fix
Dependencies: Superseder:
Assigned To: Nosy List: Kirill Kolyshkin, benjamin.peterson, josh.r
Priority: normal Keywords:

Created on 2019-01-17 04:29 by Kirill Kolyshkin, last changed 2019-01-18 04:56 by benjamin.peterson. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 11584 closed Kirill Kolyshkin, 2019-01-17 04:29
Messages (3)
msg333819 - (view) Author: Kirill Kolyshkin (Kirill Kolyshkin) * Date: 2019-01-17 04:29
In case close_fds=True is passed to subprocess.Popen()
or its users (subprocess.call() etc), it might spend
some considerable time closing non-opened file descriptors,
as demonstrated by the following snippet from strace:

    close(3)                    = -1 EBADF (Bad file descriptor)
    close(5)                    = -1 EBADF (Bad file descriptor)
    close(6)                    = -1 EBADF (Bad file descriptor)
    close(7)                    = -1 EBADF (Bad file descriptor)
    ...
    close(1021)                 = -1 EBADF (Bad file descriptor)
    close(1022)                 = -1 EBADF (Bad file descriptor)
    close(1023)                 = -1 EBADF (Bad file descriptor)

This happens because the code in _close_fds() iterates from 3 up to
MAX_FDS = os.sysconf("SC_OPEN_MAX").

Now, syscalls are cheap, but SC_OPEN_MAX (also known as RLIMIT_NOFILE
or ulimit -n) can be quite high, for example:

    $ docker run --rm python python3 -c \
            $'import os\nprint(os.sysconf("SC_OPEN_MAX"))'
    1048576

This means a million syscalls before spawning a child process, which
can result in a major delay, like 0.1s as measured on my fast and mostly
idling laptop. Here is the comparison with python3 (which does not have this problem):

    $ docker run --rm python python3 -c $'import subprocess\nimport time\ns = time.time()\nsubprocess.check_call([\'/bin/true\'], close_fds=True)\nprint(time.time() - s)\n'
    0.0009245872497558594

    $ docker run --rm python python2 -c $'import subprocess\nimport time\ns = time.time()\nsubprocess.check_call([\'/bin/true\'], close_fds=True)\nprint(time.time() - s)\n'
    0.0964419841766
msg333932 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2019-01-18 02:24
Others can correct me if I'm wrong, but I'm fairly sure 2.7 isn't making changes unless they fix critical or security-related bugs.

The code here is suboptimal, but it's already been fixed in Python 3 (in #8052), as part of a C accelerator module (that reduces the risk of race conditions and other conflicts your Python level fix entails). Unless someone corrects me, I'll close this as "Won't Fix".
msg333940 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-01-18 04:56
Yeah, this is WONTFIX particularly since you can have the Python3 implementation in Python 2 easily with https://pypi.org/project/subprocess32/.
History
Date User Action Args
2019-01-18 04:56:53benjamin.petersonsetstatus: open -> closed

nosy: + benjamin.peterson
messages: + msg333940

resolution: wont fix
stage: resolved
2019-01-18 02:24:17josh.rsetnosy: + josh.r
messages: + msg333932
2019-01-17 04:29:01Kirill Kolyshkincreate