Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slow subprocess.Popen(..., close_fds=True) #79938

Closed
kolyshkin mannequin opened this issue Jan 17, 2019 · 3 comments
Closed

slow subprocess.Popen(..., close_fds=True) #79938

kolyshkin mannequin opened this issue Jan 17, 2019 · 3 comments
Labels
performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@kolyshkin
Copy link
Mannequin

kolyshkin mannequin commented Jan 17, 2019

BPO 35757
Nosy @benjaminp, @MojoVampire, @kolyshkin
PRs
  • [2.7] bpo-35757: subprocess.Popen: optimize close_fds for Linux #11584
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-01-18.04:56:53.906>
    created_at = <Date 2019-01-17.04:29:01.005>
    labels = ['library', 'performance']
    title = 'slow subprocess.Popen(..., close_fds=True)'
    updated_at = <Date 2019-01-18.04:56:53.903>
    user = 'https://github.com/kolyshkin'

    bugs.python.org fields:

    activity = <Date 2019-01-18.04:56:53.903>
    actor = 'benjamin.peterson'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-01-18.04:56:53.906>
    closer = 'benjamin.peterson'
    components = ['Library (Lib)']
    creation = <Date 2019-01-17.04:29:01.005>
    creator = 'Kirill Kolyshkin'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35757
    keywords = []
    message_count = 3.0
    messages = ['333819', '333932', '333940']
    nosy_count = 3.0
    nosy_names = ['benjamin.peterson', 'josh.r', 'Kirill Kolyshkin']
    pr_nums = ['11584']
    priority = 'normal'
    resolution = 'wont fix'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue35757'
    versions = ['Python 2.7']

    @kolyshkin
    Copy link
    Mannequin Author

    kolyshkin mannequin commented Jan 17, 2019

    In case close_fds=True is passed to subprocess.Popen()
    or its users (subprocess.call() etc), it might spend
    some considerable time closing non-opened file descriptors,
    as demonstrated by the following snippet from strace:

    close(3)                    = -1 EBADF (Bad file descriptor)
    close(5)                    = -1 EBADF (Bad file descriptor)
    close(6)                    = -1 EBADF (Bad file descriptor)
    close(7)                    = -1 EBADF (Bad file descriptor)
    ...
    close(1021)                 = -1 EBADF (Bad file descriptor)
    close(1022)                 = -1 EBADF (Bad file descriptor)
    close(1023)                 = -1 EBADF (Bad file descriptor)
    

    This happens because the code in _close_fds() iterates from 3 up to
    MAX_FDS = os.sysconf("SC_OPEN_MAX").

    Now, syscalls are cheap, but SC_OPEN_MAX (also known as RLIMIT_NOFILE
    or ulimit -n) can be quite high, for example:

        $ docker run --rm python python3 -c \
                $'import os\nprint(os.sysconf("SC_OPEN_MAX"))'
        1048576

    This means a million syscalls before spawning a child process, which
    can result in a major delay, like 0.1s as measured on my fast and mostly
    idling laptop. Here is the comparison with python3 (which does not have this problem):

        $ docker run --rm python python3 -c $'import subprocess\nimport time\ns = time.time()\nsubprocess.check_call([\'/bin/true\'], close_fds=True)\nprint(time.time() - s)\n'
        0.0009245872497558594
    
        $ docker run --rm python python2 -c $'import subprocess\nimport time\ns = time.time()\nsubprocess.check_call([\'/bin/true\'], close_fds=True)\nprint(time.time() - s)\n'
        0.0964419841766

    @kolyshkin kolyshkin mannequin added stdlib Python modules in the Lib dir performance Performance or resource usage labels Jan 17, 2019
    @MojoVampire
    Copy link
    Mannequin

    MojoVampire mannequin commented Jan 18, 2019

    Others can correct me if I'm wrong, but I'm fairly sure 2.7 isn't making changes unless they fix critical or security-related bugs.

    The code here is suboptimal, but it's already been fixed in Python 3 (in bpo-8052), as part of a C accelerator module (that reduces the risk of race conditions and other conflicts your Python level fix entails). Unless someone corrects me, I'll close this as "Won't Fix".

    @benjaminp
    Copy link
    Contributor

    Yeah, this is WONTFIX particularly since you can have the Python3 implementation in Python 2 easily with https://pypi.org/project/subprocess32/.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant