Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to set pipe size on subprocess.Popen. #85752

Closed
rhpvorderman mannequin opened this issue Aug 19, 2020 · 5 comments
Closed

Allow to set pipe size on subprocess.Popen. #85752

rhpvorderman mannequin opened this issue Aug 19, 2020 · 5 comments
Assignees
Labels
3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@rhpvorderman
Copy link
Mannequin

rhpvorderman mannequin commented Aug 19, 2020

BPO 41586
Nosy @gpshead, @rhpvorderman
PRs
  • bpo-41586: Add pipesize parameter to subprocess. Add F_GETPIPE_SZ and F_SETPIPE_SZ to fcntl. #21921
  • bpo-41586: Attempt to make the pipesize tests more robust. #22839
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2020-10-21.06:04:04.119>
    created_at = <Date 2020-08-19.05:46:00.673>
    labels = ['type-feature', 'library', '3.10']
    title = 'Allow to set pipe size on subprocess.Popen.'
    updated_at = <Date 2020-10-21.06:04:04.118>
    user = 'https://github.com/rhpvorderman'

    bugs.python.org fields:

    activity = <Date 2020-10-21.06:04:04.118>
    actor = 'gregory.p.smith'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2020-10-21.06:04:04.119>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2020-08-19.05:46:00.673>
    creator = 'rhpvorderman'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41586
    keywords = ['patch']
    message_count = 5.0
    messages = ['375636', '379059', '379060', '379148', '379176']
    nosy_count = 2.0
    nosy_names = ['gregory.p.smith', 'rhpvorderman']
    pr_nums = ['21921', '22839']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue41586'
    versions = ['Python 3.10']

    @rhpvorderman
    Copy link
    Mannequin Author

    rhpvorderman mannequin commented Aug 19, 2020

    Pipes block if reading from an empty pipe or when writing to a full pipe. When this happens the program waiting for the pipe still uses a lot of CPU cycles when waiting for the pipe to stop blocking.

    I found this while working with xopen. A library that pipes data into an external gzip process. (This is more efficient than using python's gzip module, because the subprocess escapes the GIL, so your main algorithm can fully utilize one CPU core while the compression is offloaded to another).

    It turns out that increasing the pipe size on Linux from the default of 64KB to the maximum allowed pipe size in /proc/sys/fs/max-pipe-size (1024KB) drastically improves performance: pycompression/xopen#35. TLDR: full utilization of CPU cores, a 40%+ decrease in wall-clock time and a 20% decrease in the number of compute seconds (indicating that 20% was wasted waiting on blocking pipes).

    However, doing this with subprocess is quite involved as it is now.

    1. You have to find out which constants to use in fcntl for setting the pipesize (these constants are not in python).
    2. You have to start the Popen process with routing stdout to subprocess.Pipe.
    3. You have to get my_popen_process.stdout.fileno()
    4. Use fcntl.fcntl to modify the pipe size.

    It would be much easier to do subprocess.Popen(args, pipesize=1024 *1024) for example.

    I am currently working on a PR implementing this. It will also make F_GETPIPE_SZ and F_SETPIPE_SZ available to the fcntl module.

    @rhpvorderman rhpvorderman mannequin added 3.10 only security fixes stdlib Python modules in the Lib dir labels Aug 19, 2020
    @gpshead
    Copy link
    Member

    gpshead commented Oct 19, 2020

    New changeset 23c0fb8 by Ruben Vorderman in branch 'master':
    bpo-41586: Add pipesize parameter to subprocess & F_GETPIPE_SZ and F_SETPIPE_SZ to fcntl. (GH-21921)
    23c0fb8

    @gpshead
    Copy link
    Member

    gpshead commented Oct 19, 2020

    Thanks Ruben!

    @gpshead gpshead closed this as completed Oct 19, 2020
    @gpshead gpshead added the type-feature A feature request or enhancement label Oct 19, 2020
    @gpshead gpshead closed this as completed Oct 19, 2020
    @gpshead gpshead added the type-feature A feature request or enhancement label Oct 19, 2020
    @gpshead
    Copy link
    Member

    gpshead commented Oct 20, 2020

    this caused a variety of buildbot failures. investigating.

    @gpshead gpshead reopened this Oct 20, 2020
    @gpshead gpshead reopened this Oct 20, 2020
    @gpshead
    Copy link
    Member

    gpshead commented Oct 21, 2020

    New changeset 786addd by Gregory P. Smith in branch 'master':
    bpo-41586: Attempt to make the pipesize tests more robust. (GH-22839)
    786addd

    @gpshead gpshead closed this as completed Oct 21, 2020
    @gpshead gpshead closed this as completed Oct 21, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant