Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: maxtasksperchild for ProcessPoolExecutor #88896

Closed
cool-RR mannequin opened this issue Jul 24, 2021 · 14 comments
Closed

Feature request: maxtasksperchild for ProcessPoolExecutor #88896

cool-RR mannequin opened this issue Jul 24, 2021 · 14 comments
Assignees
Labels
3.11 only security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-feature A feature request or enhancement

Comments

@cool-RR
Copy link
Mannequin

cool-RR mannequin commented Jul 24, 2021

BPO 44733
Nosy @gpshead, @pitrou, @cool-RR, @loganasherjones
PRs
  • bpo-44733: Add maxtasksperchild to ProcessPoolExecutor #27373
  • bpo-44733: max_tasks_per_child + fork not allowed #32187
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = None
    created_at = <Date 2021-07-24.12:14:41.532>
    labels = ['type-feature', 'library', '3.11']
    title = 'Feature request: maxtasksperchild for ProcessPoolExecutor'
    updated_at = <Date 2022-03-31.00:07:58.887>
    user = 'https://github.com/cool-RR'

    bugs.python.org fields:

    activity = <Date 2022-03-31.00:07:58.887>
    actor = 'loganasherjones'
    assignee = 'gregory.p.smith'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-07-24.12:14:41.532>
    creator = 'cool-RR'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44733
    keywords = ['patch']
    message_count = 13.0
    messages = ['398143', '398240', '398243', '406689', '411449', '411450', '411452', '411471', '411546', '411547', '416310', '416314', '416406']
    nosy_count = 4.0
    nosy_names = ['gregory.p.smith', 'pitrou', 'cool-RR', 'loganasherjones']
    pr_nums = ['27373', '32187']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue44733'
    versions = ['Python 3.11']

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Jul 24, 2021

    I love concurrent.futures, and I'd like to use it wherever I can. There's a feature in multiprocessing.Pool that I wish would also be available in ProcessPoolExecutor: The maxtasksperchild argument.

    Documentation: "maxtasksperchild is the number of tasks a worker process can complete before it will exit and be replaced with a fresh worker process, to enable unused resources to be freed. The default maxtasksperchild is None, which means worker processes will live as long as the pool."

    I want to be able to set it to 1, so each process will only execute one task and then be replaced with a fresh process.

    @cool-RR cool-RR mannequin added 3.11 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jul 24, 2021
    @loganasherjones
    Copy link
    Mannequin

    loganasherjones mannequin commented Jul 26, 2021

    I think I have a solution for this, but I'm pretty new to contributing. Still writing up some tests.

    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Jul 26, 2021

    Awesome. You can link to experimental code here. Even if it were not accepted to Python, it could be useful for me and for other people who might see this issue.

    @pitrou
    Copy link
    Member

    pitrou commented Nov 20, 2021

    New changeset fdc0e09 by Logan Jones in branch 'main':
    bpo-44733: Add max_tasks_per_child to ProcessPoolExecutor (GH-27373)
    fdc0e09

    @pitrou pitrou closed this as completed Nov 20, 2021
    @pitrou pitrou closed this as completed Nov 20, 2021
    @gpshead
    Copy link
    Member

    gpshead commented Jan 24, 2022

    bpo-46464 describes a deadlock that the pre-requisite for this feature causes.

    Spawning new children directly via fork() is impossible once a thread in the parent process has been started and concurrent.futures.process starts a thread.

    @gpshead
    Copy link
    Member

    gpshead commented Jan 24, 2022

    If we want this feature to stay in, it needs an implementation that does not rely on a multithreaded processing calling fork(). I'm going to have to revert the existing implementation for bpo-46464.

    @gpshead gpshead reopened this Jan 24, 2022
    @gpshead gpshead reopened this Jan 24, 2022
    @cool-RR
    Copy link
    Mannequin Author

    cool-RR mannequin commented Jan 24, 2022

    Oh that sucks. Logan and Antoine worked on this feature for so long. Thanks for reporting Greg.

    @loganasherjones
    Copy link
    Mannequin

    loganasherjones mannequin commented Jan 24, 2022

    Based on my reading I’m hopeful that this change can make it in quickly once I find the time. The previous implementation didn’t care how the processes were created.

    I will look as soon as I can

    @gpshead
    Copy link
    Member

    gpshead commented Jan 25, 2022

    from discussion on the other bug it looks like we should have a way to keep this; we just need to not allow it when the mp_context to be used is the 'fork' one.

    @gpshead
    Copy link
    Member

    gpshead commented Jan 25, 2022

    As this is a new feature, it would also be reasonable to have specifying max_tasks_per_child without explicitly specifying a mp_context default to a safe mp_context.

    @loganasherjones
    Copy link
    Mannequin

    loganasherjones mannequin commented Mar 30, 2022

    Okay, I'm actually able to work on this again. What is the best way to make this change real. Should I be working off of main?

    @gpshead
    Copy link
    Member

    gpshead commented Mar 30, 2022

    yep, branch off of a recent main.

    @loganasherjones
    Copy link
    Mannequin

    loganasherjones mannequin commented Mar 31, 2022

    Ok I now have a PR up with the features requested. Let me know if you need anything else!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @gpshead
    Copy link
    Member

    gpshead commented Jun 17, 2022

    your PR was #32187 and an equivalent ultimately landed in fa4f0a1 if i'm untangling things correctly. So this can be closed again. max_tasks_per_child made it into 3.11beta1.

    @gpshead gpshead closed this as completed Jun 17, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-feature A feature request or enhancement
    Projects
    Development

    No branches or pull requests

    3 participants