
ProcessPoolExecutor(max_workers=64) crashes on Windows #71090

Closed
diogocp mannequin opened this issue May 1, 2016 · 14 comments
Assignees
Labels
build The build process and cross-build type-bug An unexpected behavior, bug, or error

Comments

@diogocp
Mannequin

diogocp mannequin commented May 1, 2016

BPO 26903
PRs
  • bpo-26903: Limit ProcessPoolExecutor to 61 workers on Windows #13132
  • [3.7] bpo-26903: Limit ProcessPoolExecutor to 61 workers on Windows (GH-13132) #13206
  • [3.7] bpo-26903: Limit ProcessPoolExecutor to 61 workers on Windows (GH-13132) #13643
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/brianquinlan'
    closed_at = <Date 2019-05-09.17:37:35.091>
    created_at = <Date 2016-05-01.20:45:38.281>
    labels = ['type-bug', 'build']
    title = 'ProcessPoolExecutor(max_workers=64) crashes on Windows'
    updated_at = <Date 2021-11-04.13:56:41.517>
    user = 'https://github.com/diogocp'

    bugs.python.org fields:

    activity = <Date 2021-11-04.13:56:41.517>
    actor = 'eryksun'
    assignee = 'bquinlan'
    closed = True
    closed_date = <Date 2019-05-09.17:37:35.091>
    closer = 'bquinlan'
    components = ['Cross-Build']
    creation = <Date 2016-05-01.20:45:38.281>
    creator = 'diogocp'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 26903
    keywords = ['patch']
    message_count = 12.0
    messages = ['264608', '265007', '265086', '265206', '340390', '341545', '341571', '341918', '343858', '365886', '365901', '366314']
    nosy_count = 0.0
    nosy_names = []
    pr_nums = ['13132', '13206', '13643']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26903'
    versions = ['Python 3.6']

    @diogocp
    Mannequin Author

    diogocp mannequin commented May 1, 2016

    I'm using Python 3.5.1 x86-64 on Windows Server 2008 R2. Trying to run the ProcessPoolExecutor example [1] generates this exception:

    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "C:\Program Files\Python35\lib\threading.py", line 914, in _bootstrap_inner
        self.run()
      File "C:\Program Files\Python35\lib\threading.py", line 862, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Program Files\Python35\lib\concurrent\futures\process.py", line 270, in _queue_management_worker
        ready = wait([reader] + sentinels)
      File "C:\Program Files\Python35\lib\multiprocessing\connection.py", line 859, in wait
        ready_handles = _exhaustive_wait(waithandle_to_obj.keys(), timeout)
      File "C:\Program Files\Python35\lib\multiprocessing\connection.py", line 791, in _exhaustive_wait
        res = _winapi.WaitForMultipleObjects(L, False, timeout)
    ValueError: need at most 63 handles, got a sequence of length 64

    The problem seems to be related to the value of the Windows constant MAXIMUM_WAIT_OBJECTS (see [2]), which is 64. This machine has 64 logical cores, so ProcessPoolExecutor defaults to 64 workers.

    Lowering max_workers to 63 or 62 still results in the same exception, but max_workers=61 works fine.

    [1] https://docs.python.org/3.5/library/concurrent.futures.html#processpoolexecutor-example
    [2] https://hg.python.org/cpython/file/80d1faa9735d/Modules/_winapi.c#l1339
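
    A minimal sketch of a repro for the report above, assuming an affected Python build on Windows (square() and the input range are placeholders, not part of the original report):

        import concurrent.futures

        def square(x):
            return x * x

        if __name__ == "__main__":
            # Explicitly requesting 64 workers trips the MAXIMUM_WAIT_OBJECTS
            # limit inside multiprocessing.connection.wait() on affected
            # versions; 61 or fewer workers is reported to work.
            with concurrent.futures.ProcessPoolExecutor(max_workers=64) as executor:
                print(list(executor.map(square, range(8))))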

    @diogocp diogocp mannequin added the OS-windows label May 1, 2016
    @terryjreedy
    Member

    The example runs fine, in about 1 second, on my 6-core (which I guess is 12 logical cores) Pentium. I am guessing that the default number of workers needs to be changed, at least on Windows, to min(#logical_cores, 60).

    @tim-one
    Member

    tim-one commented May 7, 2016

    Just noting that the multiprocessing module can be used instead. In the example, add

        import multiprocessing as mp

    and change

            with concurrent.futures.ProcessPoolExecutor() as executor:

    to

            with mp.Pool() as executor:

    That's all it takes. On my 4-core Win10 box (8 logical cores), that continued to work fine even when passing 1024 to mp.Pool() (although it obviously burned time and RAM to create over a thousand processes).

    Some quick Googling strongly suggests there's no reasonably general way to overcome the Windows-defined MAXIMUM_WAIT_OBJECTS=64 for implementations that call the Windows WaitForMultipleObjects().
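
    Put together, a self-contained sketch of the substitution Tim describes (square() and the input range are placeholders; the behaviour claims come from his comment above):

        import multiprocessing as mp

        def square(x):
            return x * x

        if __name__ == "__main__":
            # mp.Pool() defaults to os.cpu_count() workers; unlike
            # ProcessPoolExecutor on affected versions, it reportedly does not
            # hit the 63-handle WaitForMultipleObjects limit, even for pools
            # far larger than 64 processes.
            with mp.Pool() as executor:
                print(list(executor.map(square, range(16))))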

    @zooba
    Member

    zooba commented May 9, 2016

    > Some quick Googling strongly suggests there's no reasonably general way to overcome the Windows-defined MAXIMUM_WAIT_OBJECTS=64 for implementations that call the Windows WaitForMultipleObjects().

    The recommended way to deal with this is to spin up threads to do the wait (which sounds horribly inefficient, but threads on Windows are cheap, especially if they are waiting on kernel objects), and then wait on each thread.

    Personally I think it'd be fine to make the _winapi module do that transparently for WaitForMultipleObjects, as it's complicated to get right (you need to ensure you map back to the original handle, timeouts and cancellation get complicated, there are real race conditions (mainly for auto-reset events), etc.), but in all circumstances it's better than just failing immediately. Handling it within multiprocessing isn't a bad idea, but won't help other users.

    I'd love to write the code to do it, but I doubt I'll get time (especially since I'm missing the PyCon US sprints this year). Happy to help someone else through it. We're going to see Python being used on more and more multicore systems over time, where this will become a genuine issue.
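
    For illustration only, a rough Python-level sketch of that chunked-wait idea, expressed with multiprocessing.connection.wait() per chunk rather than raw _winapi.WaitForMultipleObjects(); wait_large(), the 60-object chunk size, and the 0.05 s poll interval are invented for the sketch, and the timeout/cancellation handling is exactly the simplified part Steve warns about:

        import threading
        import multiprocessing.connection as mpc

        _CHUNK = 60  # stay safely under the Windows MAXIMUM_WAIT_OBJECTS limit of 64

        def wait_large(objects, timeout=None):
            """Wait until at least one of `objects` is ready, even if there are more than 63."""
            if not objects:
                return []
            ready = []
            ready_lock = threading.Lock()
            done = threading.Event()

            def worker(chunk):
                # Poll in short slices so this thread can exit once another chunk
                # (or the caller's timeout) has already finished the overall wait.
                while not done.is_set():
                    hit = mpc.wait(chunk, timeout=0.05)
                    if hit:
                        with ready_lock:
                            ready.extend(hit)
                        done.set()
                        return

            threads = [threading.Thread(target=worker, args=(list(objects[i:i + _CHUNK]),))
                       for i in range(0, len(objects), _CHUNK)]
            for t in threads:
                t.start()
            done.wait(timeout)  # block until some chunk reports readiness, or time out
            done.set()          # stop any chunks that are still polling
            for t in threads:
                t.join()
            return ready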

    @zooba zooba added the type-bug An unexpected behavior, bug, or error label May 9, 2016
    @rbtcollins
    Member

    This is now showing up in end user tools like black: psf/black#564

    @rbtcollins rbtcollins added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels Apr 17, 2019
    @brianquinlan
    Contributor

    If no one has short-term plans to improve multiprocessing.connection.wait, then I'll update the docs to list this limitation, ensure that ProcessPoolExecutor never defaults to >60 processes on Windows, and have it raise a ValueError if the user explicitly passes a larger number.

    @brianquinlan brianquinlan self-assigned this May 6, 2019
    @brianquinlan
    Contributor

    BTW, the 61 process limit comes from:

    63 - <the result queue reader> - <the thread wakeup reader>
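
    For reference, a sketch of the kind of guard this works out to (not necessarily the exact code merged for GH-13132; _resolve_max_workers() and the _MAX_WINDOWS_WORKERS name are illustrative):

        import os
        import sys

        # 63 usable wait slots, minus the result-queue reader and the
        # thread-wakeup reader that the executor itself must wait on.
        _MAX_WINDOWS_WORKERS = 63 - 2

        def _resolve_max_workers(max_workers=None):
            if max_workers is None:
                max_workers = os.cpu_count() or 1
                if sys.platform == "win32":
                    # Cap the default so large machines don't crash out of the box.
                    max_workers = min(max_workers, _MAX_WINDOWS_WORKERS)
            elif sys.platform == "win32" and max_workers > _MAX_WINDOWS_WORKERS:
                raise ValueError(f"max_workers must be <= {_MAX_WINDOWS_WORKERS}")
            return max_workers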

    @zooba
    Member

    zooba commented May 8, 2019

    New changeset 3988986 by Steve Dower (Brian Quinlan) in branch 'master':
    bpo-26903: Limit ProcessPoolExecutor to 61 workers on Windows (GH-13132)

    @ned-deily
    Member

    New changeset 8ea0fd8 by Ned Deily (Miss Islington (bot)) in branch '3.7':
    bpo-26903: Limit ProcessPoolExecutor to 61 workers on Windows (GH-13132) (GH-13643)

    @ned-deily ned-deily removed the 3.9 only security fixes label May 29, 2019
    @MikeHommey
    Mannequin

    MikeHommey mannequin commented Apr 7, 2020

    This is still a problem in python 3.7 (and, I guess 3.8).

    Even without passing max_workers, it fails with a ValueError from _winapi.WaitForMultipleObjects, with the message "need at most 63 handles, got a sequence of length 63".

    That happens with max_workers=None and with max_workers=61, but not with max_workers=60.

    I wonder if there's an off-by-one in this test:

    if (nhandles < 0 || nhandles >= MAXIMUM_WAIT_OBJECTS - 1) {
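
    Spelling out the arithmetic of that quoted check (a hypothetical restatement in Python, not a verdict on the cause, which was deferred to the follow-up issue below):

        MAXIMUM_WAIT_OBJECTS = 64  # Windows constant

        def accepts(nhandles):
            # Mirrors: if (nhandles < 0 || nhandles >= MAXIMUM_WAIT_OBJECTS - 1)
            return not (nhandles < 0 or nhandles >= MAXIMUM_WAIT_OBJECTS - 1)

        print(accepts(62))  # True
        print(accepts(63))  # False -- rejected, even though the error message
                            # says "need at most 63 handles"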

    @zooba
    Member

    zooba commented Apr 7, 2020

    More likely there's been another change to the events that are listened to by multiprocessing, which didn't update the overall limit.

    File a new bug, please.

    @mingwandroid
    Mannequin

    mingwandroid mannequin commented Apr 13, 2020

    I took the liberty of filing this: https://bugs.python.org/issue40263

    Cheers.

    @ahmedsayeed1982 ahmedsayeed1982 mannequin added build The build process and cross-build and removed OS-windows 3.7 (EOL) end of life 3.8 only security fixes labels Nov 4, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @mboeringa

    mboeringa commented Oct 21, 2023

    @brianquinlan and @ned-deily

    Is this artificial limitation on the number of 'max_workers' that can be used with a Python 'ProcessPoolExecutor' still the valid fix for Windows 11 and Windows Server 2022?

    This issue sounds very much like the Windows "processor group" issues, where software not aware of "processor groups" and not explicitly written to deal with them would fail to run properly on systems with more than 64 logical processors. I was personally bitten by this issue when I upgraded my workstation from 2x 14C/28T processors to 2x 22C/44T processors.

    According to Microsoft documentation, Windows 11 and Server 2022 are supposed to finally fix this issue, and allow all software, even if not specifically written to deal with processor groups and CPU sets, to scale out across all logical processors without modifications.

    Unfortunately, I have not been able to test it, as I cannot upgrade the workstation to W11 due to the older hardware. In fact, I disabled multi-threading and NUMA in the BIOS of my system to avoid any issues under W10 and stay below the 64-logical-processor limit (which, in my experience so far, has not notably affected performance).

    @zooba
    Member

    zooba commented Oct 23, 2023

    The limitation is in WaitForMultipleObjectsEx (more precisely, MAXIMUM_WAIT_OBJECTS). I think we're assuming it's still set to 64, but I haven't heard of any changes to this.

    My PR #107873 would help raise the limit. Just needs some review and testing.
