multiprocessing.Pool.join() always takes at least 100 ms #79660

Closed
vstinner opened this issue Dec 13, 2018 · 6 comments
Labels: 3.8 (only security fixes), stdlib (Python modules in the Lib dir)

Comments

@vstinner
Member

BPO 35479
Nosy @vstinner
PRs
  • bpo-35479: Optimize multiprocessing.Pool.join() #11136
  • bpo-35379: Check IDLE objects before calling method  #10564
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-21.22:26:17.456>
    created_at = <Date 2018-12-13.00:36:54.696>
    labels = ['3.8', 'library']
    title = 'multiprocessing.Pool.join() always takes at least 100 ms'
    updated_at = <Date 2021-09-21.22:26:17.454>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2021-09-21.22:26:17.454>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-21.22:26:17.456>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2018-12-13.00:36:54.696>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35479
    keywords = ['patch']
    message_count = 6.0
    messages = ['331726', '331727', '331794', '331800', '331805', '402401']
    nosy_count = 1.0
    nosy_names = ['vstinner']
    pr_nums = ['11136', '10564']
    priority = 'normal'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue35479'
    versions = ['Python 3.8']

    @vstinner
    Member Author

    The join() method of multiprocessing.Pool calls self._worker_handler.join(): it's a thread running _handle_workers(). The core of this thread function is:

            while thread._state == RUN or (pool._cache and thread._state != TERMINATE):
                pool._maintain_pool()
                time.sleep(0.1)

    I understand that the 100 ms delay is used to check regularly whether the stop condition has changed. This sleep causes a mandatory delay of 100 ms on Pool.join().
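To illustrate where the delay comes from, here is a minimal sketch of a thread that polls a flag between fixed sleeps, in the same shape as the loop quoted above. `PollingWorker` and `measure_polling_join` are hypothetical names made up for this illustration; this is not the actual `multiprocessing.pool` code.

```python
import threading
import time

POLL_INTERVAL = 0.1  # same 100 ms value as the loop quoted above


class PollingWorker:
    """Hypothetical stand-in for the handler thread (not the real Pool code)."""

    def __init__(self):
        self.running = True
        self.entered = threading.Event()  # set once the loop is about to sleep
        self.thread = threading.Thread(target=self._loop)

    def _loop(self):
        while self.running:
            # The real loop would call pool._maintain_pool() here.
            self.entered.set()
            time.sleep(POLL_INTERVAL)  # cannot be interrupted by a flag change

    def stop_and_join(self):
        self.running = False
        self.thread.join()  # blocks until the current sleep has finished


def measure_polling_join():
    """Return how long join() takes when the thread has just started sleeping."""
    worker = PollingWorker()
    worker.thread.start()
    worker.entered.wait()
    start = time.monotonic()
    worker.stop_and_join()
    return time.monotonic() - start


if __name__ == "__main__":
    print("join took %.3f sec" % measure_polling_join())
```

In this sketch the flag change is only noticed once the current sleep ends, so the join time is bounded below by whatever remains of the sleep interval.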

    vstinner added the 3.8 (only security fixes) and stdlib (Python modules in the Lib dir) labels on Dec 13, 2018
    @vstinner
    Member Author

    The attached PR 11136 modifies the _worker_handler() loop to wait on threading.Event objects, so that Pool.join() completes as soon as possible.

    Example:
    ---

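    import multiprocessing
    import time


    def the_test():
        # Time the whole cycle: create a pool, run one task, close, join.
        start_time = time.monotonic()
        pool = multiprocessing.Pool(1)
        res = pool.apply_async(int, ("1",))
        pool.close()
        #pool.terminate()
        pool.join()
        dt = time.monotonic() - start_time
        print("%.3f sec" % dt)


    # The guard is needed when worker processes are started with "spawn".
    if __name__ == "__main__":
        the_test()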

    Minimum timing, depending on how _handle_workers() waits:

    • current code (time.sleep(0.1)): min 0.132 sec
    • time.sleep(1.0): min 1.033 sec
    • my PR using events (wait(0.1)): min 0.033 sec

    Currently, the minimum join() timing depends on the sleep() duration in _handle_workers() (100 ms).

    With my PR, it completes as soon as possible: when the state changes and/or when a result is set.

    My PR still requires a hardcoded delay of 100 ms to work around the bpo-35478 bug: results are never set if the pool is terminated.
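The general idea of waiting on an event with a timeout, rather than sleeping, can be sketched as follows. This is only an illustration under assumptions: `EventWorker` and `measure_event_join` are hypothetical names, and the code is not taken from PR 11136. The timeout plays the role of the hardcoded fallback delay mentioned above.

```python
import threading
import time


class EventWorker:
    """Hypothetical handler thread that waits on an Event instead of sleeping."""

    def __init__(self, timeout=0.1):
        self.timeout = timeout  # fallback wake-up period, in seconds
        self.stop_event = threading.Event()
        self.thread = threading.Thread(target=self._loop)

    def _loop(self):
        while not self.stop_event.is_set():
            # Periodic maintenance work would go here.
            # wait() returns as soon as the event is set, or after the timeout.
            self.stop_event.wait(self.timeout)

    def stop_and_join(self):
        self.stop_event.set()  # wakes the thread immediately
        self.thread.join()


def measure_event_join(timeout=0.1):
    """Return how long join() takes once the worker is waiting on the event."""
    worker = EventWorker(timeout)
    worker.thread.start()
    time.sleep(0.01)  # give the thread time to reach wait()
    start = time.monotonic()
    worker.stop_and_join()
    return time.monotonic() - start


if __name__ == "__main__":
    print("join took %.3f sec" % measure_event_join())
```

Because `Event.wait()` returns as soon as the event is set, the join time in this sketch no longer depends on the timeout value. The timeout only bounds how long the loop waits when nothing signals it.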

    @vstinner
    Member Author

    My PR 11136 doesn't work: _maintain_pool() should be called frequently to check whether a worker has completed. Polling worker exit status seems inefficient :-(

    asyncio uses SIGCHLD signal to be notified when a child process completes. SafeChildWatcher calls os.waitpid(pid, os.WNOHANG) on each child process, whereas FastChildWatcher() uses os.waitpid(-1, os.WNOHANG).
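The two `os.waitpid()` calling patterns mentioned above can be sketched like this. It is a Unix-only illustration that polls in a loop instead of installing a SIGCHLD handler; `spawn_child`, `decode_status`, `reap_each`, `reap_any` and `poll_until` are hypothetical helpers, not asyncio or multiprocessing APIs.

```python
import os
import time


def spawn_child(exit_code):
    """Fork a child that exits immediately with exit_code (Unix only)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit without running cleanup handlers
    return pid


def decode_status(status):
    """Return the exit code, or the negative signal number if killed."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def reap_each(pids):
    """Poll every known PID separately with waitpid(pid, WNOHANG)."""
    finished = {}
    for pid in pids:
        try:
            done_pid, status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            continue  # this child was already reaped
        if done_pid != 0:  # 0 means the child is still running
            finished[done_pid] = decode_status(status)
    return finished


def reap_any():
    """Reap whichever children have exited with waitpid(-1, WNOHANG)."""
    finished = {}
    while True:
        try:
            done_pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no child processes left
        if done_pid == 0:
            break  # children exist, but none has exited yet
        finished[done_pid] = decode_status(status)
    return finished


def poll_until(reaper, expected_count, timeout=5.0):
    """Call reaper() repeatedly until expected_count children are reaped."""
    finished = {}
    deadline = time.monotonic() + timeout
    while len(finished) < expected_count and time.monotonic() < deadline:
        finished.update(reaper())
        time.sleep(0.01)
    return finished
```

The per-PID form only touches children the caller knows about, while the `-1` form reaps any child of the process, including ones started by unrelated code. That trade-off would matter if a pool shared a process with other subprocess users.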

    @vstinner
    Member Author

    > My PR 11136 doesn't work: _maintain_pool() should be called frequently to check whether a worker has completed. Polling worker exit status seems inefficient :-(

    I created bpo-35493: "multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exit".

    @vstinner
    Member Author

    _worker_handler has two issues:

    • It polls the worker status every 100 ms: I created bpo-35493 to investigate how to avoid that.
    • After close() or terminate() has been called, it loops until self._cache is empty. I would like to use result.wait(), but a result never completes after terminate(): I created bpo-35478 to see if tasks can be unblocked in that case (to ensure that result.wait() completes after terminate()).

    @vstinner
    Member Author

    Nobody managed to find a solution in 3 years. I'm closing the issue.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022