join method of multiprocessing Pool object hangs if iterable argument of pool.map is empty #56366

Closed
gkcn mannequin opened this issue May 23, 2011 · 9 comments
Labels: stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@gkcn
Mannequin

gkcn mannequin commented May 23, 2011

BPO 12157
Nosy @terryjreedy, @vstinner, @akheron
Files
  • multi.py: Code to reproduce the bug
  • issue-12157.patch: Remove the MapResult instance from the Pool cache when the iterable passed to map is empty.
  • issue-12157.patch: Don't Try to use any fancy way to check if the join will hang, leave all the job to faulthandler.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-06-07.22:18:20.103>
    created_at = <Date 2011-05-23.11:42:45.563>
    labels = ['type-bug', 'library']
    title = 'join method of multiprocessing Pool object hangs if iterable argument of pool.map is empty'
    updated_at = <Date 2012-06-07.22:18:20.102>
    user = 'https://bugs.python.org/gkcn'

    bugs.python.org fields:

    activity = <Date 2012-06-07.22:18:20.102>
    actor = 'sbt'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-06-07.22:18:20.103>
    closer = 'sbt'
    components = ['Library (Lib)']
    creation = <Date 2011-05-23.11:42:45.563>
    creator = 'gkcn'
    dependencies = []
    files = ['22077', '22466', '22475']
    hgrepos = []
    issue_num = 12157
    keywords = ['patch']
    message_count = 9.0
    messages = ['136613', '137154', '137157', '139064', '139078', '139101', '139125', '139128', '162493']
    nosy_count = 10.0
    nosy_names = ['terry.reedy', 'vstinner', 'jnoller', 'neologix', 'rosslagerwall', 'python-dev', 'sbt', 'gkcn', 'mouad', 'petri.lehtinen']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue12157'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

    @gkcn
    Mannequin Author

    gkcn mannequin commented May 23, 2011

    When I call the map method of a Pool object with an empty list and then call the close and join methods, join() hangs. I don't think this is intended.

    Code to reproduce the bug is attached.

    PS: A similar issue (using the map method with an empty list argument) was reported here [1], but it concerned the chunksize parameter and has been resolved.

    [1] http://bugs.python.org/issue6433
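
The attached multi.py is not reproduced in this thread. A minimal sketch of the scenario described above might look like the following. The names `square` and `map_then_join` and the `pool_factory` parameter are illustrative assumptions, not taken from the attachment. The parameter only exists so that the same steps can also be run against `ThreadPool`, which inherits `map`, `close` and `join` from `Pool`.

```python
import multiprocessing
from multiprocessing.pool import ThreadPool


def square(x):
    """Trivial worker function (hypothetical; any picklable function works)."""
    return x * x


def map_then_join(iterable, pool_factory=multiprocessing.Pool):
    """Run map() over `iterable`, then close and join the pool."""
    pool = pool_factory(2)
    results = pool.map(square, iterable)
    pool.close()
    # On the affected versions this call never returned when
    # `iterable` was empty; on fixed versions it returns promptly.
    pool.join()
    return results


if __name__ == "__main__":
    # The guard keeps child processes from re-running this code
    # on platforms that spawn a fresh interpreter (e.g. Windows).
    print(map_then_join([]))
```

Passing `ThreadPool` as `pool_factory` exercises the same code path without starting any child processes.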

    @gkcn gkcn mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 23, 2011
    @terryjreedy
    Member

    I ran this with 3.2 on WinXP, with "if __name__ == '__main__':" added after the def statement (without it, the script spawned 150 processes before I got logged out) and ()s added to the prints. It hung on pool.join, as the OP said. I could only stop it by closing the command window, as ^C was ignored. Any new test should have a timeout ;-).

    @neologix
    Mannequin

    neologix mannequin commented May 28, 2011

    When map is called, a MapResult object is created, which adds itself to the Pool's result cache.
    When the pool is shut down, the result handler thread waits until the cache drains (while cache and thread._state != TERMINATE). But since no result is posted to the result queue (since the iterable is empty), the result handler never receives any task, and never gets to drain the cache. It thus waits forever on the recv on the result queue.
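
The mechanism described above can be imitated with a small toy model. This is only a sketch under simplifying assumptions, not the pool's actual code: `handler_drains`, the dict-based cache and the `wait` timeout are all hypothetical. The timeout exists purely so the demonstration terminates; the handler described above blocks on the receive with no timeout.

```python
import queue
import threading


def handler_drains(num_tasks, wait=0.5):
    """Toy model of the result handler; True if the cache drains."""
    # job id -> number of results still outstanding (stand-in for MapResult)
    cache = {0: num_tasks}
    results = queue.Queue()
    for _ in range(num_tasks):
        results.put(0)  # workers would post one result per task

    drained = threading.Event()

    def handle_results():
        while cache:
            try:
                job = results.get(timeout=wait)
            except queue.Empty:
                # Nothing will ever arrive: in this model, that is
                # the point where a handler without a timeout hangs.
                return
            cache[job] -= 1
            if cache[job] == 0:
                del cache[job]
        drained.set()

    thread = threading.Thread(target=handle_results)
    thread.start()
    thread.join()
    return drained.is_set()
```

With `num_tasks=0` the cache entry exists but no result is ever posted, so the loop never gets a chance to remove it.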

    @mouad
    Mannequin

    mouad mannequin commented Jun 25, 2011

    Hello,

    This is my first patch to cpython, hope it will be accepted :)

    The fix I made is to remove the MapResult instance from the pool cache when the iterable is empty.

    In general, here is what happens: the "map" method creates a MapResult instance, which automatically adds itself to pool._cache. The task that is created afterwards and added to "pool._taskqueue" uses this MapResult instance to communicate its result. With an empty iterable, however, no task is created, so we end up with a MapResult that has no task, and when we try to join the pool it hangs waiting for a task to set the result in the MapResult instance.

    For the test I created a new helper, operation_timeout, used as a context manager to make sure that the test will not hang forever. I don't know if it's useful; maybe just running the test without checking for any timeout is more realistic.
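
The idea behind the fix can be sketched with a simplified stand-in class. This is not the CPython source: `ToyMapResult`, `task_done`, `cache_after_map` and the `remove_if_empty` flag are hypothetical names chosen to contrast the behavior with and without the removal step.

```python
class ToyMapResult:
    """Simplified stand-in for the pool's MapResult."""

    def __init__(self, cache, job, length, remove_if_empty):
        self._cache = cache
        self._job = job
        self._number_left = length
        self.ready = False
        cache[job] = self  # the result registers itself in the cache
        if length == 0 and remove_if_empty:
            # No task will ever report back, so finish immediately
            # and drop the entry instead of waiting for a result.
            self.ready = True
            del cache[self._job]

    def task_done(self):
        """Called once per finished task in this toy model."""
        self._number_left -= 1
        if self._number_left == 0:
            self.ready = True
            del self._cache[self._job]


def cache_after_map(length, remove_if_empty):
    """Return the cache contents once all `length` tasks have reported."""
    cache = {}
    result = ToyMapResult(cache, 0, length, remove_if_empty)
    for _ in range(length):
        result.task_done()
    return cache
```

Without the removal step, an empty map leaves one entry in the cache that nothing can ever clear, which is the condition that keeps join() waiting.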

    @mouad
    Mannequin

    mouad mannequin commented Jun 25, 2011

    The test case uses a helper function in test/support.py that I proposed in issue bpo-12410.

    I'm dropping this comment here because I don't have the rights to edit the issue dependencies.

    cheers;

    @mouad
    Mannequin

    mouad mannequin commented Jun 25, 2011

    Here is a new patch that, unlike the first one, doesn't try to check whether pool.join() will hang, following a discussion with neologix in issue bpo-12410.

    @terryjreedy
    Member

    The patch to the multiprocessing code is trivial:
    + del cache[self._job]

    The difference in tests is
    + with test.support.operation_timeout(5):
    +     p.join()
    versus
    + p.join()

    Victor, do you agree with the simpler method, depending on faulthandler to catch a hang in the test and fail it? Or is the explicit timeout better?

    @vstinner
    Member

    > Don't Try to use any fancy way to check if the join will hang,
    > leave all the job to faulthandler.

    > Victor, do you agree with the simpler method, depending
    > on faulthandler to catch a hang in the test and fail it?
    > Or is the explicit timeout better?

    If the patch fixes the hang, there is no good reason to write code to handle a new hang.

    We have generic "watchdogs":

    • buildbot timeout (any Python version)
    • regrtest timeout implemented using faulthandler (only in Python 3.x)

    If you run the .py test file directly from the command line, you can still use CTRL+c or CTRL+z to interrupt / stop the process.

    You may want to improve these generic watchdogs, but writing a specific watchdog for one specific test function looks useless to me.

    Remember that timeouts are not reliable: we sometimes get false failures because of very slow buildbots... For the regrtest timeout, I tried 10, 15, 20 and 30 minutes before choosing a timeout of 60 minutes. With lower values, we had many false failures.
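
For readers unfamiliar with faulthandler, a generic watchdog of the kind mentioned above can be sketched as follows. This is not regrtest's implementation; `run_with_watchdog` is a hypothetical helper, and it assumes Python 3.3 or later, where `faulthandler.dump_traceback_later` is available.

```python
import faulthandler
import sys


def run_with_watchdog(func, timeout):
    """Call func(); dump all tracebacks and exit if it takes too long."""
    # If func() is still running after `timeout` seconds, faulthandler
    # writes the traceback of every thread and terminates the process.
    faulthandler.dump_traceback_later(timeout, exit=True, file=sys.__stderr__)
    try:
        return func()
    finally:
        # func() finished (or raised) in time: disarm the watchdog.
        faulthandler.cancel_dump_traceback_later()
```

Because the watchdog wraps whatever callable it is given, one generous timeout can cover a whole test run rather than a single test function.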

    @python-dev
    Mannequin

    python-dev mannequin commented Jun 7, 2012

    New changeset 1b3d4ffcb4d1 by Richard Oudkerk in branch '3.2':
    Issue bpo-12157: pool.map() does not handle empty iterable correctly
    http://hg.python.org/cpython/rev/1b3d4ffcb4d1

    New changeset 3585cb1388f2 by Richard Oudkerk in branch 'default':
    Merge fixes for bpo-13854 and bpo-12157.
    http://hg.python.org/cpython/rev/3585cb1388f2

    New changeset 7ab7836894c4 by Richard Oudkerk in branch '2.7':
    Issue bpo-12157: pool.map() does not handle empty iterable correctly
    http://hg.python.org/cpython/rev/7ab7836894c4

    @sbt sbt mannequin closed this as completed Jun 7, 2012
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022