
multiprocessing deadlocks when sending large data through Queue with timeout #48039

Closed

DavidDecotigny mannequin opened this issue Sep 5, 2008 · 6 comments
Labels
stdlib Python modules in the Lib dir

Comments

DavidDecotigny mannequin commented Sep 5, 2008

BPO 3789
Files
  • c.py: Example showing the bug ("Happy" never displayed)
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2008-09-06.01:20:34.155>
    created_at = <Date 2008-09-05.22:35:25.855>
    labels = ['invalid', 'library']
    title = 'multiprocessing deadlocks when sending large data through Queue with timeout'
    updated_at = <Date 2008-09-06.01:55:37.976>
    user = 'https://bugs.python.org/DavidDecotigny'

    bugs.python.org fields:

    activity = <Date 2008-09-06.01:55:37.976>
    actor = 'jnoller'
    assignee = 'jnoller'
    closed = True
    closed_date = <Date 2008-09-06.01:20:34.155>
    closer = 'jnoller'
    components = ['Library (Lib)']
    creation = <Date 2008-09-05.22:35:25.855>
    creator = 'DavidDecotigny'
    dependencies = []
    files = ['11401']
    hgrepos = []
    issue_num = 3789
    keywords = []
    message_count = 6.0
    messages = ['72640', '72655', '72657', '72658', '72659', '72660']
    nosy_count = 2.0
    nosy_names = ['jnoller', 'DavidDecotigny']
    pr_nums = []
    priority = 'normal'
    resolution = 'not a bug'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue3789'
    versions = ['Python 2.6']

DavidDecotigny mannequin commented Sep 5, 2008

With the attached script, demo() called with, for example,
datasize=40*1024*1024 and timeout=1 will deadlock: the program never
terminates.

The bug appears on Linux (RHEL4) / Intel x86 with the "multiprocessing"
shipped with Python 2.6b3, and I think it can easily be reproduced on
other Unices. It also appears with Python 2.5 and the standalone
processing package 0.52
(https://developer.berlios.de/bugs/?func=detailbug&bug_id=14453&group_id=9001).

After a quick investigation, it seems to be a deadlock between waitpid
in the parent process and a pipe send in the "_feed" thread of the
child process. The problem seems to be that "_feed" is still sending
data (the data is large) to the pipe while the parent process has
already called waitpid (because of the "short" timeout): the pipe fills
up because no consumer is reading the data (the consumer is already in
waitpid), so the "_feed" thread in the child blocks forever. Since the
child process does a _feed.join() before exiting (after function f), it
never exits, and hence the waitpid in the parent process never returns,
because the child never exits.

This no longer happens if I use timeout=None or a larger timeout
(e.g. 10 seconds), because in both cases waitpid is called after the
"_feed" thread in the child process has sent all of its data through
the pipe.
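
The attached c.py is not reproduced in the migration; a minimal sketch of the scenario described above (Python 2 syntax to match the 2.6-era report; the demo/f names, sizes, and the "Happy" message follow the description, not the actual attachment) could look like:

import multiprocessing
from Queue import Empty  # "queue" in Python 3

def f(datasize, q):
    q.put(range(datasize))  # one very large item keeps the feeder thread busy

def demo(datasize=40*1024*1024, timeout=1):
    q = multiprocessing.Queue()
    worker = multiprocessing.Process(target=f, args=(datasize, q))
    worker.start()
    try:
        q.get(timeout=timeout)  # a short timeout fires before the item arrives
    except Empty:
        pass
    worker.join()  # parent waits on child; child waits on its blocked feeder
    print "Happy"  # never reached when the timeout is short

if __name__ == '__main__':
    demo()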

DavidDecotigny mannequin added the stdlib label Sep 5, 2008
DavidDecotigny mannequin commented Sep 6, 2008

A quick fix in user code, when we are sure we don't need the child
process after a timeout, is to call worker.terminate() in an except
Empty clause, as sketched below.
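
A sketch of that workaround, reusing the hypothetical demo() structure above:

try:
    q.get(timeout=timeout)
except Empty:
    worker.terminate()  # kill the child; its buffered queue data is lost
worker.join()  # returns promptly: the dead child no longer waits on its feeder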

jnoller mannequin commented Sep 6, 2008

See http://docs.python.org/dev/library/multiprocessing.html#multiprocessing-programming

Specifically, "Joining processes that use queues":

Bear in mind that a process that has put items in a queue will wait
before terminating until all the buffered items are fed by the "feeder"
thread to the underlying pipe. (The child process can call the
Queue.cancel_join_thread() method of the queue to avoid this behaviour.)

This means that whenever you use a queue you need to make sure that all
items which have been put on the queue will eventually be removed before
the process is joined. Otherwise you cannot be sure that processes which
have put items on the queue will terminate. Remember also that non-
daemonic processes will automatically be joined.
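
Applied to this report, that means draining the queue before joining; for example (a sketch, not from the original thread):

data = q.get()  # consume the item first, letting the feeder flush the pipe
worker.join()   # now safe: nothing remains buffered in the feeder thread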

jnoller mannequin commented Sep 6, 2008

In a later release, I'd like to massage this in such a way that you do
not have to wait for a child queue to be drained prior to calling join.

One way to work around this, David, is to call Queue.cancel_join_thread():

def f(datasize, q):
    q.cancel_join_thread()  # don't block child exit on the feeder thread
    q.put(range(datasize))
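
(A note on the trade-off: per the multiprocessing docs, cancel_join_thread() makes the child exit without waiting for the feeder thread, so any items still buffered at exit may be lost; in exchange, the parent's join() returns even after a timed-out get().)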

jnoller mannequin closed this as completed Sep 6, 2008
jnoller mannequin added the invalid label Sep 6, 2008
DavidDecotigny mannequin commented Sep 6, 2008

Thank you Jesse. When I read this passage, I naively thought that a
timeout raised in a get() would not be harmful: that somehow the whole
get() request would be aborted. But now I realize that this would make
things rather complicated and dangerous: the data would get dropped and
would never be recovered by a subsequent get().
So thank you for the hint; let's leave things as they are, it's better.

jnoller mannequin commented Sep 6, 2008

No problem David, you're the 4th person to ask me about this in the
past 2 months :)

ezio-melotti transferred this issue from another repository Apr 10, 2022