Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiprocessing.Queue.join_thread() does nothing if created and use in the same process #75069

Closed
vstinner opened this issue Jul 9, 2017 · 18 comments
Labels
3.7 (EOL) end of life performance Performance or resource usage tests Tests in the Lib/test dir

Comments

@vstinner
Copy link
Member

vstinner commented Jul 9, 2017

BPO 30886
Nosy @pitrou, @vstinner, @applio
PRs
  • bpo-30886: Fix multiprocessing.Queue.join_thread() #2642
  • [3.6] bpo-30886: Fix multiprocessing.Queue.join_thread() (#2642) #2643
  • [3.5] bpo-30886: Fix multiprocessing.Queue.join_thread() (#2642) #2644
  • Files
  • test_handle_called_with_mp_queue-bug.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-07-10.11:49:09.053>
    created_at = <Date 2017-07-09.23:48:51.912>
    labels = ['3.7', 'tests', 'performance']
    title = 'multiprocessing.Queue.join_thread() does nothing if created and use in the same process'
    updated_at = <Date 2017-07-10.21:40:03.742>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2017-07-10.21:40:03.742>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-07-10.11:49:09.053>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2017-07-09.23:48:51.912>
    creator = 'vstinner'
    dependencies = []
    files = ['46999']
    hgrepos = []
    issue_num = 30886
    keywords = ['patch']
    message_count = 18.0
    messages = ['298008', '298009', '298010', '298011', '298014', '298037', '298038', '298040', '298041', '298042', '298043', '298051', '298052', '298054', '298055', '298056', '298057', '298089']
    nosy_count = 3.0
    nosy_names = ['pitrou', 'vstinner', 'davin']
    pr_nums = ['2642', '2643', '2644']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue30886'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    vstinner commented Jul 9, 2017

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.x%20Shared%203.x/builds/557/steps/test/logs/stdio

    test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ... Warning -- threading_cleanup() failed to cleanup -1 threads after 3 sec (count: 0, dangling: 1)
    ok

    @vstinner vstinner added 3.7 (EOL) end of life tests Tests in the Lib/test dir performance Performance or resource usage labels Jul 9, 2017
    @vstinner
    Copy link
    Member Author

    vstinner commented Jul 9, 2017

    The load average was 3.15:

    0:04:33 load avg: 3.15 [176/406] test_logging failed (env changed)

    --

    Another fail on AMD64 FreeBSD CURRENT Non-Debug 3.x:

    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/568/steps/test/logs/stdio

    0:01:56 load avg: 3.45 [ 44/406] test_logging failed (env changed)
    ...
    test_output (test.test_logging.UnixSocketHandlerTest) ... ok
    test_output (test.test_logging.UnixDatagramHandlerTest) ... ok
    test_output (test.test_logging.UnixSysLogHandlerTest) ... ok
    test__all__ (test.test_logging.MiscTestCase) ... ok
    test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ... Warning -- threading_cleanup() failed to cleanup -1 threads after 4 sec (count: 0, dangling: 1)
    ok
    test_handle_called_with_queue_queue (test.test_logging.QueueListenerTest) ... ok
    test_no_messages_in_queue_after_stop (test.test_logging.QueueListenerTest) ... ok

    @vstinner
    Copy link
    Member Author

    Previous issue which fixed QueueListenerTest of test_logging is bpo-30131:

    commit 8ca2f2f
    Author: Victor Stinner <victor.stinner@gmail.com>
    Date: Wed Apr 26 15:56:25 2017 +0200

    bpo-30131: test_logging now joins queue threads (bpo-1298)
    
    QueueListenerTest of test_logging now closes the multiprocessing
    Queue and joins its thread to prevent leaking dangling threads to
    following tests.
    
    Add also @support.reap_threads to detect earlier if a test leaks
    threads (and try to "cleanup" these threads).
    

    @vstinner
    Copy link
    Member Author

    While trying to reproduce the bug, I got:

    test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ... /usr/home/haypo/cpython/Lib/test/support/init.py:1515: ResourceWarning: unclosed <socket.socket fd=6, family=AddressFamily.AF_INET, type=536870913, proto=0, laddr=('127.0.0.1', 8166), raddr=('127.0.0.1', 8167)>
    gc.collect()
    ok

    @vstinner
    Copy link
    Member Author

    The problem is that multiprocessing.Queue.join_thread() does nothing since the thread wasn't started by a subprocess.

    See also bpo-30171: Emit ResourceWarning in multiprocessing Queue destructor.

    @vstinner
    Copy link
    Member Author

    The warning is a race condition which can be reproduced easily on Linux using attached test_handle_called_with_mp_queue-bug.patch, run:

    haypo@selma$ ./python -m test --fail-env-changed -m test_handle_called_with_mp_queue test_logging
    Run tests sequentially
    0:00:00 load avg: 0.22 [1/1] test_logging
    Warning -- threading_cleanup() failed to cleanup 20 threads after 0 sec (count: 20, dangling: 21)
    Warning -- threading._dangling was modified by test_logging
    Before: <_weakrefset.WeakSet object at 0x7fe1df5302c8>
    After: <_weakrefset.WeakSet object at 0x7fe1df5338e0>
    test_logging failed (env changed)

    1 test altered the execution environment:
    test_logging

    Total duration: 718 ms
    Tests result: ENV CHANGED

    @vstinner
    Copy link
    Member Author

    #2642 fixes the warning. I tested the change with test_handle_called_with_mp_queue-bug.patch: no more warning.

    Sorry, I don't know multiprocessing to understand the purpose of the removed test.

    I would like to really make sure that a Queue object doesn't "leak" a thread when I close .close() + .join_thread(). It's surprising that .join_thread() doesn't join anything and leave a thread running in the background. Even if in the common case, when the system load is low, the thread quits quickly thanks to .close().

    @vstinner vstinner changed the title test_handle_called_with_mp_queue() of test_logging leaks a thread on AMD64 FreeBSD 10.x Shared 3.x multiprocessing.Queue.join_thread() does nothing if created and use in the same process Jul 10, 2017
    @vstinner
    Copy link
    Member Author

    Hum, interesting, created_by_this_process was already removed from Python 2.7 in bpo-4106:

    commit 77657e4
    Author: Antoine Pitrou <solipsis@pitrou.net>
    Date: Wed Aug 24 22:41:05 2011 +0200

    Issue bpo-4106: Fix occasional exceptions printed out by multiprocessing on interpreter shutdown.
    
    This bug doesn't seem to exist on 3.2, where daemon threads are killed
    before Py_Finalize() is entered.
    

    @pitrou
    Copy link
    Member

    pitrou commented Jul 10, 2017

    I would like to really make sure that a Queue object doesn't "leak" a thread when I close .close() + .join_thread().

    I don't understand how this happens. The Finalize object only acts as an atexit handler. When called as a regular finalize, self._thread is dead and therefore _finalize_join() doesn't do anything.

    @pitrou
    Copy link
    Member

    pitrou commented Jul 10, 2017

    Oh, that's because you're calling join_thread() explicitly. I see. I agree that the fix looks desirable then.

    @vstinner
    Copy link
    Member Author

    I don't understand how this happens.

    If you run "./python -m test --fail-env-changed -m test_handle_called_with_mp_queue test_logging" with attached test_handle_called_with_mp_queue-bug.patch, no finalizer is registered: .join_thread() does nothing, because created_by_this_process is true.

    @vstinner
    Copy link
    Member Author

    Oh, that's because you're calling join_thread() explicitly. I see. I agree that the fix looks desirable then.

    FYI I added join_thread() in my first attempt to fix "Warning -- threading._dangling was modified by test_logging": bpo-30131, commit 8ca2f2f. I expected that join_thread() would... join the thread :-)

    @vstinner
    Copy link
    Member Author

    I suggest to backport the fix up to Python 3.5.

    @vstinner
    Copy link
    Member Author

    New changeset 3b69d91 by Victor Stinner in branch 'master':
    bpo-30886: Fix multiprocessing.Queue.join_thread() (bpo-2642)
    3b69d91

    @vstinner
    Copy link
    Member Author

    New changeset 69e4180 by Victor Stinner in branch '3.5':
    bpo-30886: Fix multiprocessing.Queue.join_thread() (bpo-2642) (bpo-2644)
    69e4180

    @vstinner
    Copy link
    Member Author

    New changeset 7f3d65d by Victor Stinner in branch '3.6':
    bpo-30886: Fix multiprocessing.Queue.join_thread() (bpo-2642) (bpo-2643)
    7f3d65d

    @vstinner
    Copy link
    Member Author

    Ok, I applied my fix to 3.5, 3.6 and master branches.

    Thanks for the review Antoine.

    @vstinner
    Copy link
    Member Author

    I'm not sure that the bug is fully fixed, I still saw a warning on:
    http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.x%20Shared%203.x/builds/561/

    This build tested the commit aa8d0a2 which is more recent than commit 3b69d91.

    test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ...

    Warning -- threading_cleanup() failed to cleanup -1 threads after 5 sec (count: 0, dangling: 1)

    ok

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life performance Performance or resource usage tests Tests in the Lib/test dir
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants