classification
Title: multiprocessing.Queue.join_thread() does nothing if created and used in the same process
Type: resource usage
Stage: resolved
Components: Tests
Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: davin, pitrou, vstinner
Priority: normal
Keywords: patch

Created on 2017-07-09 23:48 by vstinner, last changed 2017-07-10 21:40 by vstinner. This issue is now closed.

Files
File name Uploaded Description
test_handle_called_with_mp_queue-bug.patch vstinner, 2017-07-10 09:17
Pull Requests
URL Status Linked
PR 2642 merged vstinner, 2017-07-10 09:24
PR 2643 merged vstinner, 2017-07-10 10:46
PR 2644 merged vstinner, 2017-07-10 10:47
Messages (18)
msg298008 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-09 23:48
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.x%20Shared%203.x/builds/557/steps/test/logs/stdio

test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ... Warning -- threading_cleanup() failed to cleanup -1 threads after 3 sec (count: 0, dangling: 1)
ok
msg298009 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-09 23:55
The load average was 3.15:

0:04:33 load avg: 3.15 [176/406] test_logging failed (env changed)

--

Another failure on AMD64 FreeBSD CURRENT Non-Debug 3.x:

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%20CURRENT%20Non-Debug%203.x/builds/568/steps/test/logs/stdio

0:01:56 load avg: 3.45 [ 44/406] test_logging failed (env changed)
...
test_output (test.test_logging.UnixSocketHandlerTest) ... ok
test_output (test.test_logging.UnixDatagramHandlerTest) ... ok
test_output (test.test_logging.UnixSysLogHandlerTest) ... ok
test__all__ (test.test_logging.MiscTestCase) ... ok
test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ... Warning -- threading_cleanup() failed to cleanup -1 threads after 4 sec (count: 0, dangling: 1)
ok
test_handle_called_with_queue_queue (test.test_logging.QueueListenerTest) ... ok
test_no_messages_in_queue_after_stop (test.test_logging.QueueListenerTest) ... ok
msg298010 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 00:01
Previous issue which fixed QueueListenerTest of test_logging is bpo-30131:

commit 8ca2f2faefa8dba323a2e4c4b86efb633d7a53cf
Author: Victor Stinner <victor.stinner@gmail.com>
Date:   Wed Apr 26 15:56:25 2017 +0200

    bpo-30131: test_logging now joins queue threads (#1298)
    
    QueueListenerTest of test_logging now closes the multiprocessing
    Queue and joins its thread to prevent leaking dangling threads to
    following tests.
    
    Add also @support.reap_threads to detect earlier if a test leaks
    threads (and try to "cleanup" these threads).
msg298011 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 00:16
While trying to reproduce the bug, I got:

test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ... /usr/home/haypo/cpython/Lib/test/support/__init__.py:1515: ResourceWarning: unclosed <socket.socket fd=6, family=AddressFamily.AF_INET, type=536870913, proto=0, laddr=('127.0.0.1', 8166), raddr=('127.0.0.1', 8167)>
  gc.collect()
ok
msg298014 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 00:30
The problem is that multiprocessing.Queue.join_thread() does nothing when the queue was created by the current process: the feeder thread is only joined if the queue was created in another process.

See also bpo-30171: Emit ResourceWarning in multiprocessing Queue destructor.
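
A minimal sketch of the scenario described above: a Queue that is created, used, closed and joined in a single process. It assumes that the feeder thread is named "QueueFeederThread", which is a CPython implementation detail rather than documented API, and the helper names are hypothetical. On a Python version that includes the fix for this issue, no feeder thread should remain after close() + join_thread(); on an affected version the thread may still be alive at that point.

```python
import multiprocessing
import threading


def live_feeder_threads():
    """Return the queue feeder threads that are still alive in this process."""
    # Assumption: CPython names the feeder thread "QueueFeederThread".
    return [t for t in threading.enumerate() if t.name == "QueueFeederThread"]


def use_queue_in_same_process():
    """Create and use a Queue in one process, then close and join it."""
    queue = multiprocessing.Queue()
    queue.put("record")            # the first put() starts the feeder thread
    item = queue.get(timeout=10)   # the item arrives through the feeder thread
    queue.close()                  # ask the feeder thread to exit
    queue.join_thread()            # expected to wait until it has exited
    return item, live_feeder_threads()


if __name__ == "__main__":
    item, leftover = use_queue_in_same_process()
    print("got:", item)
    print("feeder threads still alive:", leftover)
```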
msg298037 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 09:17
The warning is caused by a race condition that can be reproduced easily on Linux using the attached test_handle_called_with_mp_queue-bug.patch. Run:

haypo@selma$ ./python -m test --fail-env-changed  -m test_handle_called_with_mp_queue  test_logging
Run tests sequentially
0:00:00 load avg: 0.22 [1/1] test_logging
Warning -- threading_cleanup() failed to cleanup 20 threads after 0 sec (count: 20, dangling: 21)
Warning -- threading._dangling was modified by test_logging
  Before: <_weakrefset.WeakSet object at 0x7fe1df5302c8>
  After:  <_weakrefset.WeakSet object at 0x7fe1df5338e0> 
test_logging failed (env changed)

1 test altered the execution environment:
    test_logging

Total duration: 718 ms
Tests result: ENV CHANGED
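
The kind of check that produces this warning can be sketched as follows. This is a simplified, hypothetical stand-in and not the actual test.support implementation: it only compares threading.active_count() before and after a callable and waits up to a timeout for stray threads to exit. The names wait_for_thread_cleanup and run_and_check are assumptions made for this example.

```python
import threading
import time


def wait_for_thread_cleanup(count_before, timeout=1.0, poll=0.01):
    """Wait until the number of live threads drops back to count_before.

    Return the number of extra threads still alive when the wait ends
    (0 means that nothing leaked).
    """
    deadline = time.monotonic() + timeout
    while True:
        extra = threading.active_count() - count_before
        if extra <= 0 or time.monotonic() >= deadline:
            return max(extra, 0)
        time.sleep(poll)


def run_and_check(func, timeout=1.0):
    """Run func() and report how many threads it left running."""
    before = threading.active_count()
    func()
    return wait_for_thread_cleanup(before, timeout)


if __name__ == "__main__":
    stop = threading.Event()
    leaked = []

    def joins_its_thread():
        thread = threading.Thread(target=lambda: None)
        thread.start()
        thread.join()

    def leaks_a_thread():
        thread = threading.Thread(target=stop.wait, daemon=True)
        thread.start()
        leaked.append(thread)   # kept only so the demo can clean up

    print("joined thread, leftover:", run_and_check(joins_its_thread))
    print("leaked thread, leftover:", run_and_check(leaks_a_thread, timeout=0.2))

    stop.set()
    leaked[0].join()
```

A check like this is sensitive to timing: a thread that exits a few milliseconds after the test returns is only reported if the timeout expires first, which is consistent with the warning showing up mainly on loaded buildbots.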
msg298038 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 09:27
https://github.com/python/cpython/pull/2642 fixes the warning. I tested the change with test_handle_called_with_mp_queue-bug.patch: no more warning.

Sorry, I don't know multiprocessing well enough to understand the purpose of the removed test.

I would really like to make sure that a Queue object doesn't "leak" a thread when I call .close() + .join_thread(). It's surprising that .join_thread() doesn't join anything and leaves a thread running in the background, even if in the common case, when the system load is low, the thread quits quickly thanks to .close().
msg298040 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 09:35
Hum, interesting, created_by_this_process was already removed from Python 2.7 in bpo-4106:

commit 77657e40fa5f43fe6f7ffb6e32da4613dba657e1
Author: Antoine Pitrou <solipsis@pitrou.net>
Date:   Wed Aug 24 22:41:05 2011 +0200

    Issue #4106: Fix occasional exceptions printed out by multiprocessing on interpreter shutdown.
    
    This bug doesn't seem to exist on 3.2, where daemon threads are killed
    before Py_Finalize() is entered.
msg298041 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-07-10 09:39
> I would really like to make sure that a Queue object doesn't "leak" a thread when I call .close() + .join_thread().

I don't understand how this happens.  The Finalize object only acts as an atexit handler.  When called as a regular finalizer, `self._thread` is dead and therefore `_finalize_join()` doesn't do anything.
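
For readers unfamiliar with Finalize, here is a small sketch of how such an object behaves when it is called explicitly. It uses multiprocessing.util.Finalize, an internal helper whose details may differ between Python versions. The join_if_alive callback is a hypothetical function loosely modeled on the description of `_finalize_join()` above; it is not the CPython source.

```python
import threading
import weakref
from multiprocessing.util import Finalize


def join_if_alive(thread_ref, log):
    """Join the referenced thread if the weak reference is still valid."""
    thread = thread_ref()
    if thread is not None:
        thread.join()
        log.append("joined")


def demo():
    log = []
    release = threading.Event()
    worker = threading.Thread(target=release.wait)
    worker.start()

    # Register the callback. Because an exitpriority is given, it would also
    # run at interpreter exit if nobody called it before.
    finalizer = Finalize(worker, join_if_alive,
                         args=[weakref.ref(worker), log], exitpriority=-5)

    release.set()   # let the worker finish
    finalizer()     # explicit call: runs the callback, which joins the worker
    finalizer()     # second call does nothing: the callback was already used
    return log, worker.is_alive()


if __name__ == "__main__":
    print(demo())
```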
msg298042 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2017-07-10 09:40
Oh, that's because you're calling join_thread() explicitly.  I see.  I agree that the fix looks desirable then.
msg298043 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 09:40
> I don't understand how this happens.

If you run "./python -m test --fail-env-changed  -m test_handle_called_with_mp_queue  test_logging" with the attached test_handle_called_with_mp_queue-bug.patch, no finalizer is registered: .join_thread() does nothing, because created_by_this_process is true.
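
The effect can be illustrated with a toy model. This is not the multiprocessing source: ToyQueue, skip_join_if_local and exit_delay are hypothetical names, and the model is based only on the description above, namely that no join hook is registered when created_by_this_process is true. The feeder thread sleeps briefly before exiting to imitate a loaded machine, so the difference between the two variants is visible.

```python
import threading
import time


class ToyQueue:
    """Toy stand-in for a queue with a feeder thread (not multiprocessing.Queue)."""

    def __init__(self, skip_join_if_local, exit_delay=0.5):
        created_by_this_process = True  # this toy queue is always local
        self._stop = threading.Event()
        self._exit_delay = exit_delay
        self._thread = threading.Thread(target=self._feed, daemon=True)
        self._thread.start()
        # Model of the reported behavior: no join hook for a local queue.
        if skip_join_if_local and created_by_this_process:
            self._jointhread = None
        else:
            self._jointhread = self._thread.join

    def _feed(self):
        self._stop.wait()              # run until close() is called
        time.sleep(self._exit_delay)   # simulate a slow exit under load

    def close(self):
        self._stop.set()               # ask the feeder thread to exit

    def join_thread(self):
        if self._jointhread is not None:
            self._jointhread()         # otherwise silently do nothing

    def feeder_alive(self):
        return self._thread.is_alive()


def compare_behaviors():
    """Return whether the feeder is still alive after close() + join_thread()."""
    results = {}
    for label, skip in (("flag honored", True), ("flag ignored", False)):
        queue = ToyQueue(skip_join_if_local=skip)
        queue.close()
        queue.join_thread()
        results[label] = queue.feeder_alive()
        queue._thread.join()           # clean up before the next round
    return results


if __name__ == "__main__":
    print(compare_behaviors())
```

With the flag honored, join_thread() returns immediately and the feeder thread is normally still running; with the flag ignored, join_thread() blocks until the thread has exited.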
msg298051 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 10:21
> Oh, that's because you're calling join_thread() explicitly.  I see.  I agree that the fix looks desirable then.

FYI I added join_thread() in my first attempt to fix "Warning -- threading._dangling was modified by test_logging": bpo-30131, commit 8ca2f2faefa8dba323a2e4c4b86efb633d7a53cf. I expected that join_thread() would... join the thread :-)
msg298052 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 10:22
I suggest backporting the fix as far back as Python 3.5.
msg298054 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 10:45
New changeset 3b69d911c57ef591ac0c0f47a66dbcad8337f33a by Victor Stinner in branch 'master':
bpo-30886: Fix multiprocessing.Queue.join_thread() (#2642)
https://github.com/python/cpython/commit/3b69d911c57ef591ac0c0f47a66dbcad8337f33a
msg298055 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 11:43
New changeset 69e41807f0851ff1107f949dcdc94dbb0af32acd by Victor Stinner in branch '3.5':
bpo-30886: Fix multiprocessing.Queue.join_thread() (#2642) (#2644)
https://github.com/python/cpython/commit/69e41807f0851ff1107f949dcdc94dbb0af32acd
msg298056 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 11:43
New changeset 7f3d65d6e4f8bebaaf996efb1c1adb67eb1724cb by Victor Stinner in branch '3.6':
bpo-30886: Fix multiprocessing.Queue.join_thread() (#2642) (#2643)
https://github.com/python/cpython/commit/7f3d65d6e4f8bebaaf996efb1c1adb67eb1724cb
msg298057 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 11:49
Ok, I applied my fix to 3.5, 3.6 and master branches.

Thanks for the review Antoine.
msg298089 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-07-10 21:40
I'm not sure that the bug is fully fixed. I still saw a warning on:
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.x%20Shared%203.x/builds/561/

This build tested commit aa8d0a24694bea05061f1920ec3f944a9e6799d5, which is more recent than commit 3b69d911c57ef591ac0c0f47a66dbcad8337f33a.

test_handle_called_with_mp_queue (test.test_logging.QueueListenerTest) ...

Warning -- threading_cleanup() failed to cleanup -1 threads after 5 sec (count: 0, dangling: 1)

ok
History
Date User Action Args
2017-07-10 21:40:03 vstinner set messages: + msg298089
2017-07-10 11:49:09 vstinner set status: open -> closed; resolution: fixed; messages: + msg298057; stage: resolved
2017-07-10 11:43:22 vstinner set messages: + msg298056
2017-07-10 11:43:19 vstinner set messages: + msg298055
2017-07-10 10:47:07 vstinner set pull_requests: + pull_request2709
2017-07-10 10:46:38 vstinner set pull_requests: + pull_request2708
2017-07-10 10:45:25 vstinner set messages: + msg298054
2017-07-10 10:22:06 vstinner set messages: + msg298052; versions: + Python 3.5, Python 3.6
2017-07-10 10:21:47 vstinner set messages: + msg298051
2017-07-10 09:40:30 vstinner set messages: + msg298043
2017-07-10 09:40:20 pitrou set messages: + msg298042
2017-07-10 09:39:14 pitrou set messages: + msg298041
2017-07-10 09:35:17 vstinner set messages: + msg298040
2017-07-10 09:27:39 vstinner set nosy: + pitrou, davin; title: test_handle_called_with_mp_queue() of test_logging leaks a thread on AMD64 FreeBSD 10.x Shared 3.x -> multiprocessing.Queue.join_thread() does nothing if created and use in the same process
2017-07-10 09:27:06 vstinner set messages: + msg298038
2017-07-10 09:24:34 vstinner set pull_requests: + pull_request2707
2017-07-10 09:17:50 vstinner set files: + test_handle_called_with_mp_queue-bug.patch; keywords: + patch; messages: + msg298037
2017-07-10 00:30:35 vstinner set messages: + msg298014
2017-07-10 00:16:33 vstinner set messages: + msg298011
2017-07-10 00:01:09 vstinner set messages: + msg298010
2017-07-09 23:55:16 vstinner set messages: + msg298009
2017-07-09 23:48:51 vstinner create