classification
Title: test_multiprocessing_spawn leaked [1, 2, 1] memory blocks on AMD64 Windows8.1 Refleaks 3.7
Type: resource usage Stage: resolved
Components: Tests, Windows Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: miss-islington, pablogsal, paul.moore, steve.dower, tim.golden, vstinner, xtreak, zach.ware
Priority: normal Keywords: patch

Created on 2018-06-01 14:58 by vstinner, last changed 2018-07-10 23:47 by vstinner.

Pull Requests
URL Status Linked Edit
PR 8059 merged vstinner, 2018-07-03 11:06
PR 8060 merged miss-islington, 2018-07-03 11:21
PR 8061 merged vstinner, 2018-07-03 11:23
PR 8062 merged miss-islington, 2018-07-03 11:23
Messages (19)
msg318425 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-01 14:58
http://buildbot.python.org/all/#/builders/132/builds/154

test_multiprocessing_spawn leaked [1, 2, 1] memory blocks, sum=4
msg318427 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-01 15:39
When running "python -m test -R 2:3 test_multiprocessing_forkserver" on Windows, I saw some warnings about dangling threads. It may explain this issue.
msg319473 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-13 15:30
bpo-33853 has been marked as a duplicate this bug.
msg320057 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-20 10:27
> http://buildbot.python.org/all/#/builders/132/builds/154

That's the AMD64 Windows8.1 Refleaks 3.7 buildbot.

> bpo-33853 has been marked as a duplicate this bug.

Copy of Pablo's message:

The test `test_multiprocessing_spawn` is leaking memory according to the x86 Gentoo Refleaks 3.x buildbot:


x86 Gentoo Refleaks 3.x
http://buildbot.python.org/all/#/builders/1/builds/253

test_multiprocessing_spawn leaked [1, 2, 1] memory blocks, sum=4
1 test failed again:
    test_multiprocessing_spawn


x86 Gentoo Refleaks 3.7
http://buildbot.python.org/all/#/builders/114/builds/135
msg320162 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-21 11:08
I just created bpo-33929: "test_multiprocessing_spawn: WithProcessesTestProcess.test_many_processes() leaks 5 handles on Windows". See also my PR 7827.
msg320596 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-06-27 15:40
I also created bpo-33966: "test_multiprocessing_spawn.WithProcessesTestPool.test_traceback() leaks 4 handles on Windows".
msg320946 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 09:10
I reproduced the issue on the Gentoo Refleak buildbot and I succeeded to bisect up to a single test: test_imap_unordered().

pydev@stormageddon ~/cpython $ ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 3:3
Run tests sequentially
0:00:00 load avg: 1.45 [1/1] test_multiprocessing_spawn
beginning 6 repetitions
123456
......
test_multiprocessing_spawn leaked [3, 2, 1] memory blocks, sum=6
test_multiprocessing_spawn failed

== Tests result: FAILURE ==

1 test failed:
    test_multiprocessing_spawn

Total duration: 6 sec 548 ms
Tests result: FAILURE
msg320948 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 09:19
It doesn't look like a real leak, but more a cache which takes multiple iterations to be fully filled.

pydev@stormageddon ~/cpython $ ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 1:30
WARNING: Running tests with --huntrleaks/-R and less than 3 warmup repetitions can give false positives!
Run tests sequentially
0:00:00 load avg: 0.88 [1/1] test_multiprocessing_spawn
beginning 31 repetitions
1234567890123456789012345678901
...............................
test_multiprocessing_spawn leaked [4, 5, 1, 5, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] memory blocks, sum=18
test_multiprocessing_spawn failed in 42 sec 470 ms

== Tests result: FAILURE ==

1 test failed:
    test_multiprocessing_spawn

Total duration: 42 sec 490 ms
Tests result: FAILURE
msg320950 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 09:21
> test_multiprocessing_spawn leaked [4, 5, 1, 5, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] memory blocks, sum=18

Sorry, I forgot to mention that I modified libregrtest to get this output:

diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.py
index 6724488fcf..a3c50e21e0 100644
--- a/Lib/test/libregrtest/refleak.py
+++ b/Lib/test/libregrtest/refleak.py
@@ -101,7 +101,7 @@ def dash_R(the_module, test, indirect_test, huntrleaks):
     failed = False
     for deltas, item_name, checker in [
         (rc_deltas, 'references', check_rc_deltas),
-        (alloc_deltas, 'memory blocks', check_rc_deltas),
+        (alloc_deltas, 'memory blocks', check_fd_deltas),
         (fd_deltas, 'file descriptors', check_fd_deltas)
     ]:
         # ignore warmup runs
msg320956 - (view) Author: Karthikeyan Singaravelan (xtreak) * Date: 2018-07-03 10:49
It failed for the first time on Ubuntu and then was successful for all the rest of 5-6 runs. I don't know why for the failure run it has load avg as 0.00 and how to get to this stage.

# Shell session

➜  cpython git:(master) uname -a
Linux ubuntu-s-1vcpu-1gb-blr1-01 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
➜  cpython git:(master) ./python
Python 3.8.0a0 (heads/master:d824ca7, Jul  3 2018, 06:50:05)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()
➜  cpython git:(master) ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 3:3
Run tests sequentially
0:00:00 load avg: 0.00 [1/1] test_multiprocessing_spawn
beginning 6 repetitions
123456
......
test_multiprocessing_spawn leaked [2, 2, 1] memory blocks, sum=5
test_multiprocessing_spawn failed

== Tests result: FAILURE ==

1 test failed:
    test_multiprocessing_spawn

Total duration: 9 sec 221 ms
Tests result: FAILURE
➜  cpython git:(master) ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 3:3
Run tests sequentially
0:00:00 load avg: 0.34 [1/1] test_multiprocessing_spawn
beginning 6 repetitions
123456
......

== Tests result: SUCCESS ==

1 test OK.

Total duration: 8 sec 822 ms
Tests result: SUCCESS
msg320958 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 11:15
> It failed for the first time on Ubuntu and then was successful for all the rest of 5-6 runs.

The bug is random. But the problem is that sometimes, it fails. It must never fail, otherwise the buildbot fails randomly. The Gentoo Refleak buildbot runs multiple tests in parallel and so its system load is high, tests are run slower, making the failure more likely.

Anyway, I have a fix! PR 8059.
msg320962 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 11:20
New changeset 23401fb960bb94e6ea62d2999527968d53d3fc65 by Victor Stinner in branch 'master':
bpo-33735: Fix test_multiprocessing random failure (GH-8059)
https://github.com/python/cpython/commit/23401fb960bb94e6ea62d2999527968d53d3fc65
msg320966 - (view) Author: miss-islington (miss-islington) Date: 2018-07-03 11:38
New changeset 42b2f84a85fe3334c5d8098cf76dfa727b348c4f by Miss Islington (bot) in branch '3.7':
bpo-33735: Fix test_multiprocessing random failure (GH-8059)
https://github.com/python/cpython/commit/42b2f84a85fe3334c5d8098cf76dfa727b348c4f
msg320968 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 11:43
New changeset 53fafafa340e292a1c9e4aacf5860999613c2b36 by Victor Stinner in branch '2.7':
bpo-33735: Fix test_multiprocessing random failure (GH-8059) (GH-8061)
https://github.com/python/cpython/commit/53fafafa340e292a1c9e4aacf5860999613c2b36
msg320970 - (view) Author: miss-islington (miss-islington) Date: 2018-07-03 11:51
New changeset 3bd9d3b934eb8b27553f19a7ab1c97f95745c68f by Miss Islington (bot) in branch '3.6':
bpo-33735: Fix test_multiprocessing random failure (GH-8059)
https://github.com/python/cpython/commit/3bd9d3b934eb8b27553f19a7ab1c97f95745c68f
msg320971 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-03 11:57
bpo-33984 has been marked as a duplicate of this issue: "test_multiprocessing_forkserver leaked [1, 2, 1] memory blocks on x86 Gentoo Refleaks 3.x".

Sadly, my commit 23401fb960bb94e6ea62d2999527968d53d3fc65 is not perfect, the test still fails when the system load is high:
https://bugs.python.org/issue33984#msg320967

But since tests are re-run sequentially, I hope that it will be fine.

I close the issue. I will reopen it if the bug reoccurs.
msg321093 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2018-07-05 10:06
We have some similar failures on the x86 Gentoo Refleaks 3.7 buildbots:

http://buildbot.python.org/all/#/builders/114/builds/157/steps/4/logs/stdio
http://buildbot.python.org/all/#/builders/114/builds/155/steps/4/logs/stdio
----------------------------------------------------------------------
Ran 310 tests in 249.377s
OK (skipped=30)
.
test_multiprocessing_spawn leaked [1, 2, 2] memory blocks, sum=5
1 test failed again:
    test_multiprocessing_spawn
== Tests result: FAILURE then FAILURE ==

It seems that is due to a high load on the buildbot but I am surprised that this is not mitigated after PR 8059.
msg321097 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-05 11:17
Ok ok, let me be honest with myself, my *workaround* change is not reliable :-)
msg321401 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-07-10 23:47
Note for myself: the commit 127bd9bfd591c8ec1a97eb7f4037c8b884eef973, fix for bpo-34042 has no impact on this issue since this issue is about memory blocks and not references.
History
Date User Action Args
2018-07-10 23:47:44vstinnersetmessages: + msg321401
2018-07-05 11:17:04vstinnersetstatus: closed -> open
resolution: fixed ->
messages: + msg321097
2018-07-05 10:06:36pablogsalsetnosy: + pablogsal
messages: + msg321093
2018-07-03 11:57:01vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg320971

stage: patch review -> resolved
2018-07-03 11:51:03miss-islingtonsetmessages: + msg320970
2018-07-03 11:43:27vstinnersetmessages: + msg320968
2018-07-03 11:38:58miss-islingtonsetnosy: + miss-islington
messages: + msg320966
2018-07-03 11:34:44vstinnerlinkissue33984 superseder
2018-07-03 11:23:47miss-islingtonsetpull_requests: + pull_request7671
2018-07-03 11:23:24vstinnersetpull_requests: + pull_request7670
2018-07-03 11:21:57miss-islingtonsetpull_requests: + pull_request7669
2018-07-03 11:20:38vstinnersetmessages: + msg320962
2018-07-03 11:15:42vstinnersetmessages: + msg320958
2018-07-03 11:06:18vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request7668
2018-07-03 10:49:11xtreaksetnosy: + xtreak
messages: + msg320956
2018-07-03 09:21:19vstinnersetmessages: + msg320950
2018-07-03 09:19:37vstinnersetmessages: + msg320948
2018-07-03 09:10:33vstinnersetmessages: + msg320946
2018-06-27 15:40:14vstinnersetmessages: + msg320596
2018-06-21 11:08:28vstinnersetmessages: + msg320162
2018-06-20 10:27:59vstinnersetmessages: + msg320057
2018-06-13 15:30:42vstinnersetmessages: + msg319473
2018-06-13 15:30:29vstinnerlinkissue33853 superseder
2018-06-01 15:39:13vstinnersetmessages: + msg318427
2018-06-01 14:58:07vstinnercreate