test_multiprocessing_spawn leaked [1, 2, 1] memory blocks on AMD64 Windows8.1 Refleaks 3.7 #77916

Closed
vstinner opened this issue Jun 1, 2018 · 20 comments
Labels
3.7 (EOL) end of life OS-windows performance Performance or resource usage tests Tests in the Lib/test dir

Comments

@vstinner
Member

vstinner commented Jun 1, 2018

BPO 33735
Nosy @pfmoore, @vstinner, @tjguk, @zware, @zooba, @pablogsal, @miss-islington, @tirkarthi
PRs
  • bpo-33735: Fix test_multiprocessing random failure #8059
  • [3.7] bpo-33735: Fix test_multiprocessing random failure (GH-8059) #8060
  • [2.7] bpo-33735: Fix test_multiprocessing random failure (GH-8059) #8061
  • [3.6] bpo-33735: Fix test_multiprocessing random failure (GH-8059) #8062
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-12-18.21:55:24.283>
    created_at = <Date 2018-06-01.14:58:07.146>
    labels = ['3.7', 'tests', 'OS-windows', 'performance']
    title = 'test_multiprocessing_spawn leaked [1, 2, 1] memory blocks on AMD64 Windows8.1 Refleaks 3.7'
    updated_at = <Date 2018-12-18.21:55:24.281>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-12-18.21:55:24.281>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-12-18.21:55:24.283>
    closer = 'vstinner'
    components = ['Tests', 'Windows']
    creation = <Date 2018-06-01.14:58:07.146>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33735
    keywords = ['patch']
    message_count = 20.0
    messages = ['318425', '318427', '319473', '320057', '320162', '320596', '320946', '320948', '320950', '320956', '320958', '320962', '320966', '320968', '320970', '320971', '321093', '321097', '321401', '332087']
    nosy_count = 8.0
    nosy_names = ['paul.moore', 'vstinner', 'tim.golden', 'zach.ware', 'steve.dower', 'pablogsal', 'miss-islington', 'xtreak']
    pr_nums = ['8059', '8060', '8061', '8062']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue33735'
    versions = ['Python 3.7']

    @vstinner
    Member Author

    vstinner commented Jun 1, 2018

    http://buildbot.python.org/all/#/builders/132/builds/154

    test_multiprocessing_spawn leaked [1, 2, 1] memory blocks, sum=4

    @vstinner vstinner added 3.7 (EOL) end of life tests Tests in the Lib/test dir OS-windows performance Performance or resource usage labels Jun 1, 2018
    @vstinner
    Member Author

    vstinner commented Jun 1, 2018

    When running "python -m test -R 2:3 test_multiprocessing_forkserver" on Windows, I saw some warnings about dangling threads. It may explain this issue.

    @vstinner
    Member Author

    bpo-33853 has been marked as a duplicate of this bug.

    @vstinner
    Member Author

    http://buildbot.python.org/all/#/builders/132/builds/154

    That's the AMD64 Windows8.1 Refleaks 3.7 buildbot.

    bpo-33853 has been marked as a duplicate of this bug.

    Copy of Pablo's message:

    The test test_multiprocessing_spawn is leaking memory according to the x86 Gentoo Refleaks 3.x buildbot:

    x86 Gentoo Refleaks 3.x
    http://buildbot.python.org/all/#/builders/1/builds/253

    test_multiprocessing_spawn leaked [1, 2, 1] memory blocks, sum=4
    1 test failed again:
    test_multiprocessing_spawn

    x86 Gentoo Refleaks 3.7
    http://buildbot.python.org/all/#/builders/114/builds/135

    @vstinner
    Member Author

    I just created bpo-33929: "test_multiprocessing_spawn: WithProcessesTestProcess.test_many_processes() leaks 5 handles on Windows". See also my PR 7827.

    @vstinner
    Member Author

    I also created bpo-33966: "test_multiprocessing_spawn.WithProcessesTestPool.test_traceback() leaks 4 handles on Windows".

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    I reproduced the issue on the Gentoo Refleak buildbot and managed to bisect it down to a single test: test_imap_unordered().

    pydev@stormageddon ~/cpython $ ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 3:3
    Run tests sequentially
    0:00:00 load avg: 1.45 [1/1] test_multiprocessing_spawn
    beginning 6 repetitions
    123456
    ......
    test_multiprocessing_spawn leaked [3, 2, 1] memory blocks, sum=6
    test_multiprocessing_spawn failed

    == Tests result: FAILURE ==

    1 test failed:
    test_multiprocessing_spawn

    Total duration: 6 sec 548 ms
    Tests result: FAILURE
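
For readers unfamiliar with the `-R` option used above, here is a minimal sketch of how a `warmups:repetitions` value relates to the numbers in the output. The helper names `parse_huntrleaks` and `describe` are hypothetical and are not part of regrtest; the sketch only handles the two-field form seen in this thread, and the "fewer than 3 warmups" threshold is taken from the warning regrtest prints later in the thread.

```python
def parse_huntrleaks(spec):
    """Split a 'warmups:repetitions' string such as '3:3' into two ints.

    Hypothetical helper: it only handles the two-field form used in this thread.
    """
    warmups, repetitions = (int(part) for part in spec.split(":"))
    return warmups, repetitions


def describe(spec):
    """Summarise what a given -R value implies for a run."""
    warmups, repetitions = parse_huntrleaks(spec)
    return {
        # Matches "beginning N repetitions" in the output.
        "total_runs": warmups + repetitions,
        # Warmup runs are ignored, so one delta is reported per repetition.
        "reported_deltas": repetitions,
        # regrtest warns about false positives with fewer than 3 warmups.
        "too_few_warmups": warmups < 3,
    }


print(describe("3:3"))
print(describe("1:30"))
```

With `-R 3:3` this gives 6 total runs and 3 reported deltas, which agrees with "beginning 6 repetitions" and the three-element list `[3, 2, 1]` in the log.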

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    It doesn't look like a real leak, but more like a cache that takes multiple iterations to be fully filled.

    pydev@stormageddon ~/cpython $ ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 1:30
    WARNING: Running tests with --huntrleaks/-R and less than 3 warmup repetitions can give false positives!
    Run tests sequentially
    0:00:00 load avg: 0.88 [1/1] test_multiprocessing_spawn
    beginning 31 repetitions
    1234567890123456789012345678901
    ...............................
    test_multiprocessing_spawn leaked [4, 5, 1, 5, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] memory blocks, sum=18
    test_multiprocessing_spawn failed in 42 sec 470 ms

    == Tests result: FAILURE ==

    1 test failed:
    test_multiprocessing_spawn

    Total duration: 42 sec 490 ms
    Tests result: FAILURE
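
The "cache that fills up" reading of the output above can be checked with a small sketch. It takes the per-run deltas from the log and reports after how many measured runs the count stops changing. The function name `runs_until_stable` is hypothetical, and the list is rebuilt as six non-zero values followed by zeros, assuming 30 deltas in total as implied by `-R 1:30`.

```python
def runs_until_stable(deltas):
    """Return the number of measured runs before all remaining deltas are zero."""
    last_nonzero = -1
    for index, delta in enumerate(deltas):
        if delta != 0:
            last_nonzero = index
    return last_nonzero + 1


# Deltas reported by the -R 1:30 run above (assumed to be 30 values in total).
deltas = [4, 5, 1, 5, 1, 2] + [0] * 24

print(sum(deltas))                # total blocks, reported as sum=18 in the log
print(runs_until_stable(deltas))  # measured runs needed before the count is flat
```

A genuine leak would be expected to keep producing non-zero deltas for as long as the test is repeated; here the count goes flat after the sixth measured run.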

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    test_multiprocessing_spawn leaked [4, 5, 1, 5, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] memory blocks, sum=18

    Sorry, I forgot to mention that I modified libregrtest to get this output:

    diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.py
    index 6724488fcf..a3c50e21e0 100644
    --- a/Lib/test/libregrtest/refleak.py
    +++ b/Lib/test/libregrtest/refleak.py
    @@ -101,7 +101,7 @@ def dash_R(the_module, test, indirect_test, huntrleaks):
         failed = False
         for deltas, item_name, checker in [
             (rc_deltas, 'references', check_rc_deltas),
    -        (alloc_deltas, 'memory blocks', check_rc_deltas),
    +        (alloc_deltas, 'memory blocks', check_fd_deltas),
             (fd_deltas, 'file descriptors', check_fd_deltas)
         ]:
             # ignore warmup runs
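
To illustrate why swapping the checker changes what is printed, here is a sketch of the two checkers. The bodies are an assumption about how the helpers behaved at the time (report only when every delta is at least 1, versus report when any delta is non-zero); they are not quoted from refleak.py.

```python
def check_rc_deltas(deltas):
    # Assumed behaviour: only report a leak if every run leaked something,
    # which filters out patterns such as [3, 0, 0].
    return all(delta >= 1 for delta in deltas)


def check_fd_deltas(deltas):
    # Assumed behaviour: report as soon as any run shows a non-zero delta.
    return any(deltas)


short_run = [1, 2, 1]                      # from the original buildbot report
long_run = [4, 5, 1, 5, 1, 2] + [0] * 24   # from the -R 1:30 run

for name, deltas in [("short_run", short_run), ("long_run", long_run)]:
    print(name, check_rc_deltas(deltas), check_fd_deltas(deltas))
```

Under these assumptions the unmodified checker would stay silent on the long run because of the trailing zeros, which is why the checker had to be swapped to get the list printed.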

    @tirkarthi
    Member

    It failed for the first time on Ubuntu and then was successful for all the rest of 5-6 runs. I don't know why the failing run shows a load avg of 0.00, or how to get back to that state.

    # Shell session

    ➜  cpython git:(master) uname -a
    Linux ubuntu-s-1vcpu-1gb-blr1-01 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
    ➜  cpython git:(master) ./python
    Python 3.8.0a0 (heads/master:d824ca7, Jul  3 2018, 06:50:05)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> exit()
    ➜  cpython git:(master) ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 3:3
    Run tests sequentially
    0:00:00 load avg: 0.00 [1/1] test_multiprocessing_spawn
    beginning 6 repetitions
    123456
    ......
    test_multiprocessing_spawn leaked [2, 2, 1] memory blocks, sum=5
    test_multiprocessing_spawn failed

    == Tests result: FAILURE ==

    1 test failed:
    test_multiprocessing_spawn

    Total duration: 9 sec 221 ms
    Tests result: FAILURE
    ➜ cpython git:(master) ./python -m test test_multiprocessing_spawn -m test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered -R 3:3
    Run tests sequentially
    0:00:00 load avg: 0.34 [1/1] test_multiprocessing_spawn
    beginning 6 repetitions
    123456
    ......

    == Tests result: SUCCESS ==

    1 test OK.

    Total duration: 8 sec 822 ms
    Tests result: SUCCESS
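
Since the session above fails on one run and passes on the next, one way to quantify the flakiness is to repeat the same command and count failures. This is a rough sketch: `run_once` and `count_failures` are hypothetical names, `sys.executable` stands in for the `./python` binary used in the thread, and it assumes regrtest exits with a non-zero status when the run fails.

```python
import subprocess
import sys

# Same test selection and -R value as in the session above.
CMD = [
    sys.executable, "-m", "test", "test_multiprocessing_spawn",
    "-m", "test.test_multiprocessing_spawn.WithProcessesTestPool.test_imap_unordered",
    "-R", "3:3",
]


def run_once(cmd=CMD):
    """Run the test command once; True means the run succeeded.

    Assumption: a failing regrtest run returns a non-zero exit status.
    """
    return subprocess.run(cmd).returncode == 0


def count_failures(attempts, runner=run_once):
    """Call runner() the given number of times and count the failed runs."""
    return sum(1 for _ in range(attempts) if not runner())
```

Calling `count_failures(6)` from a CPython checkout would repeat the command six times. The `runner` parameter exists so the counting logic can be exercised without spawning the test suite.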

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    It failed for the first time on Ubuntu and then was successful for all the rest of 5-6 runs.

    The bug is random, but the problem is that it sometimes fails. It must never fail; otherwise the buildbot fails randomly. The Gentoo Refleak buildbot runs multiple tests in parallel, so its system load is high and tests run slower, which makes the failure more likely.

    Anyway, I have a fix! PR 8059.

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    New changeset 23401fb by Victor Stinner in branch 'master':
    bpo-33735: Fix test_multiprocessing random failure (GH-8059)
    23401fb

    @miss-islington
    Contributor

    New changeset 42b2f84 by Miss Islington (bot) in branch '3.7':
    bpo-33735: Fix test_multiprocessing random failure (GH-8059)
    42b2f84

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    New changeset 53fafaf by Victor Stinner in branch '2.7':
    bpo-33735: Fix test_multiprocessing random failure (GH-8059) (GH-8061)
    53fafaf

    @miss-islington
    Contributor

    New changeset 3bd9d3b by Miss Islington (bot) in branch '3.6':
    bpo-33735: Fix test_multiprocessing random failure (GH-8059)
    3bd9d3b

    @vstinner
    Member Author

    vstinner commented Jul 3, 2018

    bpo-33984 has been marked as a duplicate of this issue: "test_multiprocessing_forkserver leaked [1, 2, 1] memory blocks on x86 Gentoo Refleaks 3.x".

    Sadly, my commit 23401fb is not perfect: the test still fails when the system load is high:
    https://bugs.python.org/issue33984#msg320967

    But since tests are re-run sequentially, I hope that it will be fine.

    I'm closing the issue. I will reopen it if the bug recurs.

    @vstinner vstinner closed this as completed Jul 3, 2018
    @pablogsal
    Member

    We have some similar failures on the x86 Gentoo Refleaks 3.7 buildbots:

    http://buildbot.python.org/all/#/builders/114/builds/157/steps/4/logs/stdio
    http://buildbot.python.org/all/#/builders/114/builds/155/steps/4/logs/stdio
    ----------------------------------------------------------------------
    Ran 310 tests in 249.377s
    OK (skipped=30)
    .
    test_multiprocessing_spawn leaked [1, 2, 2] memory blocks, sum=5
    1 test failed again:
    test_multiprocessing_spawn
    == Tests result: FAILURE then FAILURE ==

    It seems to be due to high load on the buildbot, but I am surprised that PR 8059 did not mitigate it.

    @vstinner
    Member Author

    vstinner commented Jul 5, 2018

    Ok ok, let me be honest with myself, my *workaround* change is not reliable :-)

    @vstinner vstinner reopened this Jul 5, 2018
    @vstinner
    Member Author

    Note to self: commit 127bd9b, the fix for bpo-34042, has no impact on this issue, since this issue is about memory blocks and not references.

    @vstinner
    Member Author

    Even if the code isn't perfect, I haven't seen the failure recently, so I'm closing the bug again.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022