Enhance support.reap_children() #75343

Closed
vstinner opened this issue Aug 9, 2017 · 23 comments

Comments

@vstinner
Member

vstinner commented Aug 9, 2017

BPO 31160
Nosy @vstinner, @aixtools
PRs
  • bpo-31160: Enhance support.reap_children() #3036
  • [WIP] bpo-31160: test PR used to bisect reap_children() warnings #3040
  • bpo-31160: Fix test_builtin for zombie process #3043
  • bpo-31160: regrtest now reaps child processes #3044
  • bpo-31160: Fix test_random for zombie process #3045
  • [3.6] bpo-31160: Backport reap_children() fixes from master to 3.6 #3046
  • bpo-31160: test_tempfile: Fix reap_children() warning #3056
  • [3.6] bpo-31160: Backport reap_children() fixes from master to 3.6 #3060
  • [2.7] bpo-31160: Backport reap_children fixes from master to 2.7 #3063
  • bpo-31160: Fix race condition in test_os.PtyTests #19263
  • bpo-40155: Stop test_builtin from hanging on AIX, Solaris and maybe others. #19308
  • bpo-31160: Fix test_builtin.test_input_no_stdout_fileno() #19312
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-08-11.00:12:50.391>
    created_at = <Date 2017-08-09.12:50:03.350>
    labels = []
    title = 'Enhance support.reap_children()'
    updated_at = <Date 2020-04-02.20:08:02.894>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-04-02.20:08:02.894>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-08-11.00:12:50.391>
    closer = 'vstinner'
    components = []
    creation = <Date 2017-08-09.12:50:03.350>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31160
    keywords = []
    message_count = 23.0
    messages = ['299996', '299997', '299998', '300000', '300020', '300021', '300023', '300024', '300072', '300073', '300086', '300088', '300093', '300147', '365430', '365474', '365475', '365476', '365477', '365478', '365479', '365512', '365517']
    nosy_count = 2.0
    nosy_names = ['vstinner', 'Michael.Felt']
    pr_nums = ['3036', '3040', '3043', '3044', '3045', '3046', '3056', '3060', '3063', '19263', '19308', '19312']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue31160'
    versions = []

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    Attached PR enhances the support.reap_children() function:

    • reap_children() now sets environment_altered to True to detect bugs using python3 -m test --fail-env-changed
    • Replace bare "except:" with "except OSError:" in reap_children()
    • Write a unit test for reap_children() using a timeout of 60 seconds
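
The behaviour described in the list above can be sketched as follows. This is a minimal illustration, not the code from the attached PR: the function name `reap_children_sketch`, the returned PID list, and the module-level flag are assumptions made for the example. The warning text copies the format seen in the CI logs in this thread.

```python
import os
import sys

# Hypothetical module-level flag, named after the attribute mentioned above.
environment_altered = False


def reap_children_sketch():
    """Reap child processes that have already exited; return their PIDs."""
    global environment_altered
    reaped = []
    # os.WNOHANG does not exist on Windows, so there is nothing to do there.
    if not (hasattr(os, "waitpid") and hasattr(os, "WNOHANG")):
        return reaped
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except OSError:
            # Usually ECHILD: this process has no children left.
            break
        if pid == 0:
            # Children exist, but none of them has exited yet.
            break
        print(f"Warning -- reap_children() reaped child process {pid}",
              file=sys.stderr)
        environment_altered = True
        reaped.append(pid)
    return reaped
```

Catching only OSError, rather than using a bare "except:", lets KeyboardInterrupt and programming errors propagate instead of being silently swallowed.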

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    The GCC job of Travis CI failed with ENV_CHANGED:

    Test wait() behavior when waitpid returns WIFSTOPPED; bpo-29335. ...
    Warning -- reap_children() reaped child process 19839
    ok

    I tested and... WOW! When run in a loop, this test leaks 100 MB per second. It creates a lot of processes.

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    > The GCC job of Travis CI failed with ENV_CHANGED:

    Oops, in fact it was a macOS job:
    https://travis-ci.org/python/cpython/jobs/262606830

    The GCC job failed with many more errors:
    https://travis-ci.org/python/cpython/jobs/262606831

    ---
    0:00:02 load avg: 121.34 [ 7/403] test_unittest
    Warning -- reap_children() reaped child process 11088
    Warning -- reap_children() reaped child process 11089
    Warning -- reap_children() reaped child process 11090
    Warning -- reap_children() reaped child process 11091

    0:14:22 load avg: 136.67 [282/403/1] test_select
    Warning -- reap_children() reaped child process 14686

    0:16:00 load avg: 110.41 [297/403/1] test_socketserver
    Warning -- reap_children() reaped child process 15483
    Warning -- reap_children() reaped child process 15492
    Warning -- reap_children() reaped child process 15499
    Warning -- reap_children() reaped child process 15508

    0:18:36 load avg: 105.94 [333/403/1] test_thread
    Warning -- reap_children() reaped child process 20670
    ---

    For test_socketserver, see bpo-31151.

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    A reap_children() warning was fixed in test_thread: bpo-31150. It seems that commit 88eee44 was not enough to fix all warnings.

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    New changeset 4baca1b by Victor Stinner in branch 'master':
    bpo-31160: Fix test_builtin for zombie process (#3043)
    4baca1b

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    New changeset e3510d7 by Victor Stinner in branch 'master':
    bpo-31160: regrtest now reaps child processes (#3044)
    e3510d7

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    New changeset da5e930 by Victor Stinner in branch 'master':
    bpo-31160: Fix test_random for zombie process (#3045)
    da5e930

    @vstinner
    Member Author

    vstinner commented Aug 9, 2017

    > New changeset 4baca1b by Victor Stinner in branch 'master':
    > bpo-31160: Fix test_builtin for zombie process (#3043)

    This change introduced a regression:

    http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/builds/1159/steps/test/logs/stdio

    ======================================================================
    FAIL: test_input_no_stdout_fileno (test.test_builtin.PtyTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/test_builtin.py", line 1624, in test_input_no_stdout_fileno
        lines = self.run_child(child, b"quux\r")
      File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/test_builtin.py", line 1573, in run_child
        self.assertEqual(status, 0)
    AssertionError: 1 != 0

    @vstinner
    Member Author

    bpo-31173 fixed a leaked child process in test_subprocess.

    @vstinner
    Member Author

    New changeset 6c8c294 by Victor Stinner in branch 'master':
    bpo-31160: test_tempfile: Fix reap_children() warning (#3056)
    6c8c294

    @vstinner
    Member Author

    bpo-31151 fixed test_socketserver.

    @vstinner
    Member Author

    New changeset 719a15b by Victor Stinner in branch '3.6':
    [3.6] bpo-31160: Backport reap_children() fixes from master to 3.6 (#3060)
    719a15b

    @vstinner
    Member Author

    New changeset 1247e2c by Victor Stinner in branch '2.7':
    [2.7] bpo-31160: Backport reap_children fixes from master to 2.7 (#3063)
    1247e2c

    @vstinner
    Member Author

    I pushed the most important change: reap_children() now makes tests fail with ENV_CHANGED on warning, so I am closing the issue.
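
How a leaked child process turns into an ENV_CHANGED result can be sketched roughly as below. This is not regrtest's code: `run_one_test`, `exit_code`, the result constants, and the exit status 1 are hypothetical names and values chosen for illustration. Only os.waitpid(-1, os.WNOHANG) and the idea behind the --fail-env-changed option come from the standard library and this thread.

```python
import os

# Hypothetical result labels for the sketch.
PASSED = "PASSED"
ENV_CHANGED = "ENV_CHANGED"


def run_one_test(test_func):
    """Run test_func, then check whether it left exited children behind."""
    test_func()
    leaked = []
    if hasattr(os, "WNOHANG"):  # not available on Windows
        while True:
            try:
                pid, _status = os.waitpid(-1, os.WNOHANG)
            except OSError:
                break  # no child processes at all
            if pid == 0:
                break  # remaining children are still running
            leaked.append(pid)
    return (ENV_CHANGED if leaked else PASSED), leaked


def exit_code(results, fail_env_changed):
    """Map per-test results to a process exit status.

    The value 1 is arbitrary; the sketch only needs it to be nonzero.
    """
    if fail_env_changed and ENV_CHANGED in results:
        return 1
    return 0
```

With this shape, a test that forgets to wait for its child still passes on its own assertions, but the run as a whole fails once the strict option is enabled, which is what makes the leak visible on CI.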

    @vstinner
    Member Author

    New changeset 16d7567 by Victor Stinner in branch 'master':
    bpo-31160: Fix race condition in test_os.PtyTests (GH-19263)
    16d7567

    @aixtools
    Contributor

    aixtools commented Apr 1, 2020

    With PR 19263, the AIX bots are now red.

    ======================================================================
    ERROR: test_input_no_stdout_fileno (test.test_builtin.PtyTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_builtin.py", line 1952, in test_input_no_stdout_fileno
        lines = self.run_child(child, b"quux\r")
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/test_builtin.py", line 1898, in run_child
        support.wait_process(pid, exitcode=0)
      File "/home/shager/cpython-buildarea/3.x.edelsohn-aix-ppc64/build/Lib/test/support/__init__.py", line 3432, in wait_process
        os.kill(pid, signal.SIGKILL)
    NameError: name 'signal' is not defined

    Ran 101 tests in 30.348s
    FAILED (errors=1, skipped=7)
    1 test failed again:
    test_builtin

    +++++++++++++++
    The Buildbot has detected a failed build on builder PPC64 AIX 3.x while building python/cpython.
    Full details are available at:
    https://buildbot.python.org/all/#builders/227/builds/565

    Buildbot URL: https://buildbot.python.org/all/

    Worker for this Build: edelsohn-aix-ppc64
    Worker for this Build: aixtools-aix-power6

    @vstinner
    Member Author

    vstinner commented Apr 1, 2020

    > With PR 19263, the AIX bots are now red.

    I know, I saw and I already pushed fixes.

    > NameError: name 'signal' is not defined

    Fixed by commit afeaea2.

    > 1 test failed again: test_builtin

    Fixed by commit 16d7567.
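
The traceback above comes from a helper that waits for a child process and kills it with SIGKILL if it does not exit in time; the NameError was a missing `import signal`. A rough sketch of that pattern, assuming a POSIX platform, is shown here. It is not the actual test.support.wait_process(): the name `wait_process_sketch`, the 30-second default timeout, and the back-off values are assumptions for illustration.

```python
import os
import signal  # the import whose absence caused the NameError above
import time


def wait_process_sketch(pid, *, exitcode, timeout=30.0):
    """Wait for child `pid`, kill it on timeout, and check its exit code."""
    deadline = time.monotonic() + timeout
    sleep = 0.001
    while True:
        # Non-blocking poll: returns (0, 0) while the child is still running.
        pid2, status = os.waitpid(pid, os.WNOHANG)
        if pid2 != 0:
            break
        if time.monotonic() > deadline:
            try:
                os.kill(pid, signal.SIGKILL)
                os.waitpid(pid, 0)  # reap it so no zombie is left behind
            except OSError:
                pass  # the child disappeared in the meantime
            raise AssertionError(
                f"process {pid} still running after {timeout:.1f} seconds")
        # Exponential back-off, capped at 100 ms between polls.
        sleep = min(sleep * 2, 0.1)
        time.sleep(sleep)

    if os.WIFEXITED(status):
        actual = os.WEXITSTATUS(status)
    elif os.WIFSIGNALED(status):
        actual = -os.WTERMSIG(status)
    else:
        raise AssertionError(f"unexpected wait status {status!r}")
    if actual != exitcode:
        raise AssertionError(
            f"process {pid} exited with code {actual}, expected {exitcode}")
```

Polling with a deadline instead of a blocking os.waitpid(pid, 0) means a stuck child produces a bounded, explicit failure rather than hanging until the global test timeout.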

    @aixtools
    Contributor

    aixtools commented Apr 1, 2020

    Ah - great. Sorry for the noise then.

    @vstinner
    Member Author

    vstinner commented Apr 1, 2020

    > Ah - great. Sorry for the noise then.

    It's not noise, it is useful :-)

    @aixtools
    Contributor

    aixtools commented Apr 1, 2020

    I think something is still not right.

    Both bots finish the test run with:

    test_zip_pickle (test.test_builtin.BuiltinTest) ... ok
    Timeout (0:15:00)!
    Thread 0x00000001 (most recent call first):
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/support/__init__.py", line 3435 in wait_process
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_builtin.py", line 1898 in run_child
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_builtin.py", line 1952 in test_input_no_stdout_fileno
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/case.py", line 616 in _callTestMethod
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/case.py", line 659 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/case.py", line 719 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 122 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 84 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 122 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 84 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 122 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 84 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/runner.py", line 176 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/support/__init__.py", line 2079 in _run_suite
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/support/__init__.py", line 2201 in run_unittest
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 209 in _test_module
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 234 in _runtest_inner2
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 270 in _runtest_inner
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 153 in _runtest
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 193 in runtest
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 318 in rerun_failed_tests
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 691 in _main
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 634 in main
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 712 in main
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/__main__.py", line 2 in <module>
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/runpy.py", line 87 in _run_code
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/runpy.py", line 197 in _run_module_as_main
    make: 1254-004 The error code from the last command is 1.
    Stop.
    program finished with exit code 2
    elapsedTime=3501.292487
    test_input_no_stdout_fileno (test.test_builtin.PtyTests) ...

    And the bot status is still FAIL (aka red): failed test (failure) uploading test-results.xml (failure)

    @vstinner
    Member Author

    vstinner commented Apr 1, 2020

    > I think something is not yet what it needs to be: (...)

    The build https://buildbot.python.org/all/#/builders/227/builds/571 failed even though it includes my commit 16d7567. OK, something is still failing.

    Please open a new issue. This one is closed.

    @vstinner
    Member Author

    vstinner commented Apr 1, 2020

    > Please open a new issue. This one is closed.

    Pablo Galindo opened bpo-40140, let's use this one.

    @vstinner
    Member Author

    vstinner commented Apr 1, 2020

    > Pablo Galindo opened bpo-40140, let's use this one.

    Note: Oops, Batuhan created it, Pablo only commented.

    @ezio-melotti transferred this issue from another repository on Apr 10, 2022