Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AIX: test_builtin.test_input_no_stdout_fileno() hangs #84336

Closed
aixtools opened this issue Apr 2, 2020 · 13 comments
Closed

AIX: test_builtin.test_input_no_stdout_fileno() hangs #84336

aixtools opened this issue Apr 2, 2020 · 13 comments
Labels
3.9 only security fixes tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@aixtools
Copy link
Contributor

aixtools commented Apr 2, 2020

BPO 40155
Nosy @vstinner, @aixtools, @isidentical
PRs
  • bpo-40155: Stop test_builtin from hanging on AIX, Solaris and maybe others. #19308
  • bpo-31160: Fix test_builtin.test_input_no_stdout_fileno() #19312
  • Files
  • test_builtin.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-04-03.00:12:43.799>
    created_at = <Date 2020-04-02.12:34:23.656>
    labels = ['type-bug', 'tests', '3.9']
    title = 'AIX: test_builtin.test_input_no_stdout_fileno() hangs'
    updated_at = <Date 2020-04-03.09:38:01.425>
    user = 'https://github.com/aixtools'

    bugs.python.org fields:

    activity = <Date 2020-04-03.09:38:01.425>
    actor = 'Michael.Felt'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-04-03.00:12:43.799>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2020-04-02.12:34:23.656>
    creator = 'Michael.Felt'
    dependencies = []
    files = ['49027']
    hgrepos = []
    issue_num = 40155
    keywords = ['patch']
    message_count = 13.0
    messages = ['365583', '365585', '365586', '365597', '365598', '365599', '365600', '365601', '365602', '365619', '365642', '365649', '365687']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'Michael.Felt', 'BTaskaya']
    pr_nums = ['19308', '19312']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue40155'
    versions = ['Python 3.9']

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    related to

    Two AIX bots - different environments - continue to fail the test: test_builtin since

    During the first run - the test fails with something such as:

    0:31:47 [215/420] test_abc passed -- running: test_builtin (14 min 10 sec)
    0:32:17 running: test_builtin (14 min 40 sec), test_decimal (30.0 sec)
    0:32:37 [216/420] test_decimal passed (49.4 sec) -- running: test_builtin (15 min)
    0:32:37 [217/420/1] test_builtin crashed (Exit code 1)
    Timeout (0:15:00)!
    Thread 0x00000001 (most recent call first):

    And in the second pass (re-run of failed tests) the failure goes:

    test_zip (test.test_builtin.BuiltinTest) ... ok
    test_zip_bad_iterable (test.test_builtin.BuiltinTest) ... ok
    test_zip_pickle (test.test_builtin.BuiltinTest) ... ok
    Timeout (0:15:00)!
    Thread 0x00000001 (most recent call first):
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/support/init.py", line 3435 in wait_process
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_builtin.py", line 1898 in run_child
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/test_builtin.py", line 1952 in test_input_no_stdout_fileno
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/case.py", line 616 in _callTestMethod
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/case.py", line 659 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/case.py", line 719 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 122 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 84 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 122 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 84 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 122 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/suite.py", line 84 in __call__
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/unittest/runner.py", line 176 in run
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/support/init.py", line 2079 in _run_suite
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/support/init.py", line 2201 in run_unittest
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 209 in _test_module
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 234 in _runtest_inner2
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 270 in _runtest_inner
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 153 in _runtest
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/runtest.py", line 193 in runtest
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 318 in rerun_failed_tests
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 691 in _main
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 634 in main
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/libregrtest/main.py", line 712 in main
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/test/main.py", line 2 in <module>
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/runpy.py", line 87 in _run_code
    File "/home/buildbot/buildarea/3.x.aixtools-aix-power6/build/Lib/runpy.py", line 197 in _run_module_as_main
    make: 1254-004 The error code from the last command is 1.
    Stop.
    program finished with exit code 2
    elapsedTime=3610.362052
    test_input_no_stdout_fileno (test.test_builtin.PtyTests) ...

    NOTE: the run has stopped with the line

    test_input_no_stdout_fileno (test.test_builtin.PtyTests) ...

    and, so the results also fail to get uploaded.

    @aixtools aixtools added 3.9 only security fixes tests Tests in the Lib/test dir labels Apr 2, 2020
    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    did not get issue numbers in above: bpo-31160 and bpo-40094.

    I waited a day, before posting - in the hope it would go away.

    Also, I have been testing manually (no -j arguments) - and test_builtin passes when run manually. So, becoming hard to dissect and propose a "cure".

    I have started with:

    ./python ./Tools/scripts/run_tests.py -j 1 -u all -W --slowest --fail-env-changed --timeout=1200 -j2

    and now see the test stalled:

    0:39:41 running: test_builtin (5 min 14 sec)
    0:40:11 running: test_builtin (5 min 44 sec)
    0:40:41 running: test_builtin (6 min 14 sec)
    0:41:11 running: test_builtin (6 min 44 sec)
    0:41:41 running: test_builtin (7 min 14 sec)
    0:42:11 running: test_builtin (7 min 44 sec)
    0:42:41 running: test_builtin (8 min 14 sec)
    0:43:11 running: test_builtin (8 min 44 sec)

    @vstinner
    Copy link
    Member

    vstinner commented Apr 2, 2020

    See also bpo-40140: test_builtin crashes when runned in parallel mode on solaris.

    @vstinner vstinner changed the title AIX: bot status: stuck at: failed test (failure) uploading test-results.xml (failure) AIX: test_builtin.test_input_no_stdout_fileno() hangs Apr 2, 2020
    @vstinner vstinner changed the title AIX: bot status: stuck at: failed test (failure) uploading test-results.xml (failure) AIX: test_builtin.test_input_no_stdout_fileno() hangs Apr 2, 2020
    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    Now consistently - stalled.

    aixtools@x064:[/home/aixtools/py39-3.9]git diff
    diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py
    index eaada1b504..89c4ebc2bd 100644
    --- a/Lib/test/test_builtin.py
    +++ b/Lib/test/test_builtin.py
    @@ -1849,26 +1849,40 @@ class PtyTests(unittest.TestCase):

             if pid == 0:
                 # Child
    +            pid = os.getpid() # get my PID
    +            tty = open("/dev/pts/3", "a")
    +            tty.write("\nI am child - this is my PID:{}".format(pid))
    +            tty.close()
                 try:
                     # Make sure we don't get stuck if there's a problem
    -                signal.alarm(2)
    +                # signal.alarm(2)
                     os.close(r)
                     with open(w, "w") as wpipe:
                         child(wpipe)
                 except:
                     traceback.print_exc()
                 finally:
    +                tty = open("/dev/pts/3", "a")
    +                tty.write("\nI am child - exiting PID:{}".format(pid))
    +                tty.close()
                     # We don't want to return to unittest...
                     os._exit(0)
             # Parent
             os.close(w)
    +        tty = open("/dev/pts/3", "a")
    +        tty.write("\nI am parent:{} with child:{}".format(os.getpid(), pid))
    +        tty.write("fd:{} input:{}".format(fd, terminal_input))
    +        tty.close()
             os.write(fd, terminal_input)
             # Get results from the pipe
             with open(r, "r") as rpipe:
                 lines = []
                 while True:
    +                tty = open("/dev/pts/3", "a")
    +                tty.write("\nI am parent:{} with lines:{}".format(os.getpid(), lines))
    +                tty.close()
                     line = rpipe.readline().strip()
                     if line == "":
                         # The other end was closed => the child exited
    @@ -1895,6 +1909,9 @@ class PtyTests(unittest.TestCase):
         \# Wait until the child process completes before closing the PTY to
         \# prevent sending SIGHUP to the child process.
    

    + tty = open("/dev/pts/3", "a")
    + tty.write("\nI am parent:{} starting wait_process({}, exitcode=0)".format(os.getpid(), pid))
    + tty.close()
    support.wait_process(pid, exitcode=0)

         # Close the PTY
    

    ======== Debug ========
    aixtools@x064:[/home/aixtools/py39-3.9]./python -m test test_builtin
    0:00:00 Run tests sequentially
    0:00:00 [1/1] test_builtin

    I am child - this is my PID:22544440
    I am parent:21954660 with child:22544440fd:6 input:b'quux\r'
    I am parent:21954660 with lines:[]
    I am child - exiting PID:22544440
    I am parent:21954660 with lines:['stdin.isatty(): True']
    I am parent:21954660 with lines:['stdin.isatty(): True', "captured: 'prompt'"]
    I am parent:21954660 starting wait_process(22544440, exitcode=0)

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    If I move the close to before the support.waitprocess() call I get:

    aixtools@x064:[/home/aixtools/py39-3.9]./python -m test test_builtin
    0:00:00 Run tests sequentially
    0:00:00 [1/1] test_builtin

    I am child - this is my PID:22544444
    I am parent:21954666 with child:22544444fd:6 input:b'quux\r'
    I am parent:21954666 with lines:[]
    I am child - exiting PID:22544444
    I am parent:21954666 with lines:['stdin.isatty(): True']
    I am parent:21954666 with lines:['stdin.isatty(): True', "captured: 'prompt'"]
    I am parent:21954666 starting wait_process(22544444, exitcode=0)
    I am parent:21954666 with child:22544446fd:6 input:b'quux\r\n'
    I am parent:21954666 with lines:[]
    I am child - this is my PID:22544446
    I am child - exiting PID:22544446
    I am parent:21954666 with lines:['tty = True']
    I am parent:21954666 with lines:['tty = True', "'quux'"]
    I am parent:21954666 starting wait_process(22544446, exitcode=0)
    I am parent:21954666 with child:22544448fd:6 input:b'quux\xe9\r\n'
    I am parent:21954666 with lines:[]
    I am child - this is my PID:22544448
    I am child - exiting PID:22544448
    I am parent:21954666 with lines:['tty = True']
    I am parent:21954666 with lines:['tty = True', "'quux\\udce9'"]
    I am parent:21954666 starting wait_process(22544448, exitcode=0)
    I am parent:21954666 with child:22544450fd:6 input:b'quux\xe9\r\n'
    I am parent:21954666 with lines:[]
    I am child - this is my PID:22544450
    I am child - exiting PID:22544450
    I am parent:21954666 with lines:['tty = True']
    I am parent:21954666 with lines:['tty = True', "'quux\\udce9'"]
    I am parent:21954666 starting wait_process(22544450, exitcode=0)
    == Tests result: SUCCESS ==

    1 test OK.

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    I tried moving the child/parent logic blocks and get this as debug output:

    aixtools@x064:[/home/aixtools/py39-3.9]./python -m test test_builtin
    0:00:00 Run tests sequentially
    0:00:00 [1/1] test_builtin

    I am child - this is my PID:8519822
    I am parent:6422696 with child:8519822fd:6 input:b'quux\r'
    I am child - exiting PID:8519822
    I am parent:6422696 with lines:[]
    I am parent:6422696 with lines:['stdin.isatty(): True']
    I am parent:6422696 with lines:['stdin.isatty(): True', "captured: 'prompt'"]
    I am parent:6422696 starting wait_process(8519822, exitcode=0)

    +++++++
    Diff as attachment...

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    Next debug info:

    I am child - this is my PID:8519830
    I am parent:18284612 with child:8519830fd:6 input:b'quux\r'
    I am parent:18284612 with lines:[]
    I am child - exiting PID:8519830
    I am parent:18284612 with lines:['stdin.isatty(): True']
    I am parent:18284612 with lines:['stdin.isatty(): True', "captured: 'prompt'"]
    I am parent:18284612 starting wait_process(8519830, exitcode=0)[2] + Stopped (SIGTSTP) ./python -m test test_builtin
    aixtools@x064:[/home/aixtools/py39-3.9]ps -p 8519830
    PID TTY TIME CMD
    8519830 - <exiting>
    aixtools@x064:[/home/aixtools/py39-3.9]kill 8519830
    aixtools@x064:[/home/aixtools/py39-3.9]ps -p 8519830
    PID TTY TIME CMD
    8519830 - <exiting>
    aixtools@x064:[/home/aixtools/py39-3.9]kill -9 8519830
    aixtools@x064:[/home/aixtools/py39-3.9]ps -p 8519830
    PID TTY TIME CMD
    8519830 - <exiting>
    aixtools@x064:[/home/aixtools/py39-3.9]fg
    ./python -m test test_builtin
    [2] + Stopped (SIGTSTP) ./python -m test test_builtin
    aixtools@x064:[/home/aixtools/py39-3.9]ps -p 8519830
    PID TTY TIME CMD
    8519830 - <exiting>

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    FYI: in child block:

    calling os.exit(0), rather than os._exit() gives same result.

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 2, 2020

    To get it to move forward: as it is not solely and AIX thing (see bpo-40-140)

    This works: but is it what is wanted?

    Tests result: SUCCESS
    aixtools@x064:[/home/aixtools/py39-3.9]git diff
    diff --git a/Lib/test/test_builtin.py b/Lib/test/test_builtin.py
    index eaada1b504..4a5addc6fe 100644
    --- a/Lib/test/test_builtin.py
    +++ b/Lib/test/test_builtin.py
    @@ -1893,12 +1893,18 @@ class PtyTests(unittest.TestCase):
    self.fail("got %d lines in pipe but expected 2, child output was:\n%s"
    % (len(lines), child_output))

    -        # Wait until the child process completes before closing the PTY to
    -        # prevent sending SIGHUP to the child process.
    -        support.wait_process(pid, exitcode=0)
    +        if sys.platform == "linux" or not os.name == "posix":
    +            # Wait until the child process completes before closing the PTY to
    +            # prevent sending SIGHUP to the child process.
    +            support.wait_process(pid, exitcode=0)
    
    -        # Close the PTY
    -        os.close(fd)
    +            # Close the PTY
    +            os.close(fd)
    +        else:
    +            # Other posix need to close the pty for the child to exit normally
    +            # Close the PTY
    +            os.close(fd)
    +            support.wait_process(pid, exitcode=0)

    @aixtools aixtools added type-bug An unexpected behavior, bug, or error labels Apr 2, 2020
    @vstinner
    Copy link
    Member

    vstinner commented Apr 2, 2020

    Michael Felt: Can you please test if PR 19312 fix the issue for you on AIX?

    @vstinner
    Copy link
    Member

    vstinner commented Apr 2, 2020

    I pushed this change which should fix the issue on AIX:

    New changeset 7a51a7e by Victor Stinner in branch 'master':
    bpo-40140: test_builtin.PtyTests registers SIGHUP handler (GH-19314)
    7a51a7e

    I'm waiting for buildbots before closing this issue.

    @vstinner
    Copy link
    Member

    vstinner commented Apr 3, 2020

    bpo-40140: test_builtin.PtyTests registers SIGHUP handler (GH-19314)

    With this change, PPC64 AIX 3.x is back to green. I close this issue. Thanks for the bug report Michael Felt.

    FYI I also backported my fix to 3.8: PR 19316.

    @aixtools
    Copy link
    Contributor Author

    aixtools commented Apr 3, 2020

    Thanks for the fix!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants