classification
Title: test_readline hangs
Type: behavior Stage: test needed
Components: Tests Versions: Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: martin.panter, wmayner
Priority: normal Keywords:

Created on 2017-05-17 21:37 by wmayner, last changed 2017-07-09 23:17 by martin.panter.

Messages (4)
msg293888 - (view) Author: (wmayner) Date: 2017-05-17 21:37
I'm trying to build Python 3.6.1 from source with the `--enable-optimizations` configuration option. As I understand it, this entails running and profiling all the tests for PGO. When `test_readline` is run, it hangs.

OS: Debian 3.16.43-2 (2017-04-30) x86_64 GNU/Linux

Please let me know what other information I should provide.
msg294574 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-05-27 02:10
Perhaps if you ran the tests in verbose mode, that could narrow down which test it hangs at. Also, if you can capture a KeyboardInterrupt stack trace, that may help. Not sure if it works with your build setup, but maybe you can run something like

./python -m unittest -v test.test_readline

Not too long ago, I added some tests that run a child process in a pseudoterminal (see the “run_pty” function). There were a few platform specific problems getting that working, so there is a good chance your problem may be related.

I wonder if test hangs when you don’t use “--enable-optimizations”?
msg294765 - (view) Author: (wmayner) Date: 2017-05-30 15:58
Output of `./python3 -m unittest -v test.test_readline`:

testHistoryUpdates (test.test_readline.TestHistoryManipulation) ... ok
test_nonascii_history (test.test_readline.TestHistoryManipulation) ... ok
test_write_read_append (test.test_readline.TestHistoryManipulation) ... ok
test_auto_history_disabled (test.test_readline.TestReadline) ...

I then found by commenting-out tests that it hangs if and only if `run_pty` is called, so it seems that your hunch is probably right.

I compiled without `--enable-optimizations` and the same tests hang. Compiling with that option runs the whole test suite, which revealed the problem, but it doesn't seem to be related.

Here's the KeyboardInterrupt trace:

estHistoryUpdates (test.test_readline.TestHistoryManipulation) ... ok
test_nonascii_history (test.test_readline.TestHistoryManipulation) ... ok
test_write_read_append (test.test_readline.TestHistoryManipulation) ... ok
test_auto_history_disabled (test.test_readline.TestReadline) ... ^CTraceback (most recent call last):
  File "/home/will/python3.6.1/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/will/python3.6.1/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/will/python3.6.1/lib/python3.6/unittest/__main__.py", line 18, in <module>
    main(module=None)
  File "/home/will/python3.6.1/lib/python3.6/unittest/main.py", line 94, in __init__
    self.runTests()
  File "/home/will/python3.6.1/lib/python3.6/unittest/main.py", line 255, in runTests
    self.result = testRunner.run(self.test)
  File "/home/will/python3.6.1/lib/python3.6/unittest/runner.py", line 176, in run
    test(result)
  File "/home/will/python3.6.1/lib/python3.6/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/home/will/python3.6.1/lib/python3.6/unittest/suite.py", line 122, in run
    test(result)
  File "/home/will/python3.6.1/lib/python3.6/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/home/will/python3.6.1/lib/python3.6/unittest/suite.py", line 122, in run
    test(result)
  File "/home/will/python3.6.1/lib/python3.6/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/home/will/python3.6.1/lib/python3.6/unittest/suite.py", line 122, in run
    test(result)
  File "/home/will/python3.6.1/lib/python3.6/unittest/case.py", line 649, in __call__
    return self.run(*args, **kwds)
  File "/home/will/python3.6.1/lib/python3.6/unittest/case.py", line 601, in run
    testMethod()
  File "/home/will/python3.6.1/lib/python3.6/test/test_readline.py", line 136, in test_auto_history_disabled
    output = run_pty(self.auto_history_script.format(False))
  File "/home/will/python3.6.1/lib/python3.6/test/test_readline.py", line 240, in run_pty
    for [_, events] in sel.select():
  File "/home/will/python3.6.1/lib/python3.6/selectors.py", line 323, in select
    r, w, _ = self._select(self._readers, self._writers, [], timeout)
KeyboardInterrupt
msg298005 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-07-09 23:17
Thanks for the information. That narrows the problem down, but I still don’t exactly know why it hangs.

A good workaround may be to put a timeout in the “select” call for a second or so, and if it times out, raise an exception which will fail the test but let the rest of the tests continue. To add the timeout, I would change the code in “run_pty” to look like

selected = sel.select(timeout=1)
if not selected:
    raise Exception("Child timed out: remaining input {!r}, output {!r}".format(
        input, output))
for [_, events] in selected:
    ...

This is what “test_auto_history_enabled” is supposed to do:

1. Create a pseudo-terminal with master and slave file descriptors
2. Spawn a child process to read from the slave and then print the test result to the slave
3. Parent waits to be able to read from or write to the PTY master
4. Parent writes the bytes b"dummy input\r" to the PTY
    5. Child calls “input” which calls the Readline library and reads the input
    6. Child writes the test result to the PTY slave
7. Parent reads the output from the child
    8. Child closes its slave file descriptor and exits
9. On Linux, reading then raises EIO indicating all copies of the slave are closed and there is no more output left to read

The parent is hanging at step 3, before entering step 4, 7, or 9. Some input may already have been written to the child, and some output may have been captured, but it assumes the child is still running and is waiting for it to exit, which should close the slave file descriptor.

It would be interesting to confirm if the child has exited or is still waiting for the CR to terminate its input line. Or perhaps the slave file descriptor has somehow been duplicated in a third process without being closed.
History
Date User Action Args
2017-07-09 23:17:12martin.pantersetmessages: + msg298005
2017-05-30 15:58:47wmaynersetmessages: + msg294765
2017-05-27 02:10:19martin.pantersetnosy: + martin.panter

messages: + msg294574
stage: test needed
2017-05-17 21:37:35wmaynercreate