classification
Title: test_pty: test_basic() fails randomly on Travis CI
Type: Stage: resolved
Components: Tests Versions: Python 3.7, Python 3.6, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Cornelius Diekmann, cstratak, martin.panter, vstinner, xdegaye
Priority: normal Keywords: patch

Created on 2017-08-09 10:52 by vstinner, last changed 2017-10-02 09:59 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 3037 closed vstinner, 2017-08-09 11:01
PR 3802 closed vstinner, 2017-09-28 14:29
PR 3808 merged Cornelius Diekmann, 2017-09-28 16:34
PR 3852 merged python-dev, 2017-10-02 09:40
PR 3853 merged vstinner, 2017-10-02 09:47
Messages (21)
msg299987 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-08-09 10:52
https://travis-ci.org/python/cpython/jobs/262606831

0:13:26 load avg: 134.81 [256/403] test_pty
test test_pty failed -- Traceback (most recent call last):
  File "/home/travis/build/python/cpython/Lib/test/test_pty.py", line 109, in test_basic
    self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
AssertionError: b'For my pet fish, Eric.\n' != b'For m'
msg299988 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-08-09 11:02
I don't know why test_pty fails.

https://github.com/python/cpython/pull/3037 makes sure that all data is written into the master fd using a new write_all() helper function which loops until all data is written: handle os.write() partial write.
msg299991 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-08-09 11:55
Checking for short writes is worthwhile, but in Issue 29070 it looks like Cornelius identified the main problem was short _reads_. See the parts of his patch to do with “_os_read_exactly” and related functions.
msg299995 - (view) Author: Charalampos Stratakis (cstratak) * Date: 2017-08-09 12:24
I've observed the exact same issue on Fedora at random times.
msg300067 - (view) Author: Cornelius Diekmann (Cornelius Diekmann) * Date: 2017-08-10 09:28
I observed the same issue, but the problem occurs mainly when reading data. In my proposed patch in issue29070, I use the existing pty._writen() to make sure all data is written. As Martin mentioned, reading is a problem. My patch proposes _os_timeout_read(), _os_readline(), _os_read_exactly(), _os_read_exhaust_exactly() to give fine-grained, deterministic control. Cheeky advertisement: Anybody cares to review issue29070 or cherry pick the pty._writen() and _os_read* parts? :)
msg303199 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-28 09:06
Failure on x86 Gentoo Refleaks 3.x too:

http://buildbot.python.org/all/builders/x86%20Gentoo%20Refleaks%203.x/builds/104/steps/test/logs/stdio

======================================================================
FAIL: test_basic (test.test_pty.PtyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/3.x.ware-gentoo-x86.refleak/build/Lib/test/test_pty.py", line 105, in test_basic
    self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
AssertionError: b'For my pet fish, Eric.\n' != b'For m'
msg303241 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-28 14:28
Oh, in fact, it's not that hard to reproduce the failure on Linux:

haypo@selma$ ./python -m test test_pty -m test_basic -F 
Run tests sequentially
0:00:00 load avg: 2.17 [  1] test_pty
0:00:00 load avg: 2.17 [  2] test_pty
0:00:00 load avg: 2.17 [  3] test_pty
(...)
0:00:00 load avg: 2.17 [ 15] test_pty
0:00:00 load avg: 2.17 [ 16] test_pty
test test_pty failed -- Traceback (most recent call last):
  File "/home/haypo/prog/python/master/Lib/test/test_pty.py", line 105, in test_basic
    self.assertEqual(b'For my pet fish, Eric.\n', normalize_output(s2))
AssertionError: b'For my pet fish, Eric.\n' != b'For m'

test_pty failed
15 tests OK.

1 test failed:
    test_pty

Total duration: 486 ms
Tests result: FAILURE
msg303249 - (view) Author: Cornelius Diekmann (Cornelius Diekmann) * Date: 2017-09-28 14:50
Regarding PR 3802:
 * Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?
 * Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?


I also fixed these issues in one of my (unmerged) pull requests (bpo issue29070). Unfortunately, the patch I propose does way too many things at the same time and I don't have time currently to untangle it :( Feel free to cherry pick the _os_readline function:
https://github.com/python/cpython/pull/2932/files#diff-6c897ae50dce4ffa73bde779206a2059R60
Don't hesitate to put your name on it, if you take the time to review and cherry pick :)

Cheers!
msg303252 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-28 15:00
> Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?

This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is splitted and written in two parts with two os.write() calls.

I prefer minimal change, since I don't know well the pty module.

> Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?

Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).

So I think that len(s2) < len(expected) is correct.
msg303257 - (view) Author: Cornelius Diekmann (Cornelius Diekmann) * Date: 2017-09-28 15:47
>> Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?
>
>This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is splitted and written in two parts with two os.write() calls.
>
>I prefer minimal change, since I don't know well the pty module.

I like the idea of minimal change, too :D

Yet, I think your patch does not solve the core problem of read/write being nondeterministic. In theory, a pty is similar to a pipe (with termios processing in the middle). So any os.write to a pty fd is nondeterministic and may put less bytes into the pty buffer than given to the write call (see the return value of os.write, which test_pty.py does not check). Multiple writes are buffered by the kernel, until the buffer is full. So the kernel already accumulates the chunked writing for us. Usually, this works fine.

Similarly, a os.read may also be nondeterministic, depending on how many bytes are ready in the pty buffer. This may have nothing to do with the chunked writing of the test_pty.py module because the kernel is doing the read/write syscalls and handling the pty buffer.

Here is a PoC: I checked out your code and stressed my system with it. I have 2 physical and 2 virtual cores and started 8 instances of the test to stress my kernel with scheduling, locking of kernel buffers (the pty buffer), and making read/write more nondeterministic.

./python -b -m test -j 8 test_pty -m test_basic -F

Here is what I got:
0:00:13 load avg: 3.25 [119/1] test_pty failed
test test_pty failed -- Traceback (most recent call last):
  File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
    normalize_output(s1))
AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

>> Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?
>
>Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).
>
>So I think that len(s2) < len(expected) is correct.

Warning, obscure corner cases ahead.
In theory, given read is completely nondeterministic and we are on one of the strange systems which need normalize_output, the following could happen:
We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected) but a b'\n' is still unread in the buffer. This would make the test fail. I admit, this is a corner case, but also an argument that a clean test case may want to have a readline function.
msg303259 - (view) Author: Cornelius Diekmann (Cornelius Diekmann) * Date: 2017-09-28 16:02
Oh, one more thing, since it is unclear whether read or write is causing the error. I replaced
os.write(slave_fd, TEST_STRING_1)
by
pty._writen(slave_fd, TEST_STRING_1)
which makes sure all bytes are written.
After some time, when the load gets high and the machine gets noisy, I get the same error:

0:01:09 load avg: 6.17 [597/1] test_pty failed
test test_pty failed -- Traceback (most recent call last):
  File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
    normalize_output(s1))
AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

So the main problem is a nondeterministic read. But I'm only testing on Linux. Other systems with different kernel may behave differently. And the documentation allows a kernel to make both short reads and writes.
msg303261 - (view) Author: Cornelius Diekmann (Cornelius Diekmann) * Date: 2017-09-28 16:38
I also added a pull request which I cannot get to fail on my system. What do you think, haypo?

Sorry for the noise!
msg303275 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-28 20:18
Cornelius:Here is a PoC: I checked out your code and stressed my system with it. (...) AssertionError: ..."

Did you get this failure with my PR 3802?

Cornelius: "We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected) but a b'\n' is still unread in the buffer. This would make the test fail. I admit, this is a corner case, but also an argument that a clean test case may want to have a readline function."

Right. I can just modify my condition from "len(s2) < len(expected)" to "b'\n' not in s2".

Cornelius: "I also added a pull request which I cannot get to fail on my system. What do you think, haypo?"

Your code calls read(1) in a loop until it gets the newline character b'\n'.

Is it better to os.read(1024) in a loop until the output string contains b'\n'?

I have no strong preference.
msg303276 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-09-28 20:19
Hum, let me clarify: my main goal is to get a stable CI. I don't really care of the exact implementation of one specific unit test in test_pty ;-)
msg303286 - (view) Author: Cornelius Diekmann (Cornelius Diekmann) * Date: 2017-09-28 22:27
Yes, I get the AssertionError with the latest version of PR 3802. From the high load avg of my system, you can see that the error occurs very rarely and that I need to stress my system to trigger it. With PR 3802, the error occurs way less frequently than without it. So we are definitely moving into the right direction. With PR 3808, I could not trigger any error (on my system).

Changing the conditions to "b'\n' not in s2" should work. But we actually mean to read a line, so this should be better reflected in the code. I prefer calling a readline() function directly, which is almost self-documenting code.

> Your code calls read(1) in a loop until it gets the newline character b'\n'.
> Is it better to os.read(1024) in a loop until the output string contains b'\n'?
Behavior may be different if there are multiple short lines in the buffer (which should not be the case in the unit test, but this may be a problem if somebody copies the code and uses it somewhere else).
pitrou in the discussion of PR 3808 suggests the ultimate answer to the question: Just use an existing readline function from the library :-)
I added this to PR 3808.

Personal thought: I care about good code in the unit tests because people might look at this as reference how to use a module and copy&paste from it. I want the tests to be deterministic, which---as long as the tests pass---implies a stable CI ;-)
msg303397 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2017-09-30 07:03
I prefer Cornelius’s current proposal (rev 4f8137b) because it fixes both sites, rather than just patching the immediate problem site.

I don’t think read(1) is a big problem, just less efficient. But if you prefer to do larger reads, that should be fine too. You could even use os.fdopen(...).readline(), which would use BufferedReader. It is not documented, but BufferedReader should do large reads and return the line without waiting to fill its internal buffer.

FWIW I think it would be okay to remove the “chunked output” test. It isn’t exercising the “pty” module any more than the “Writing to slave_fd” test above it. All we need to do is verify that the “master_open” function returns a PTY master and that “slave_open” returns the corresponding slave terminal file descriptor. Checking “isatty” and sending one line of data through seems sufficient.
msg303518 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-02 09:39
New changeset e6f62f69f07892b993910ff03c9db3ffa5cb9ca5 by Victor Stinner (Cornelius Diekmann) in branch 'master':
bpo-31158: Fix nondeterministic read in test_pty (#3808)
https://github.com/python/cpython/commit/e6f62f69f07892b993910ff03c9db3ffa5cb9ca5
msg303519 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-02 09:48
> I prefer Cornelius’s current proposal (rev 4f8137b)

Ok, fine. I merged his PR 3852. I created backports to 2.7 and 3.6.
msg303522 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-02 09:57
New changeset 66fb5ef3bb9e36187a0e5052dfd99899447df671 by Victor Stinner (Miss Islington (bot)) in branch '3.6':
[3.6] bpo-31158: Fix nondeterministic read in test_pty (GH-3808) (GH-3852)
https://github.com/python/cpython/commit/66fb5ef3bb9e36187a0e5052dfd99899447df671
msg303523 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-02 09:58
New changeset 20cbc1d29facead32c2da06c81a09af0e057015c by Victor Stinner in branch '2.7':
bpo-31158: Fix nondeterministic read in test_pty (#3808) (#3853)
https://github.com/python/cpython/commit/20cbc1d29facead32c2da06c81a09af0e057015c
msg303524 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-10-02 09:59
Thanks Cornelius Diekmann for your fix ;-)
History
Date User Action Args
2017-10-02 09:59:15vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg303524

stage: patch review -> resolved
2017-10-02 09:58:11vstinnersetmessages: + msg303523
2017-10-02 09:57:58vstinnersetmessages: + msg303522
2017-10-02 09:48:52vstinnersetmessages: + msg303519
versions: + Python 2.7, Python 3.6
2017-10-02 09:47:28vstinnersetpull_requests: + pull_request3833
2017-10-02 09:40:50python-devsetpull_requests: + pull_request3832
2017-10-02 09:39:58vstinnersetmessages: + msg303518
2017-09-30 07:03:11martin.pantersetmessages: + msg303397
2017-09-28 22:27:35Cornelius Diekmannsetmessages: + msg303286
2017-09-28 20:19:21vstinnersetmessages: + msg303276
2017-09-28 20:18:31vstinnersetmessages: + msg303275
2017-09-28 16:38:11Cornelius Diekmannsetmessages: + msg303261
2017-09-28 16:34:03Cornelius Diekmannsetpull_requests: + pull_request3792
2017-09-28 16:02:16Cornelius Diekmannsetmessages: + msg303259
2017-09-28 15:47:43Cornelius Diekmannsetmessages: + msg303257
2017-09-28 15:00:16vstinnersetmessages: + msg303252
2017-09-28 14:50:10Cornelius Diekmannsetmessages: + msg303249
2017-09-28 14:29:58vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request3788
2017-09-28 14:28:02vstinnersetmessages: + msg303241
2017-09-28 09:06:41vstinnersetmessages: + msg303199
2017-08-13 20:41:20xdegayesetnosy: + xdegaye
2017-08-10 09:28:59Cornelius Diekmannsetmessages: + msg300067
2017-08-09 12:24:31cstrataksetnosy: + cstratak
messages: + msg299995
2017-08-09 11:55:50martin.pantersetnosy: + Cornelius Diekmann, martin.panter
messages: + msg299991
2017-08-09 11:02:16vstinnersetmessages: + msg299988
2017-08-09 11:01:18vstinnersetpull_requests: + pull_request3070
2017-08-09 10:52:13vstinnercreate