Message 303257 - Python tracker



Author	Cornelius Diekmann
Recipients	Cornelius Diekmann, cstratak, martin.panter, vstinner, xdegaye
Date	2017-09-28.15:47:43
Message-id	<1506613663.9.0.466225441844.issue31158@psf.upfronthosting.co.za>
In-reply-to

Content

>> Does reading the string b'I wish to buy a fish license.\n' not cause a problem, too?
>
>This string TEST_STRING_1 is used for a single os.write() call, whereas TEST_STRING_2 is split and written in two parts with two os.write() calls.
>
>I prefer minimal change, since I don't know the pty module well.

I like the idea of minimal change, too :D

Yet, I think your patch does not solve the core problem: read and write are nondeterministic. In theory, a pty is similar to a pipe (with termios processing in the middle). So any os.write to a pty fd is nondeterministic and may put fewer bytes into the pty buffer than were passed to the write call (see the return value of os.write, which test_pty.py does not check). Multiple writes are buffered by the kernel until the buffer is full, so the kernel already accumulates the chunked writes for us. Usually, this works fine.
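
To illustrate what checking the return value of os.write could look like, here is a minimal sketch. write_all is a hypothetical helper name, not something that exists in test_pty.py, and the loop assumes a blocking fd.

```python
import os

def write_all(fd, data):
    """Hypothetical helper: keep calling os.write until all of data is written."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write returns the number of bytes actually written,
        # which may be fewer than the number of bytes passed in.
        total += os.write(fd, view[total:])
    return total
```

With a helper like this, a short write is retried instead of being silently ignored.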

Similarly, an os.read may also be nondeterministic, depending on how many bytes are ready in the pty buffer. This may have nothing to do with the chunked writing in test_pty.py, because the kernel performs the read/write syscalls and manages the pty buffer.
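
A small sketch of the read side, using a plain pipe as a stand-in for the pty (an assumption made for simplicity; there is no termios processing here). The function name is made up for illustration. It shows that os.read returns whatever is buffered at the time of the call, up to the requested size, not the full size asked for.

```python
import os

def demo_read_returns_what_is_buffered():
    """Illustration only: os.read hands back the bytes currently buffered."""
    r, w = os.pipe()
    try:
        os.write(w, b'For my pet fish')
        # We ask for up to 1024 bytes but only get what has been written so far.
        first = os.read(r, 1024)
        os.write(w, b', Eric.\n')
        second = os.read(r, 1024)
        return first, second
    finally:
        os.close(r)
        os.close(w)
```

In this single-process sketch the ordering is fixed, so the split is predictable. In the test, the read races against the writer and the kernel's handling of the pty buffer, so where the split falls is not under our control.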

Here is a PoC: I checked out your code and stressed my system with it. I have 2 physical and 2 virtual cores and started 8 instances of the test to stress my kernel with scheduling and locking of kernel buffers (the pty buffer), and to make read/write more nondeterministic.

./python -b -m test -j 8 test_pty -m test_basic -F

Here is what I got:
0:00:13 load avg: 3.25 [119/1] test_pty failed
test test_pty failed -- Traceback (most recent call last):
  File "XXX/cpython/Lib/test/test_pty.py", line 99, in test_basic
    normalize_output(s1))
AssertionError: b'I wish to buy a fish license.\n' != b'I wish to buy a fish license.'

>> Is reading len(expected) bytes the correct behavior for systems where normalize_output is needed?
>
>Yeah, I looked at this function. normalize_output() can return a string shorter than the input: len(normalize_output(s2)) <= len(s2).
>
>So I think that len(s2) < len(expected) is correct.

Warning, obscure corner cases ahead.
In theory, given that read is completely nondeterministic and we are on one of the strange systems that need normalize_output, the following could happen:
We have b'For my pet fish, Eric.\r\n' in the pty buffer. We read b'For my pet fish, Eric.\r' from the buffer into s2. Now len(s2) == len(expected), but a b'\n' is still unread in the buffer. This would make the test fail. I admit this is a corner case, but it is also an argument that a clean test case may want to have a readline function.
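
A rough sketch of what such a readline function could look like. read_line and its max_bytes parameter are hypothetical names, not part of the pty module or test_pty.py. It reads one byte at a time so that nothing after the newline is consumed, and it assumes a blocking fd, so it will block if no newline ever arrives.

```python
import os

def read_line(fd, max_bytes=1024):
    """Hypothetical helper: read from fd up to and including the first b'\\n'.

    Stops early on EOF (os.read returns b'') or after max_bytes bytes.
    """
    buf = bytearray()
    while len(buf) < max_bytes:
        chunk = os.read(fd, 1)  # one byte at a time, never past the newline
        if not chunk:
            break  # EOF
        buf += chunk
        if chunk == b'\n':
            break
    return bytes(buf)
```

With this, the b'\r\n' case above is read as one complete line regardless of how the kernel hands out the bytes, and normalize_output can then be applied to the whole line.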

History
Date	User	Action	Args
2017-09-28 15:47:43	Cornelius Diekmann	set	recipients: + Cornelius Diekmann, vstinner, xdegaye, martin.panter, cstratak
2017-09-28 15:47:43	Cornelius Diekmann	set	messageid: <1506613663.9.0.466225441844.issue31158@psf.upfronthosting.co.za>
2017-09-28 15:47:43	Cornelius Diekmann	link	issue31158 messages
2017-09-28 15:47:43	Cornelius Diekmann	create