New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration tests for pty.spawn on Linux and all other platforms #73256
Comments
As Martin Panter noted, "it doesn’t look like pty.spawn() is tested at all" [bpo-26228]. So I wrote some very basic integration tests for pty.spawn. They work perfectly on my Linux machine. Line 4 of the library module under test (pty.py) states: "Only tested on Linux." I don't like this comment ;-) There are two possibilities how to proceed from here:
In either case, I need some help by non-Linux users to run theses tests and document their success/failure and whether we have a bug on the specific platform (i.e. pty.spawn should work but does not) or whether the tests should be skipped (i.e. platform does not support ptys at all). It would be really awesome if some *BSD and Windows users would join the nosy list :-) Happy Holidays! |
I did a larger update to my proposed patch. I read the pty man pages and posix standard and designed a test suite which documents the expected behavior of the pty module. The test suite is biased towards my Linux system, but I respect the posix standard. Tested on Linux, OS X, and FreeBSD. The patch introduces no behavioral changes to the actual pty.py module! It should be easy to merge. Changes to pty.py: I made _select and _execlp local definitions of the pty module to ease mocking them in the test suite. A comment finally explains why pty.py defines STDIN_FILENO, STDOUT_FILENO, ... itself. All tests pass on Linux. I included the change to the documentation I proposed in bpo-29054. The test suite does several things:
Waiting for a review :-) |
Hi Cornelius and thanks for the work. Since the patch adds a significant amount of code, I think you might have to sign the contributor agreement: <http://www.python.org/psf/contrib/contrib-form/\>. You can do it in a web browser. I would like to review your tests, but because there is a lot of code to understand and I don’t have much time, it might take me a while. If you can find any way to simplify it, that would be great. I have some comments on the code review, and will probably post more as I begin to understand what you are proposing. It looks like you depend on fixing bpo-26228, but the patch there will conflict with your changes. Maybe merge with the other patch, or propose an alternative fix. The documentation currently mentions the code is only tested on Linux, so it would be nice to update that. The patch does introduce behavioural changes, if you consider abuse like monkey-patching os.exec() after importing the pty module. It is best not to make unnecessary changes in a bug fix. Why does the patch slow the tests down so much? Ideally, it is nice to keep the tests as fast as possible. What is the problem with using a genuine exec() call? Why do we need PtyMockingExecTestBase? |
Ignore my comment about contrib agreement, that must have come through recently :) |
Thank you Martin very much for this very helpful review. I updated and simplified the tests and implemented your suggestions. There are three open issues left.
I reverted my changes to pty.py, so the patches no longer conflict for this file. Currently, my test suite reveals the pty bug on FreeBSD but does not fix it. Since I'm not a FreeBSD expert and I don't want to steal Chris' patch [bpo-26228], I'd like to keep those issues separate. In other words, I'm proposing only a testsuite which introduces no behavioral changes and uncovers a bug. This also means that this test suite is currently hanging on FreeBSD and OS X. The good news is, the patch in bpo-26228 fixes the problem.
The test suite does some heavy integration testing. The test in the class PtyWhiteBoxIntegrationReadSlaveTest are the slow ones. The test suite reads a large amount of data from the spawned child process. The amount of data was chosen to be a few kB in size to make sure that we won't lose data of the child and that --in case of a bug in the code-- we don't accidentally succeed due to race conditions. Is 1-2 seconds really too slow? We can reduce the amount of data, but increase the risk that a future code change introduces a race condition which is not covered by the test suite. About the system bell pretty printing test:
Including these integration tests has advantages and disadvantages. Advantages:
Disadvantages:
I wanted to include some tests which depend on some terminal-specific features to make sure that the complete setup works and is operational. For example, the test suite now documents the difference between ptys and pipes and would complain if any code change breaks terminal-specific features. I chose control character pretty printing in some tests because this is one of the simplest features of terminals to test, debug, and verify. In contrast, for example, it would be extremely fragile and probably complete overkill to test things such as terminal delays. My personal feeling says that the advantages outweigh the disadvantages. If you disagree, we can remove them :-) |
I would prefer to commit Chris’s fix for BSDs (bpo-26228) with a regression test. I can explain in the commit message who contributed to which part, or do two separate commits if you prefer. But the point is to add the test with the fix. I’m not going to commit the test on its own, because we know it will fail. Even in Lib/test/test_pty.py, it looks like the two patches will collide. I was hoping you could combine the patches, or supply a set of patches that are tested and compatible. I don’t have OS X or BSD so I have to rely on you and the buildbots to confirm that a patch doesn’t break any tests. I haven’t looked too closely regarding the slow tests, but another option is use the “cpu” resource to mark the test as being slow. I realized that PtyWhiteBoxIntegrationTermiosTest is a reasonable test for the termios module, so it is beneficial. Though it may have been simpler to write it using pty.openpty(), avoiding an extra thread that just copies data between other threads. |
Dear Martin, now I understand your intention. I merged my test suite with Chris's fix and documented our insights. SmallPtyTests contains regression tests for this issue. While testing, I found a subtle change in behavior introduced by Chris's patch: It is no longer possible to get a broken pipe easily. I made a small adjustment to preserve this existing behavior: In the _copy loop, I changed the order in which master_fd and STDIN_FILENO are copied to preserve the BrokenPipeError. I tuned the slow tests, the complete test suite now needs less than 1 second on my system. I updated the documentation to note that the module is no longer supposed to be Linux-only.
I agree completely. However, pty.openpty() does not depend on the _copy() loop which is modified by Chris's patch. To add higher test coverage for the changes we introduce by merging Chris's fix, I decided to stick to pty.spawn(). I tested on Linux 4.4, Linux 3.13, FreeBSD 11, MacOS 10.11.6 and all tests are green. If anything goes wrong on the testbot army, here is the fallback strategy: |
Can you explain your broken pipe situation? Are you talking about a real-world EPIPE operating on a pseudoterminal, or just a result of using a Unix socket to emulate a PTY in the tests? Usually a broken pipe is an asynchronous condition. You cannot predict exactly when it will happen without knowing the state of the other end, OS implementation, buffering, etc. It does not seem appropriate to change the _copy() loop around unless there is an real bug. Regarding the buildbots, if I get this patch into a state that I am comfortable committing, the buildbots will are generally set to timeout in 15 or 20 minutes. See the “make buildbottest” commands in Makefile.pre.in, test.regrtest --timeout option, etc. You can also see which platforms have buildbots, and the state of them etc: <https://www.python.org/dev/buildbot/\>. I left a bunch of comments in the code review. There is a lot of useful code in there, but also a lot of stuff that is hard to follow or that I want to clean up. |
I have left some comments on Rietveld after a partial code review. IMO for reasons of repeatability and easier maintenance [1], all the tests should be run with the pty set in a known mode, whether canonical or raw, and a test involving pty.spawn() should ensure that the chosen mode will be set by this function, or should skip it otherwise. [1] For example when investigating the random failures in PtyTest reported by Cornelius in the patch comments, one has to wonder: "Oh, maybe it fails on this foreign system because the pty is not in raw mode there". |
Uploaded a new version of the patch. Changelog of this patch (compared to v3):
I also added some small comments in the code review tool. |
Thank you Martin for your comments in the code review tool. I prepared a new patch for the code review tool. The github changelog from patch v4 (Feb 2017) to my HEAD (currently patch v5, Apr 2017) is at: https://github.com/python/cpython/compare/master...diekmann:wip-issue-29070?expand=1 I hope that having several micro commits instead of one huge patch is helpful. In the end, it will be squashed anyway. Do you prefer to continue with bpo patches or have a github pull request? |
Hi! Can anyone please take a look at https://bugs.python.org/issue41494 [ https://github.com//pull/21752 ]? I think these are related. I wrote a new function called wspawn, which is like spawn+the following differences.
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: