classification
Title: test_pty fails when using setsid()
Type: Stage: resolved
Components: Tests Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: db3l, miss-islington, pablogsal, vstinner
Priority: normal Keywords: patch

Created on 2019-10-21 11:38 by vstinner, last changed 2019-12-09 11:15 by miss-islington. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 17519 merged vstinner, 2019-12-09 10:34
PR 17520 merged miss-islington, 2019-12-09 10:57
PR 17521 merged miss-islington, 2019-12-09 10:57
Messages (12)
msg355059 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-21 11:38
regrtest has been modified in bpo-38502 to use setsid() when using multiprocessing mode (-jN command line option).

Problem: David Bolen identified that test_pty started to fail on his bolen-ubuntu worker (Ubuntu 18.04.3) since my commit ecb035cd14c11521276343397151929a94018a22.

https://buildbot.python.org/all/#/builders/141/builds/2679

0:19:05 load avg: 1.81 [234/419/1] test_pty crashed (Exit code -1) -- running: test_unicodedata (55.5 sec)

I can reproduce the issue locally:

---
$ ./python -m test -j2 test_pty -v
== CPython 3.9.0a0 (heads/urlparse_ipv6:cc733a8cb6, Oct 21 2019, 11:34:36) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]
== Linux-5.2.18-200.fc30.x86_64-x86_64-with-glibc2.29 little-endian
== cwd: /home/vstinner/python/master/build/test_python_20242
== CPU count: 8
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 0.70 Run tests in parallel using 2 child processes
0:00:00 load avg: 0.70 [1/1/1] test_pty crashed (Exit code -1)
test_basic (test.test_pty.PtyTest) ...

== Tests result: FAILURE ==

1 test failed:
    test_pty

Total duration: 383 ms
Tests result: FAILURE
---

It's surprising that there is no output!

I would prefer to keep process groups in regrtest, it's really helpful to be able to kill all processes spawned by a test worker process.

I'm not sure how/why PTY depends is incompatible with setsid().
msg357402 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-11-24 17:55
This also happens when running the test suite with high parallelism:

./python -m test -j 20

This fails with:


== Tests result: FAILURE ==

398 tests OK.

2 tests failed:
    test_embed test_pty
msg357403 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-11-24 17:58
Indeed, almost all buildbots have to repeat the test_pty.

I think this needs to be fixed or process groups in regrtest should be limited or reverted.
msg357414 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-24 22:16
> I think this needs to be fixed or process groups in regrtest should be limited or reverted.

What do you mean by "limited"?

Process groups really help to prevent to leak grandchild processes in multiprocessing tests, when tests are interrupted manually by CTRL+C or by a timeout (sadly, only when the timeout is handled by regrtest, not when it's handled by faulthandler).

See bpo-38502 for the rationale.
msg357415 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-11-24 22:46
> What do you mean by "limited"?

I mean to deactivate it by default and make opt-in.

> Process groups really help to prevent to leak grandchild processes in multiprocessing tests, when tests are interrupted manually by CTRL+C or by a timeout (sadly, only when the timeout is handled by regrtest, not when it's handled by faulthandler).

I love process groups and they are awesome. But having these test being re-run on every buildbot and failing on my machine when just running test with -j is very annoying.
msg357416 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-24 22:50
Can't we fix test_pty?
msg357419 - (view) Author: David Bolen (db3l) Date: 2019-11-24 23:27
I think fixing the underlying pty issue should certainly be the goal, but the question is whether the process group change should remain active in the meantime, as its presence is causing a regression in the tests.  I think such cases in the past are usually rolled back, right?

I was originally on the fence since process groups address a real problem, especially in interactive testing, while creating an arguably aesthetic issue for my case of the buildbots (a warning rather than failure).

But Pablo's point about a normal manual full test run failing (not a warning as with the buildbots) feels persuasive since that's probably as common as the issue being addressed by the change.  Even if pre-existing, the pty failure is exposed by the process group change, so it might be best for the process group change to wait on fixing the pty issue.

I don't know how to weigh the relative impact though, e.g,. how many people are likely to run into each failure case.  There's probably more people doing a normal test run than breaking out of such tests though.  At the least, it's a worst impact than just the warnings on the buildbots.

Perhaps an intermediate fallback could be gating the process group change behind a regrtest option (opt-in) which could then preserve its benefits upon request, without negatively impacting the default test process, whether manual or on the buildbots.

At least until resource is available to resolve the pty issue.
msg358065 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-09 10:36
> I think fixing the underlying pty issue should certainly be the goal (...)

I wrote PR 17519 which fix the bug. We just have to ignore SIGHUP signal.
msg358066 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-09 10:57
New changeset a1838ec2592e5082c75c77888f2a7a3eb21133e5 by Victor Stinner in branch 'master':
bpo-38547: Fix test_pty if the process is the session leader (GH-17519)
https://github.com/python/cpython/commit/a1838ec2592e5082c75c77888f2a7a3eb21133e5
msg358067 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-12-09 11:00
Ok, I merged my fix to master. The backport to 3.7 and 3.8 will follow quickly. I close the issue.

Sorry for the inconvenience ;-)
msg358068 - (view) Author: miss-islington (miss-islington) Date: 2019-12-09 11:15
New changeset b9f4b49c6e525afd6fce02cfc14be52e98a18f67 by Miss Islington (bot) in branch '3.7':
bpo-38547: Fix test_pty if the process is the session leader (GH-17519)
https://github.com/python/cpython/commit/b9f4b49c6e525afd6fce02cfc14be52e98a18f67
msg358070 - (view) Author: miss-islington (miss-islington) Date: 2019-12-09 11:15
New changeset d08fd298dc8d5631f6c504d01ee4f9cfb47db79d by Miss Islington (bot) in branch '3.8':
bpo-38547: Fix test_pty if the process is the session leader (GH-17519)
https://github.com/python/cpython/commit/d08fd298dc8d5631f6c504d01ee4f9cfb47db79d
History
Date User Action Args
2019-12-09 11:15:27miss-islingtonsetmessages: + msg358070
2019-12-09 11:15:10miss-islingtonsetnosy: + miss-islington
messages: + msg358068
2019-12-09 11:00:12vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg358067

stage: patch review -> resolved
2019-12-09 10:57:36miss-islingtonsetpull_requests: + pull_request16999
2019-12-09 10:57:30miss-islingtonsetpull_requests: + pull_request16998
2019-12-09 10:57:18vstinnersetmessages: + msg358066
2019-12-09 10:36:19vstinnersetmessages: + msg358065
2019-12-09 10:34:53vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request16997
2019-11-24 23:27:30db3lsetmessages: + msg357419
2019-11-24 22:50:43vstinnersetmessages: + msg357416
2019-11-24 22:46:50pablogsalsetmessages: + msg357415
2019-11-24 22:16:04vstinnersetmessages: + msg357414
2019-11-24 17:58:03pablogsalsetmessages: + msg357403
2019-11-24 17:55:26pablogsalsetnosy: + pablogsal
messages: + msg357402
2019-10-22 05:03:08db3lsetnosy: + db3l
2019-10-21 11:38:39vstinnercreate