classification
Title: add a pidfd child process watcher
Type: enhancement Stage: patch review
Components: asyncio Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: aeros, asvetlov, benjamin.peterson, nanjekyejoannah, njs, vstinner, yan12125, yselivanov
Priority: normal Keywords: patch

Created on 2019-11-05 05:43 by benjamin.peterson, last changed 2019-11-09 09:57 by yan12125.

Files
File name Uploaded Description Edit
0001-asyncio-Add-a-pidfd-child-process-watcher.patch benjamin.peterson, 2019-11-05 05:43
Pull Requests
URL Status Linked Edit
PR 17063 merged benjamin.peterson, 2019-11-05 15:24
PR 17069 open benjamin.peterson, 2019-11-06 03:23
Messages (21)
msg356001 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-05 05:43
Recent versions of Linux has built out support for pidfd, a way to do process management with file descriptors. I wanted to try it out, so implemented a pidfd-based child watcher for asyncio.

My WIP progress patch is attached. It passes all asyncio tests.
msg356016 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-11-05 10:35
Awesome!

I think the patch can be splitted in os.pidfd_open() and asyncio part itself.

os modification is clear.

asyncio part looks correct after the brief view. 
My the main question is: how to detect if the new watcher can be used or asyncio should fallback to threaded based solution?
msg356017 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-11-05 10:36
Nathaniel, you may be interested in the pidfd_open(), at least in adding the function to os module.
msg356024 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-05 11:08
I like the idea of exposing pidfd_open() syscall as os.pidfd_open().

Benjamin: Would you mind to create a PR based on your patch, but without your asyncio change?

> +#ifndef __NR_pidfd_open

I would prefer to avoid this part of the patch.

You should accept an optional flags parameter, no?
msg356026 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-05 11:11
+        self._loop._remove_reader(pidfd)
+        os.close(pidfd)

Note: Maybe do these operations in the reverse order. I expect a higher risk of exception when calling Python self._loop._remove_reader(), than on closing a FD that we opened.
msg356047 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2019-11-05 17:31
Thanks for the CC.

It would be nice to get `pidfd_send_signal` as well, while we're at it. But that could be a separate PR.

AFAICT the other bits of the pidfd API right now are some extra flags to clone, and an extra flag to waitid. The waitid flag is nice-to-have but not really urgent, since it's easy enough to use a flag even if the stdlib doesn't explicitly expose it. The clone flags are super interesting, but before we can use them, we need to have some way to access clone itself, and that's a big complicated project, so we definitely shouldn't worry about it here.
msg356048 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-05 17:35
Yes, I will be submitting followup changes for pidfd_send_signal and the other goodies.

I would like to use pidfds in subprocess, but as you as you say, that's another kettle of fish.
msg356068 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-05 22:02
See also bpo-38630 "subprocess.Popen.send_signal() should poll the process".
msg356069 - (view) Author: Kyle Stanley (aeros) * (Python triager) Date: 2019-11-05 22:07
> My the main question is: how to detect if the new watcher can be used or asyncio should fallback to threaded based solution?

Perhaps in the __init__ we could do something like this:

class PidfdChildWatcher(AbstractChildWatcher):

    def __init__(self):
        if not hasattr(os, "pidfd_open"):
             raise RuntimeError("os.pidfd_open() is unavailable.")
        ...

Thoughts?

> My WIP progress patch is attached. It passes all asyncio tests.

I think we could also specifically look for race conditions with `./python -m test test_asyncio.test_subprocess -F -j4`, rather than relying on the tests passing with a single job. The other child watchers have been particularly infamous when it comes race conditions and resource leaks. 

I'd be glad to work on testing and opening a PR to asyncio if needed (while giving Benjamin credit for the patch of course).
msg356071 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2019-11-05 22:39
hasattr is useful for supporting old versions of the os module, but asyncio doesn't have to care about that. You should probably try calling pidfd_open and see whether you get -ESYSCALL, and also maybe try passing it to poll(), since I think there might have been a release where the syscall existed but it wasn't pollable. Not sure what the error for that looks like.
msg356079 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-06 00:05
pidfd_open was added after pidfd pollling, so it should suffice to make sure pidfd_open doesn't ENOSYS.
msg356087 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-06 03:21
New changeset 6c4c45efaeb40f4f837570f57d90a0b3429c6ae9 by Benjamin Peterson in branch 'master':
bpo-38692: Add os.pidfd_open. (GH-17063)
https://github.com/python/cpython/commit/6c4c45efaeb40f4f837570f57d90a0b3429c6ae9
msg356148 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-11-06 17:57
Would it be useful to use a pidfd in subprocess.Popen to fix bpo-38630 root issue, when pidfd is available?
msg356164 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-07 02:57
On Wed, Nov 6, 2019, at 09:57, STINNER Victor wrote:
> 
> STINNER Victor <vstinner@python.org> added the comment:
> 
> Would it be useful to use a pidfd in subprocess.Popen to fix bpo-38630 
> root issue, when pidfd is available?

Probably, but as noted above, we need to figure out the best way to integrate pidfds into subprocess. (Probably not in this issue.)
msg356235 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-11-08 10:12
I got a failure in newly added test_pidfd_open:

======================================================================
FAIL: test_pidfd_open (test.test_posix.PosixTester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/build/python-git/src/cpython/Lib/test/test_posix.py", line 1479, in test_pidfd_open
    self.assertEqual(cm.exception.errno, errno.EINVAL)
AssertionError: 1 != 22

----------------------------------------------------------------------

I'm running kernel 5.3.7-x86_64-linode130 with Arch Linux. At first I suspect that it's related to system call filters from systemd as tests are run in a systemd-nspawn container. However, there are still failures after patching systemd with https://github.com/systemd/systemd/commit/9e486265716963439fb0fd7f2a97abf109f24f75.

How about also skipping the test if pidfd_open returns EPERM?
msg356251 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-08 17:03
I think you must still be experiencing some sort of sandboxing. I don't know how else you would get an EPERM out of pidfd_open.
msg356286 - (view) Author: Kyle Stanley (aeros) * (Python triager) Date: 2019-11-09 06:06
> I got a failure in newly added test_pidfd_open:

> I'm running kernel 5.3.7-x86_64-linode130 with Arch Linux.

> I think you must still be experiencing some sort of sandboxing. I don't know how else you would get an EPERM out of pidfd_open.

I believe Benjamin is correct. On a native install of Arch Linux with kernel 5.3.7 (using latest updates from the official repos), I received no failures in test_posix. 

[aeros:~/repos/benjaminp-cpython]$ ./python -m test test_posix     (asyncio-pidfd) 
0:00:00 load avg: 1.86 Run tests sequentially
0:00:00 load avg: 1.86 [1/1] test_posix

== Tests result: SUCCESS ==

1 test OK.

Total duration: 544 ms
Tests result: SUCCESS

To confirm there weren't intermittent failures, I also ran test_posix indefinitely, sending SIGINT after ~2500 iterations. No failures occurred:

[aeros:~/repos/benjaminp-cpython]$ ./python -m test test_pty -F    (asyncio-pidfd) 
...
0:01:31 load avg: 1.57 [2506] test_pty
0:01:31 load avg: 1.57 [2507] test_pty
^C

== Tests result: INTERRUPTED ==
Test suite interrupted by signal SIGINT.

2506 tests OK.

Total duration: 1 min 31 sec
Tests result: INTERRUPTED

It seems that the issue is likely specific to Chih-Hsuan Yen's environment.
msg356287 - (view) Author: Kyle Stanley (aeros) * (Python triager) Date: 2019-11-09 06:09
> [aeros:~/repos/benjaminp-cpython]$ ./python -m test test_pty -F    (asyncio-pidfd) 
...
0:01:31 load avg: 1.57 [2506] test_pty
0:01:31 load avg: 1.57 [2507] test_pty

Oops, looks like I copied the wrong results of a separate test I was running earlier. I'll post the results of ~1000 iterations of test_posix below, once it is completed again.
msg356288 - (view) Author: Kyle Stanley (aeros) * (Python triager) Date: 2019-11-09 06:17
[aeros:~/repos/benjaminp-cpython]$ ./python -m test test_posix -F  (asyncio-pidfd) 
...
0:08:52 load avg: 1.89 [1008] test_posix
0:08:52 load avg: 2.22 [1009] test_posix
...

1008 tests OK.

Total duration: 8 min 52 sec
Tests result: INTERRUPTED
msg356289 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-11-09 06:31
It seems like systemd-nspawn is just breaking everything: https://sourceware.org/ml/libc-alpha/2019-11/msg00277.html
msg356292 - (view) Author: Chih-Hsuan Yen (yan12125) * Date: 2019-11-09 09:57
Benjamin Peterson, Kyle Stanley: Thank you both very much for intensive testing and helpful information! Yes on my Arch Linux box tests are also fine without systemd-nspawn. However, as I have to use systemd-nspawn for some reasons, and investigating how it blocks system calls is beyond my ability, I just add a workaround for now: https://aur.archlinux.org/cgit/aur.git/tree/python-git-issue38692.diff?h=python-git.
History
Date User Action Args
2019-11-09 09:57:44yan12125setmessages: + msg356292
2019-11-09 06:31:31benjamin.petersonsetmessages: + msg356289
2019-11-09 06:17:38aerossetmessages: + msg356288
2019-11-09 06:09:54aerossetmessages: + msg356287
2019-11-09 06:06:04aerossetmessages: + msg356286
2019-11-08 17:03:52benjamin.petersonsetmessages: + msg356251
2019-11-08 10:12:36yan12125setnosy: + yan12125
messages: + msg356235
2019-11-07 02:57:22benjamin.petersonsetmessages: + msg356164
2019-11-06 17:57:20vstinnersetmessages: + msg356148
2019-11-06 03:23:58benjamin.petersonsetpull_requests: + pull_request16578
2019-11-06 03:21:43benjamin.petersonsetmessages: + msg356087
2019-11-06 00:05:45benjamin.petersonsetmessages: + msg356079
2019-11-05 22:39:52njssetmessages: + msg356071
2019-11-05 22:07:22aerossetnosy: + aeros
messages: + msg356069
2019-11-05 22:02:10vstinnersetmessages: + msg356068
2019-11-05 17:35:00benjamin.petersonsetmessages: + msg356048
2019-11-05 17:31:32njssetmessages: + msg356047
2019-11-05 15:24:28benjamin.petersonsetstage: patch review
pull_requests: + pull_request16571
2019-11-05 11:11:18vstinnersetnosy: + nanjekyejoannah
2019-11-05 11:11:07vstinnersetmessages: + msg356026
2019-11-05 11:08:16vstinnersetnosy: + vstinner
messages: + msg356024
2019-11-05 10:36:51asvetlovsetnosy: + njs
messages: + msg356017
2019-11-05 10:35:29asvetlovsetmessages: + msg356016
2019-11-05 05:43:04benjamin.petersoncreate