classification
Title: FreeBSD: Optimize subprocess.Popen(close_fds=True) using closefrom()
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: koobs, vstinner
Priority: normal Keywords: patch

Created on 2019-09-09 07:25 by vstinner, last changed 2020-04-24 10:14 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 19696 merged vstinner, 2020-04-23 23:27
PR 19697 merged vstinner, 2020-04-23 23:59
Messages (18)
msg351351 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 07:25
The default value of subprocess.Popen "close_fds" parameter changed to True in Python 3. Compared to Python 2, close_fds=True can make Popen 10x slower: see bpo-37790.

A single close(fd) syscall is cheap. But when MAXFDS (maximum file descriptor number) is high, the loop calling close(fd) on each file descriptor can take several milliseconds.

On FreeBSD, the _posixsubprocess could use closefrom() to reduce the number of syscalls.

On Linux, _posixsubprocess lists the content of /proc/self/fd/ to only close open file descriptor, after fork() and before exec().
msg351353 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 07:28
A workaround is to use close_fds=False which is only safe if all code used in Python implements the PEP 446.

This PEP mentions the FreeBSD issue, mentioning bpo-11284 with MAXFD=655,000:

"The operation can be slow if MAXFD is large. For example, on a FreeBSD buildbot with MAXFD=655,000, the operation took 300 ms: see issue #11284: slow close file descriptors."

Note: the subprocess module is now able to use posix_spawn() syscall, but it doesn't use it if close_fds=True, so it's outside the scope of this issue.
msg351355 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 07:31
See also bpo-13788: "os.closerange optimization".
msg351356 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 07:41
> On Linux, _posixsubprocess lists the content of /proc/self/fd/ to only close open file descriptor, after fork() and before exec().

This code is specific to Linux because it uses the SYS_getdents64 syscall.

FreeBSD has a similar concept using /dev/fd "file-descriptor file system". See fdescfs manual page:
https://www.freebsd.org/cgi/man.cgi?query=fdescfs&sektion=5

I'm not sure how it is supposed to work. When I open a file in Python, I don't see its file descriptor in /dev/fd:

vstinner@freebsd$ python3
Python 3.6.9 (default, Aug  8 2019, 01:16:42) 
>>> import os
>>> os.listdir("/dev/fd")
['0', '1', '2']
>>> f=open("/etc/motd")
>>> os.listdir("/dev/fd")
['0', '1', '2']
>>> f.fileno()
3

>>> os.set_inheritable(f.fileno(), True)
>>> os.listdir("/dev/fd")
['0', '1', '2']

>>> import sys, subprocess
>>> rc=subprocess.call([sys.executable, "-c", "import os; print(os.listdir('/dev/fd')); print(os.fstat(3))"], close_fds=False)
['0', '1', '2']
os.stat_result(st_mode=33188, st_ino=321134, st_dev=83, st_nlink=1, st_uid=0, st_gid=0, st_size=899, st_atime=1568014491, st_mtime=1566390614, st_ctime=1566390614)


The file descriptor 3 is not listed in /dev/fd/. It is inherited: fstat(3) in a child process doesn't fail and it's not listed in /dev/fd/ in the child process.
msg351357 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 08:16
Oh, _posixsubprocess uses /dev/fd/ on FreeBSD, but only if it's mounted file system (expected to be fdescfs filesystem):

#if defined(__FreeBSD__) || (defined(__APPLE__) && defined(__MACH__))
# define FD_DIR "/dev/fd"
#else
# define FD_DIR "/proc/self/fd"
#endif

#if defined(__FreeBSD__)
/* When /dev/fd isn't mounted it is often a static directory populated
 * with 0 1 2 or entries for 0 .. 63 on FreeBSD, NetBSD and OpenBSD.
 * NetBSD and OpenBSD have a /proc fs available (though not necessarily
 * mounted) and do not have fdescfs for /dev/fd.  MacOS X has a devfs
 * that properly supports /dev/fd.
 */
static int
_is_fdescfs_mounted_on_dev_fd(void)
{
    struct stat dev_stat;
    struct stat dev_fd_stat;
    if (stat("/dev", &dev_stat) != 0)
        return 0;
    if (stat(FD_DIR, &dev_fd_stat) != 0)
        return 0;
    if (dev_stat.st_dev == dev_fd_stat.st_dev)
        return 0;  /* / == /dev == /dev/fd means it is static. #fail */
    return 1;
}
#endif


On my FreeBSD 12 VM, MAXFD is around 230k which is quite large.

vstinner@freebsd$ uname -a
FreeBSD freebsd 12.0-RELEASE-p10 FreeBSD 12.0-RELEASE-p10 GENERIC  amd64

vstinner@freebsd$ ./python 
Python 3.9.0a0 (heads/master:4db25d5c39, Sep  9 2019, 07:52:01) 
>>> import os; os.sysconf("SC_OPEN_MAX")
229284

It's easy to measure the overhead of 230k close() syscalls:

vstinner@freebsd$ python3 -m pyperf timeit -s 'import subprocess; args=["/usr/bin/true"]' 'subprocess.Popen(args, close_fds=False).wait()' 
.....................
Mean +- std dev: 1.22 ms +- 0.12 ms
vstinner@freebsd$ python3 -m pyperf timeit -s 'import subprocess; args=["/usr/bin/true"]' 'subprocess.Popen(args, close_fds=True).wait()' 
.....................
Mean +- std dev: 53.3 ms +- 1.4 ms

The overhead is 52.08 ms per Popen().wait() call (with close_fds=True).



If I mount manually the fdescfs filesystem, suddenly, subprocess is efficient again:

vstinner@freebsd$ sudo mount -t fdescfs  /dev/fd 
vstinner@freebsd$ python3 -m pyperf timeit -s 'import subprocess; args=["/usr/bin/true"]' 'subprocess.Popen(args, close_fds=True).wait()' 
.....................
Mean +- std dev: 1.20 ms +- 0.06 ms

close_fds=True becomes as efficient as close_fds=False.
msg351358 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 08:33
> FreeBSD has a similar concept using /dev/fd "file-descriptor file system". (...) I'm not sure how it is supposed to work.

Sadly, on my FreeBSD VM, it seems like /dev/fd/ is not mounted with fdescfs by default, but as a regular directory with 3 hardcoded files 0, 1 and 2 which are character devices.

I had to mount fdescfs filesystem manually at /dev/fd/ :-(

$ cat /etc/fstab 
# Custom /etc/fstab for FreeBSD VM images
/dev/gpt/rootfs   /       ufs     rw      1       1
/dev/gpt/swapfs  none    swap    sw      0       0

Maybe it's an issue with "FreeBSD VM images" that I chose.

--

The FreeBSD CURRENT buildbot worker mounts /dev/fd:

CURRENT-amd64% cat /etc/fstab
# Device	Mountpoint	FStype	Options	Dump	Pass#
/dev/da0p2	none		swap	sw	0	0
/dev/da0p3	/		ufs	rw	1	1
fdescfs		/dev/fd		fdescfs	rw	0	0

CURRENT-amd64% mount|grep fd
fdescfs on /dev/fd (fdescfs)
msg351367 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2019-09-09 09:02
We're tracking this in our downstream bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700

There's a patch there you may base something upstream on

Open question:

closefrom(2) test in ./configure && ifdef __FreeBSD__ ?
msg351369 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2019-09-09 09:07
@Victor I mounted fdescfs on the buildbot workers to make the tests run faster. You're correct that it's not enabled by default.

If we could look at the initial support being as simple as possible, we can look to backport this to 2.7 as well
msg351383 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 09:40
> @Victor I mounted fdescfs on the buildbot workers to make the tests run faster. You're correct that it's not enabled by default.

Would it be possible to modify FreeBSD to enable it by default? Or is there a reason to not enable it by default?

> We're tracking this in our downstream bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700 There's a patch there you may base something upstream on

Which patch? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700#c3 or something else?


> Open question: closefrom(2) test in ./configure && ifdef __FreeBSD__ ?

No need to make closefrom test conditional depending on the platform. We have plently of tests which are platform specific but run on all platforms.

It's common that a function is first available on a single platform, but then available on more platforms.

It seems like closefrom() is available on:

* FreeBSD
* NetBSD: https://netbsd.gw.com/cgi-bin/man-cgi?closefrom++NetBSD-current
* OpenBSD: https://man.openbsd.org/closefrom.2
* Solaris: https://docs.oracle.com/cd/E23823_01/html/816-5168/closefrom-3c.html
msg351457 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2019-09-09 13:26
> Would it be possible to modify FreeBSD to enable it by default? Or is there a reason to not enable it by default?

That's very unlikely to happen. I believe it was added as an opt-in feature for some linux compatibility situations. The correct solution is to use closefrom(2), as it is the optimal current solution, and in our case, safe to use.

> Which patch? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700#c3 or something else?

Yep, that one.

> It seems like closefrom() is available on:

Per #8052 there was some concerns about closefrom(2) not being async-signal safe. [1] 

I can't speak to other implementations, but this is what I requested clarity from our FreeBSD developers on, and confirmed that our implementation is indeed safe. That is the reason why I thought scoping closefrom(2) to __FreeBSD__ may be warranted, just like like the linux specific bits in https://hg.python.org/cpython/rev/61aa484a3e54 from #8052

But I'll leave the decision as to how its implemented (configure checkls, ifdef'd or not) in your capable hands.

Summary: FreeBSD's closefrom(2) is safe to anywhere in Python where it needs to close a range of fd's, including the subprocess module.

[1] https://bugs.python.org/issue8052#msg132307
msg351461 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-09 13:36
> That's very unlikely to happen. I believe it was added as an opt-in feature for some linux compatibility situations. The correct solution is to use closefrom(2), as it is the optimal current solution, and in our case, safe to use.

If you have like 50k open file descriptors and want to pass fd 50000 to a child process using Popen(pass_fds={50000}), you will have to call close() 49 997 times: close(3)..close(49999), then closefrom(50001).

For the pass_fds case, being able to enumerate open file descriptors (ex: fdescfs) is more efficient than closefrom().

Well, the best would be a system call "close all file descriptors except of <list of fds>", but it's not available yet. Or maybe at least having a system like close_range(a, b).

In May 2019, the close_range() syscall was in the Linux kernel !
https://lwn.net/Articles/789023/
... but it seems like the feature was not accepted yet.

I'm disappointed that posix_spawn() doesn't implement a feature like Python subprocess close_fds=True.
msg351465 - (view) Author: Kubilay Kocak (koobs) (Python triager) Date: 2019-09-09 13:42
On "close_except", close_range and posix_spawn(close_fds=True), I'll talk to my interested FreeBSD colleagues about potential inroads in that regard.

In the meantime, we're good on closefrom(2) anywhere it can be used in Python
msg361750 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-02-10 23:41
FreeBSD change has been merged:
"lang/python{27,35,36,37,38}: Add closefrom(2) support"
https://svnweb.freebsd.org/ports?view=revision&revision=518640
msg367161 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-24 00:14
Links to the FreeBSD downstream patches:

* https://reviews.freebsd.org/rP518640
* https://svnweb.freebsd.org/ports?view=revision&revision=518640
* https://svnweb.freebsd.org/ports/head/lang/python38/files/
* https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242274
* "Patches based on emaste/cem contributions in PR 221700." wrote Kyle Evans in the bugzilla issue
* "cem" is Conrad Meyer: https://reviews.freebsd.org/p/cem/
* "emaste" is Ed Maste: https://reviews.freebsd.org/p/emaste/
msg367163 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-24 00:34
Copy of the very interesting https://svnweb.freebsd.org/ports?view=revision&revision=518640 commit message by koobs:


Log Message: 	

lang/python{27,35,36,37,38}: Add closefrom(2) support

A single close(fd) syscall is cheap, but when MAXFDS (maximum file
descriptor number) is high, the loop calling close(fd) on each file
descriptor can take several milliseconds.

The default value of subprocess.Popen "close_fds" parameter changed to True
in Python 3. Compared to Python 2, close_fds=True can make Popen 10x
slower: see bpo-37790 [1]

The present workaround on FreeBSD to improve performance is to load and
mount the fdescfs kernel module, but this is not enabled by default.

This change adds minimum viable (and upstreamable) closefrom(2) syscall
support to Python's subprocess and posix modules, improving performance
significantly for loads that involve working with many processes, such as
diffoscope, ansible, and many others.

For additional optimizations, upstream recently (3.8) landed posix_spawn(2)
support [3] and has stated that they will adopt close_range(2) after Linux
merges it [4]. Linux/FreeBSD developers are already collaborating on
ensuring compatible implementations, with FreeBSD's implementation pending
in D21627. [5]

Thank you emaste, cem, kevans for providing analysis, input,
clarifications, comms/upstream support and patches.

[1] https://bugs.python.org/issue37790
[2] https://bugs.python.org/issue38061
[3] https://bugs.python.org/issue35537
[4] https://lwn.net/Articles/789023/
[5] https://reviews.freebsd.org/D21627

Additional References:

https://bugs.python.org/issue8052
https://bugs.python.org/issue11284
https://bugs.python.org/issue13788
https://bugs.python.org/issue1663329
https://www.python.org/dev/peps/pep-0446/

PR:		242274, 221700
Submitted by:	kevans (emaste, cem)
Approved by:	koobs (python (maintainer), santa)
msg367183 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-24 10:00
New changeset 162c567d164b5742c0d1f3f8bd8c8bab9c117cd0 by Victor Stinner in branch 'master':
bpo-38061: os.closerange() uses closefrom() on FreeBSD (GH-19696)
https://github.com/python/cpython/commit/162c567d164b5742c0d1f3f8bd8c8bab9c117cd0
msg367184 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-24 10:07
New changeset e6f8abd500751a834b6fff4f107ecbd29f2184fe by Victor Stinner in branch 'master':
bpo-38061: subprocess uses closefrom() on FreeBSD (GH-19697)
https://github.com/python/cpython/commit/e6f8abd500751a834b6fff4f107ecbd29f2184fe
msg367186 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-04-24 10:14
Thanks Ed Maste, Conrad Meyer, Kyle Evans and Kubilay Kocak!

I ran a benchmark:

vstinner@freebsd$ env/bin/python -m pyperf command -v -- ./python -c 'import subprocess, sys; args=[sys.executable, "-sS", "-c", "pass"]; subprocess.run(args)'

The optimization made subprocess almost 4x faster on FreeBSD!

vstinner@freebsd$ env/bin/python -m pyperf compare_to ref.json closefrom.json 
Mean +- std dev: [ref] 176 ms +- 1 ms -> [closefrom] 46.6 ms +- 1.2 ms: 3.77x faster (-74%)

129 ms were wasted in calling 229 284 times close(fd)!
History
Date User Action Args
2020-04-24 10:14:59vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg367186

stage: patch review -> resolved
2020-04-24 10:07:02vstinnersetmessages: + msg367184
2020-04-24 10:00:55vstinnersetmessages: + msg367183
2020-04-24 00:34:00vstinnersetmessages: + msg367163
2020-04-24 00:14:28vstinnersetmessages: + msg367161
2020-04-23 23:59:36vstinnersetpull_requests: + pull_request19017
2020-04-23 23:27:51vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request19016
2020-02-10 23:41:37vstinnersetmessages: + msg361750
2019-09-09 13:42:44koobssetmessages: + msg351465
2019-09-09 13:36:56vstinnersetmessages: + msg351461
2019-09-09 13:26:37koobssetmessages: + msg351457
2019-09-09 09:40:43vstinnersetmessages: + msg351383
2019-09-09 09:07:35koobssetmessages: + msg351369
2019-09-09 09:02:21koobssetmessages: + msg351367
2019-09-09 08:33:37vstinnersetmessages: + msg351358
2019-09-09 08:16:13vstinnersetmessages: + msg351357
2019-09-09 07:41:13vstinnersetmessages: + msg351356
2019-09-09 07:31:58vstinnersetmessages: + msg351355
2019-09-09 07:29:00vstinnersetnosy: + koobs
2019-09-09 07:28:51vstinnersetmessages: + msg351353
2019-09-09 07:25:36vstinnercreate