classification
Title: Start the deprecation cycle for subprocess preexec_fn
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: diekhans, gregory.p.smith, izbyshev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2019-10-10 18:18 by gregory.p.smith, last changed 2021-10-01 18:59 by gregory.p.smith.

Pull Requests
URL Status Linked Edit
PR 23930 open gregory.p.smith, 2020-12-25 03:14
PR 23936 open gregory.p.smith, 2020-12-25 06:28
Messages (17)
msg354397 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-10-10 18:18
subprocess's preexec_fn feature needs to enter PendingDeprecationWarning state in 3.9, to become a DeprecationWarning in 3.10, with a goal of removing it in 3.11.

Rationale: We now live in a world full of threads. Per the POSIX specification, it is entirely unsafe to call back into the Python interpreter within the forked child before exec.

We've also already made preexec_fn no longer supported from CPython subinterpreters in 3.8.

If there are not already issues open for desired features of subprocess that do not yet have replacements or workarounds for *specific* actions that preexec_fn is being used for in your application, please open feature requests for those.  (ex: calling umask is https://bugs.python.org/issue38417, and group, uid, gid setting has already landed in 3.9)
msg354404 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-10 20:44
What is the recommended way to replace preexec_fn?
msg354407 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-10-10 21:31
With task-specific arguments: cwd, start_new_session, group, extra_groups,
user, etc.

We cannot provide a general do everything replacement and should not try.
It is not possible.
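
As an illustration of the task-specific arguments named above, here is a minimal sketch (POSIX only) that uses cwd= and start_new_session= where older code might have called os.chdir() and os.setsid() from preexec_fn. The helper name run_in_dir_with_new_session and the CHILD program are made up for this example and are not part of subprocess.

```python
import os
import subprocess
import sys
import tempfile

# Child program: report its working directory and whether it leads its own session.
CHILD = "import os; print(os.getcwd()); print(os.getsid(0) == os.getpid())"


def run_in_dir_with_new_session(workdir):
    """Run the child using cwd= and start_new_session= instead of preexec_fn."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        cwd=workdir,             # instead of os.chdir() inside preexec_fn
        start_new_session=True,  # instead of os.setsid() inside preexec_fn
        capture_output=True,
        text=True,
        check=True,
    )
    child_cwd, is_session_leader = result.stdout.splitlines()
    return child_cwd, is_session_leader == "True"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_in_dir_with_new_session(tmp))
```

Both settings are applied by subprocess itself between fork and exec, so no Python code from the caller runs in the forked child.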
msg354439 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-11 11:00
> We cannot provide a general do everything replacement and should not try. It is not possible.

Well, I proposed a solution at:
https://bugs.python.org/issue38417#msg354242

I know that this solution has multiple flaws, but a bad solution may be better than no solution: breaking applications when upgrading to Python 3.11.
msg383708 - (view) Author: Mark Diekhans (diekhans) Date: 2020-12-24 23:53
Calling setpgid() is a common post-fork task that would need to be an explicit parameter to Popen when preexec_fn goes away.
msg383720 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 05:27
PR up to add setpgid support.  From what I've come across, some setpgid() users can use setsid() already available via start_new_session= instead.  But rather than getting into the differences between those, making both available makes sense to allow for anyone's case where setsid() isn't desired.
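
A hedged sketch of what the two options look like from the caller's side. It assumes the setpgid support is exposed as a process_group= keyword on Python 3.11 and later (the name is an assumption as far as this thread is concerned) and falls back to start_new_session= on older versions. The helper name spawn_in_own_process_group is hypothetical.

```python
import subprocess
import sys

# Child program: report whether it leads its process group and its session.
CHILD = (
    "import os; "
    "print(os.getpgrp() == os.getpid()); "
    "print(os.getsid(0) == os.getpid())"
)


def spawn_in_own_process_group():
    """Put the child in its own process group without using preexec_fn."""
    if sys.version_info >= (3, 11):
        # Assumed keyword: process_group=0 asks for setpgid(0, 0) in the child.
        kwargs = {"process_group": 0}
    else:
        # Older versions: setsid() also makes the child a group leader,
        # but additionally moves it into a new session.
        kwargs = {"start_new_session": True}
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
        **kwargs,
    )
    leads_group, leads_session = (
        line == "True" for line in result.stdout.splitlines()
    )
    return leads_group, leads_session


if __name__ == "__main__":
    print(spawn_in_own_process_group())
```

The second value returned shows the difference between the two calls: with setpgid() the child stays in the parent's session, with setsid() it does not.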
msg383724 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 05:55
https://bugs.python.org/issue42736 filed to track adding Linux prctl() support.
msg383725 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 05:59
Another preexec_fn use to consider:

 resource.setrlimit(resource.RLIMIT_CORE, (XXX, XXX))

Using an intermediate shell script wrapper that changes the rlimit and exec's the actual process is also an alternative.
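
A minimal sketch of that wrapper idea, assuming /bin/sh exists and that its ulimit builtin accepts -c (dash and bash do; strict POSIX only guarantees -f). The helper name run_without_core_dumps and the limit value 0 are illustrative choices, not something from this issue.

```python
import subprocess
import sys

# Child program: print the soft RLIMIT_CORE it was started with.
CHILD = "import resource; print(resource.getrlimit(resource.RLIMIT_CORE)[0])"


def run_without_core_dumps(cmd):
    """Run cmd through a tiny sh wrapper that lowers RLIMIT_CORE, then execs it."""
    # "$@" expands to cmd; the literal "sh" fills $0 for the inline script.
    wrapper = ["/bin/sh", "-c", 'ulimit -c 0 && exec "$@"', "sh", *cmd]
    return subprocess.run(wrapper, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    print(run_without_core_dumps([sys.executable, "-c", CHILD]).stdout.strip())
```

Because the shell execs the real program, the wrapper does not stay around as an extra process.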
msg383726 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 06:00
I'm also seeing a lot of os.setpgrp() calls, though those are more likely able to use start_new_session to do setsid() as a drop-in replacement.
msg383727 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 06:05
signal.signal use case:

Calls to signal.signal(x, y) are sometimes used to set a non-SIG_DFL behavior on a signal.  SIGINT -> SIG_IGN for example.

I see a lot of legacy-looking code calling signal.signal in preexec_fn that appears to set SIG_DFL for the signals that Python otherwise modifies, which restore_signals=True should already be doing.
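
For the SIGINT -> SIG_IGN case, one possible workaround without preexec_fn is sketched below. It is not proposed anywhere in this issue; it relies on the POSIX rule that an ignored signal disposition is inherited across fork and exec, and on restore_signals not touching SIGINT. The helper name run_ignoring_sigint is hypothetical. It must be called from the main thread, and the parent itself ignores SIGINT for the short window around the spawn.

```python
import signal
import subprocess
import sys

# Child program: report whether SIGINT was already ignored when it started.
CHILD = "import signal; print(signal.getsignal(signal.SIGINT) == signal.SIG_IGN)"


def run_ignoring_sigint(cmd):
    """Start cmd with SIGINT ignored by briefly ignoring it in the parent."""
    previous = signal.signal(signal.SIGINT, signal.SIG_IGN)
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    finally:
        # previous is None only if the old handler was not installed from
        # Python; in that case there is nothing we can restore here.
        if previous is not None:
            signal.signal(signal.SIGINT, previous)
    out, _ = proc.communicate()
    return proc.returncode, out


if __name__ == "__main__":
    print(run_ignoring_sigint([sys.executable, "-c", CHILD]))
```

The trade-off is the brief window in which the parent would also ignore Ctrl-C, so this only suits callers that can tolerate that.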
msg383728 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 06:31
Doing the code inspection of existing preexec_fn= users within our codebase at work is revealing.  But those seem to be the bulk of uses.

I expect this deprecation to take a while.  Ex: if we mark it as PendingDeprecationWarning in 3.10, I'd still wait until _at least_ 3.13 to remove it.

Code using it often has a long legacy and may be written to run on a wide variety of Python versions.  It's painful to do so when features you need in order to stop using it are still only in modern versions.
msg383736 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-25 11:17
> Using an intermediate shell script wrapper that changes the rlimit and exec's the actual process is also an alternative.

IMO using Python is more portable than relying on a shell.

I dislike relying on a shell since shells are not really portable (they behave differently), unless you restrict yourself to a strict POSIX subset of the shell programming language. While '/bin/sh' is available on most Unix systems, Android uses '/system/bin/sh', and Windows and VxWorks have no shell (Windows provides cmd.exe which uses the Batch programming language, and there are other scripting languages like VBS or PowerShell: so you need a completely different implementation for Windows).

For the oslo.concurrency project, I wrote the Python script prlimit.py: a wrapper that calls resource.setrlimit() and then executes a new program. It's similar to the Unix prlimit program, but written in Python to be portable (the "prlimit" program is not available on all platforms).

https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/prlimit.py

I suggest not providing a builtin wrapper to replace preexec_fn, but instead suggesting replacements in the subprocess and What's New in Python 3.11 documentation (explaining how to port existing code).

More generally, the whole preexec_fn feature could be reimplemented in a third-party project by spawning a *new* Python process, running the Python code, and *then* executing the final process. The main feature of preexec_fn is to give the ability to run a function of the current process, whereas what I'm discussing would be code written as a string.
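
A rough sketch of that idea, not the prlimit.py script itself: the setup code is passed as a string to a fresh interpreter, which applies it and then replaces itself with the target program via os.execvp(). The names WRAPPER and run_via_python_wrapper are made up for this example, and it assumes sys.executable points at a runnable interpreter.

```python
import subprocess
import sys

# Code run by the intermediate interpreter: lower the soft RLIMIT_CORE to 0,
# keep the hard limit, then exec the real program given on the command line.
WRAPPER = (
    "import os, resource, sys\n"
    "_, hard = resource.getrlimit(resource.RLIMIT_CORE)\n"
    "resource.setrlimit(resource.RLIMIT_CORE, (0, hard))\n"
    "os.execvp(sys.argv[1], sys.argv[1:])\n"
)

# Final program: print the soft RLIMIT_CORE it inherited.
TARGET = "import resource; print(resource.getrlimit(resource.RLIMIT_CORE)[0])"


def run_via_python_wrapper(cmd):
    """Run cmd through an intermediate Python process that does the setup."""
    # With "python -c CODE a b", the child sees sys.argv == ['-c', 'a', 'b'].
    return subprocess.run(
        [sys.executable, "-c", WRAPPER, *cmd],
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    print(run_via_python_wrapper([sys.executable, "-c", TARGET]).stdout.strip())
```

The setup code runs in a new single-threaded process rather than in a forked copy of the parent, at the cost of one extra interpreter startup.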

--

preexec_fn can be used for non-trivial issues like only sharing some Windows HANDLE, see:
https://www.python.org/dev/peps/pep-0446/#only-inherit-some-handles-on-windows

Note: This specific problem has been solved the proper way in Python by adding support for PROC_THREAD_ATTRIBUTE_HANDLE_LIST in subprocess.STARTUPINFO: lpAttributeList['handle_list'] parameter.
msg383740 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-25 11:53
I just created bpo-42738: "subprocess: don't close all file descriptors by default (close_fds=False)".
msg383863 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-27 20:42
> "using Python is more portable than relying on a shell."

Not in environments I use. :)  There isn't an installed Python interpreter that can be executed when Python is deployed as an embedded interpreter, as with pyoxidizer or similar tools.  Plus "using python" means adding a Python startup time delay to anything that triggered such an action.  That added latency isn't acceptable in some situations.

When I suggest a workaround for something as involving an intermediate shell script, read that to mean "the user needs an intermediate program to do this complicated work for them - how is up to them - we aren't going to provide it from the stdlib".  A shell script is merely one easy pretty-fast solution - in environments where that is possible.

TL;DR - there's no one size fits all solution here.  But third party libraries could indeed implement any/all of these options including abstracting how and what gets used when if someone wanted to do that.
msg383988 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-29 12:22
Would it not be more consistent with other parameters to name the new parameter "pgid" or "process_group"?

And why not use None as default, like for user and group?
msg400498 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-08-28 19:34
A worthwhile general suggestion on a new path forward for the mess of things between (v)fork+exec from Victor is over in https://bugs.python.org/issue42736#msg383869

TL;DR creating a subprocess.Preexec() recording object with specific interfaces for things that can be done, an instance of which gets passed in and the recorded actions are done as appropriate.
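
To make the recording-object idea concrete, here is a purely hypothetical sketch. No subprocess.Preexec() exists in this form; the class name RecordedPreexec and its methods are invented for illustration, and this version only maps recorded actions onto keyword arguments that Popen already accepts.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass, field


@dataclass
class RecordedPreexec:
    """Hypothetical recorder: stores requested child setup steps as data."""

    _kwargs: dict = field(default_factory=dict)

    def setsid(self):
        # Recorded, not executed: translated to start_new_session=True later.
        self._kwargs["start_new_session"] = True
        return self

    def chdir(self, path):
        # Recorded, not executed: translated to cwd=path later.
        self._kwargs["cwd"] = path
        return self

    def popen_kwargs(self):
        """Return the recorded actions as keyword arguments for Popen/run."""
        return dict(self._kwargs)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        actions = RecordedPreexec().setsid().chdir(tmp)
        child = "import os; print(os.getsid(0) == os.getpid())"
        result = subprocess.run(
            [sys.executable, "-c", child],
            capture_output=True,
            text=True,
            check=True,
            **actions.popen_kwargs(),
        )
        print(result.stdout.strip())
```

The point of recording instead of calling is that no user Python code has to run in the forked child; the library decides how and when each recorded action is applied.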
msg403026 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-10-01 18:59
Another use case someone had for preexec_fn popped up today:

 prctl(PR_SET_PDEATHSIG, SIGTERM)
History
Date User Action Args
2021-10-01 18:59:39 gregory.p.smith set messages: + msg403026
2021-08-28 19:34:41 gregory.p.smith set messages: + msg400498
versions: + Python 3.11, - Python 3.10
2020-12-29 12:22:31 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg383988
2020-12-27 20:42:26 gregory.p.smith set messages: + msg383863
2020-12-26 09:53:42 izbyshev set nosy: + izbyshev
2020-12-25 11:53:22 vstinner set messages: + msg383740
2020-12-25 11:17:25 vstinner set messages: + msg383736
2020-12-25 06:31:32 gregory.p.smith set messages: + msg383728
components: + Library (Lib)
2020-12-25 06:28:06 gregory.p.smith set pull_requests: + pull_request22786
2020-12-25 06:05:25 gregory.p.smith set messages: + msg383727
2020-12-25 06:00:26 gregory.p.smith set messages: + msg383726
2020-12-25 05:59:19 gregory.p.smith set messages: + msg383725
2020-12-25 05:55:34 gregory.p.smith set messages: + msg383724
2020-12-25 05:27:12 gregory.p.smith set messages: + msg383720
2020-12-25 04:52:04 gregory.p.smith set versions: + Python 3.10, - Python 3.9
2020-12-25 03:14:56 gregory.p.smith set keywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request22780
2020-12-24 23:53:41 diekhans set nosy: + diekhans
messages: + msg383708
2019-10-11 11:00:31 vstinner set messages: + msg354439
2019-10-10 21:31:38 gregory.p.smith set messages: + msg354407
2019-10-10 20:44:47 vstinner set nosy: + vstinner
messages: + msg354404
2019-10-10 18:18:17 gregory.p.smith create