Title: Start the deprecation cycle for subprocess preexec_fn
Created on 2019-10-10 18:18 by gregory.p.smith, last changed 2022-04-11 14:59 by admin.

Messages (18)
msg354397 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-10-10 18:18
subprocess's preexec_fn feature needs to enter PendingDeprecationWarning state in 3.9, to become a DeprecationWarning in 3.10, with a goal of removing it in 3.11.

Rationale: We now live in a world full of threads, and per the POSIX specification it is entirely unsafe to call back into the Python interpreter within the forked child before exec.

We've also already made preexec_fn no longer supported from CPython subinterpreters in 3.8.

If there are not already issues open for desired features of subprocess that do not yet have replacements or workarounds for *specific* actions that preexec_fn is being used for in your application, please open feature requests for those.  (ex: calling umask is, and group, uid, gid setting has already landed in 3.9)
msg354404 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-10 20:44
What is the recommended way to replace preexec_fn?
msg354407 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2019-10-10 21:31
With task-specific arguments: cwd, start_new_session, group, extra_groups,
user, etc.

We cannot provide a general do everything replacement and should not try.
It is not possible.
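
A minimal sketch of what "task-specific arguments" can look like in practice, assuming a POSIX system. The helper name run_with_task_specific_args and the inline child program are hypothetical and exist only for illustration; cwd= and start_new_session= are the existing subprocess parameters mentioned above.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical child program: reports its working directory and whether
# it is the leader of its own session.
CHILD_CODE = (
    "import os; "
    "print(os.getcwd()); "
    "print(os.getsid(0) == os.getpid())"
)


def run_with_task_specific_args(workdir):
    """Run a child using cwd= and start_new_session= instead of preexec_fn."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        cwd=workdir,             # instead of os.chdir() in a preexec_fn
        start_new_session=True,  # instead of os.setsid() in a preexec_fn
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    child_cwd, is_leader = result.stdout.splitlines()[:2]
    return child_cwd, is_leader == "True"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_with_task_specific_args(tmp))
```

No Python code runs between fork and exec here; the work is done by the C implementation behind Popen.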
msg354439 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-10-11 11:00
> We cannot provide a general do everything replacement and should not try. It is not possible.

Well, I proposed a solution at:

I know that this solution has multiple flaws, but a bad solution may be better than no solution: breaking applications when upgrading to Python 3.11.
msg383708 - (view) Author: Mark Diekhans (diekhans) Date: 2020-12-24 23:53
calling setpgid() is a common post-fork task that would need to be an explicit parameter to Popen when preexec_fn goes away
msg383720 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 05:27
PR up to add setpgid support.  From what I've come across, some setpgid() users can use setsid() already available via start_new_session= instead.  But rather than getting into the differences between those, making both available makes sense to allow for anyone's case where setsid() isn't desired.
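
A hedged sketch of how calling code might use such a parameter once it exists. It assumes the setpgid support is exposed as a process_group= keyword from Python 3.11 on, and falls back to start_new_session= on older versions (which calls setsid() and therefore also makes the child a process group leader, but with the session side effect noted above). The helper names are hypothetical.

```python
import subprocess
import sys

# Hypothetical child program: reports whether it leads its own process group.
CHILD_CODE = "import os; print(os.getpgrp() == os.getpid())"


def group_leader_kwargs():
    """Pick Popen keyword arguments that put the child in its own group."""
    if sys.version_info >= (3, 11):
        # Assumption: process_group=0 makes the child call setpgid(0, 0).
        return {"process_group": 0}
    # Fallback: setsid() also creates a new process group (and a new session).
    return {"start_new_session": True}


def child_leads_own_group():
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
        **group_leader_kwargs(),
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print(child_leads_own_group())
```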
msg383724 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 05:55
filed to track adding Linux prctl() support.
msg383725 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 05:59
Another preexec_fn use to consider:

 resource.setrlimit(resource.RLIMIT_CORE, (XXX, XXX))

Using an intermediate shell script wrapper that changes the rlimit and exec's the actual process is also an alternative.
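
A rough sketch of the shell-wrapper alternative, assuming a /bin/sh whose ulimit builtin supports -c (dash and bash do; POSIX itself only requires -f). The WRAPPER string and run_without_core_dumps helper are hypothetical names for illustration.

```python
import subprocess
import sys

# The wrapper lowers the core file size limit, then replaces itself with
# the real program, so no extra process stays around.
WRAPPER = 'ulimit -c 0; exec "$@"'


def run_without_core_dumps(argv):
    """Run argv through an intermediate /bin/sh that sets RLIMIT_CORE to 0."""
    return subprocess.run(
        # The "sh" after WRAPPER becomes $0; argv becomes "$@".
        ["/bin/sh", "-c", WRAPPER, "sh", *argv],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    show_limit = (
        "import resource; "
        "print(resource.getrlimit(resource.RLIMIT_CORE)[0])"
    )
    print(run_without_core_dumps([sys.executable, "-c", show_limit]).stdout)
```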
msg383726 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 06:00
I'm also seeing a lot of os.setpgrp() calls, though those are more likely able to use start_new_session to do setsid() as a dropin replacement.
msg383727 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 06:05
signal.signal use case:

Calls to signal.signal(x, y) are sometimes used to set a non-SIG_DFL behavior on a signal.  SIGINT -> SIG_IGN for example.

I see a lot of legacy-looking code calling signal.signal in preexec_fn that appears to set SIG_DFL for the signals that Python otherwise modifies, which restore_signals=True should already be doing.
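
For the SIGINT -> SIG_IGN case, one possible preexec_fn-free approach is an intermediate program that ignores the signal and then execs the real command, since ignored dispositions survive exec. This is only a sketch under assumptions: it relies on a POSIX /bin/sh being available, and the IGNORE_SIGINT_WRAPPER string and run_ignoring_sigint helper are hypothetical.

```python
import subprocess
import sys

# trap '' INT sets SIGINT to "ignore"; the disposition is inherited by
# whatever program the shell then execs.
IGNORE_SIGINT_WRAPPER = "trap '' INT; exec \"$@\""


def run_ignoring_sigint(argv):
    """Run argv with SIGINT ignored, without using preexec_fn."""
    return subprocess.run(
        ["/bin/sh", "-c", IGNORE_SIGINT_WRAPPER, "sh", *argv],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # A Python child leaves SIGINT ignored if it was ignored at startup.
    check = (
        "import signal; "
        "print(signal.getsignal(signal.SIGINT) == signal.SIG_IGN)"
    )
    print(run_ignoring_sigint([sys.executable, "-c", check]).stdout)
```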
msg383728 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-25 06:31
Doing the code inspection of existing preexec_fn= users within our codebase at work is revealing.  But those seem to be the bulk of uses.

I expect this deprecation to take a while.  Ex: if we mark it as PendingDeprecationWarning in 3.10, I'd still wait until _at least_ 3.13 to remove it.

Code using it often has a long legacy and may be written to run on a wide variety of Python versions.  It's painful to do so when features you need in order to stop using it are still only in modern versions.
msg383736 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-25 11:17
> Using an intermediate shell script wrapper that changes the rlimit and exec's the actual process is also an alternative.

IMO using Python is more portable than relying on a shell.

I dislike relying on a shell since shells are not really portable (they behave differently), unless you restrict yourself to a strict POSIX subset of the shell programming language. While '/bin/sh' is available on most Unix systems, Android uses '/system/bin/sh', and Windows and VxWorks have no shell (Windows provides cmd.exe which uses the Batch programming language, and there are other scripting languages like VBS or PowerShell, so you need a completely different implementation for Windows).

For the oslo.concurrency project, I wrote a Python wrapper script that calls resource.setrlimit() and then executes a new program. It's similar to the Unix prlimit program, but written in Python to be portable (the "prlimit" program is not available on all platforms).

I suggest not providing a builtin wrapper to replace preexec_fn, and instead suggesting replacements in the subprocess and What's New in Python 3.11 documentation (explaining how to port existing code).

More generally, the whole preexec_fn feature could be reimplemented in a third-party project by spawning a *new* Python process, running the Python code, and *then* executing the final process. The main feature of preexec_fn is to give the ability to run a function of the current process, whereas what I'm discussing would be code written as a string.
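
A minimal sketch of that idea, not the oslo.concurrency implementation: the setup code is passed as a string to a fresh Python interpreter, which applies it and then execs the final program. PRE_EXEC_CODE and run_via_python_wrapper are hypothetical names, and the sketch assumes a POSIX system where the resource module is available.

```python
import subprocess
import sys

# Code run in the intermediate interpreter. With "python -c CODE a b",
# sys.argv is ['-c', 'a', 'b'], so sys.argv[1:] is the final command.
PRE_EXEC_CODE = """
import os, resource, sys
_, hard = resource.getrlimit(resource.RLIMIT_CORE)
resource.setrlimit(resource.RLIMIT_CORE, (0, hard))  # lower the soft limit
os.execvp(sys.argv[1], sys.argv[1:])
"""


def run_via_python_wrapper(argv):
    """Run argv after applying PRE_EXEC_CODE in a new Python process."""
    return subprocess.run(
        [sys.executable, "-c", PRE_EXEC_CODE, *argv],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    show_limit = (
        "import resource; "
        "print(resource.getrlimit(resource.RLIMIT_CORE)[0])"
    )
    print(run_via_python_wrapper([sys.executable, "-c", show_limit]).stdout)
```

The cost is an extra interpreter startup per spawned process, and the setup code cannot reference objects of the calling process.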


preexec_fn can be used for non-trivial issues like only sharing some Windows HANDLE, see:

Note: This specific problem has been solved the proper way in Python by adding support for PROC_THREAD_ATTRIBUTE_HANDLE_LIST in subprocess.STARTUPINFO: lpAttributeList['handle_list'] parameter.
msg383740 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-12-25 11:53
I just created bpo-42738: "subprocess: don't close all file descriptors by default (close_fds=False)".
msg383863 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2020-12-27 20:42
> "using Python is more portable than relying on a shell."

Not in environments I use. :)  There isn't an installed python interpreter that can be executed when deploying Python as an embedded interpreter such as anyone using pyoxidizer or similar.  Plus "using python" means adding a Python startup time delay to anything that triggered such an action.  That added latency isn't acceptable in some situations.

When I suggest a workaround for something as involving an intermediate shell script, read that to mean "the user needs an intermediate program to do this complicated work for them - how is up to them - we aren't going to provide it from the stdlib".  A shell script is merely one easy pretty-fast solution - in environments where that is possible.

TL;DR - there's no one size fits all solution here.  But third party libraries could indeed implement any/all of these options including abstracting how and what gets used when if someone wanted to do that.
msg383988 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-12-29 12:22
Would it not be more consistent with other parameters to name the new parameter "pgid" or "process_group"?

And why not use None as default, like for user and group?
msg400498 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-08-28 19:34
A worthwhile general suggestion on a new path forward for the mess of things between (v)fork+exec from Victor is over in

TL;DR creating a subprocess.Preexec() recording object with specific interfaces for things that can be done, an instance of which gets passed in and the recorded actions are done as appropriate.
msg403026 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-10-01 18:59
Another use case someone had for preexec_fn popped up today:

msg415352 - (view) Author: Mark Mentovai (markmentovai) Date: 2022-03-16 16:54
Another use case for preexec_fn: establishing a new controlling terminal, typically in conjunction with start_new_session=True. A preexec_fn may do something like

    os.close(, os.O_RDWR)))

with discussion at
