classification
Title: py.exe should unset the __PYVENV_LAUNCHER__ environment variable
Type: behavior Stage: resolved
Components: Windows Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: steve.dower Nosy List: eryksun, miss-islington, paul.moore, steve.dower, tim.golden, vinay.sajip, zach.ware
Priority: normal Keywords: patch, patch, patch

Created on 2019-01-24 00:48 by eryksun, last changed 2019-01-26 01:44 by steve.dower. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 11677 merged steve.dower, 2019-01-25 20:47
PR 11677 merged steve.dower, 2019-01-25 20:47
PR 11677 merged steve.dower, 2019-01-25 20:47
PR 11680 merged miss-islington, 2019-01-25 23:00
PR 11680 merged miss-islington, 2019-01-25 23:00
PR 11680 merged miss-islington, 2019-01-25 23:00
Messages (12)
msg334275 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-01-24 00:48
In 3.7.2 on Windows, venv now uses a redirecting launcher 'script' for python[w].exe. This replaces (on Windows only) the setup requirement to copy or symlink the interpreter binaries (i.e. python[w].exe, python37.dll, and vcruntime140.dll). Apparently this is required to be able to install Python as a Windows Store app. I haven't experimented with it yet, so I'll just accept that as a given. I'm curious to find out whether using symlinks would still work, and even if it doesn't work for the app installation, whether we could still support that option for desktop installations. I've used `--symlinks` for a while now, since I grant the required privilege to the authenticated users group and thus don't have to elevate to create symlinks. (I know it's not so easy for many Windows users.)

The new launcher reads pyvenv.cfg to locate and execute the real python.exe. Since the process image is no longer in the virtual environment's "Scripts" directory (which has various consequences), we need a way to communicate the launcher's path to make Python use the virtual environment. A -X command-line option could work, but then all packages and tools that spawn worker processes, such as multiprocessing, would need to be updated to look for this option in sys._xoptions and propagate it. Instead the launcher sets a special "__PYVENV_LAUNCHER__" environment variable. This is reasonable because processes are usually created with a copy of the parent's environment. Some environment variables may be added or modified, but it's rare for a child process to get a completely new environment. (One example of the latter would be creating a process that runs as a different user.)

An oversight in the current ecosystem is that py.exe and the distlib entry-point launchers do not unset "__PYVENV_LAUNCHER__". Thus, when executing a script from a virtual environment (e.g. either directly via py.exe or via the .py file association), it will mistakenly be pinned into the virtual environment if it runs in Python 3.7. Similarly, pip.exe for an installed Python 3.7 will mistakenly install into the virtual environment. However, the latter is out of scope here since the entry-point launchers are in distlib. 

It's also a problem if we run the fully-qualified path for an installed Python 3.7, e.g. from shutil.which('python'). We can't automatically address this since it's exactly the reason "__PYVENV_LAUNCHER__" exists. We have to know to manually unset "__PYVENV_LAUNCHER__" in the environment that's passed to the child. This should be documented somewhere -- maybe in the venv docs.

It's not a problem if a script runs unqualified "python.exe" since the final result is usually the same, but for different reasons. With the launcher, it's locked down by inheriting "__PYVENV_LAUNCHER__". With the previous design, the application directory was the virtual environment's "Scripts" directory. Unqualified "python.exe" was thus pinned for CreateProcessW, which checks the application directory first. It's also not a problem if we run sys.executable, since previously that was the virtual environment's executable. (In 3.7.2, sys.executable gets set to the virtual environment's launcher, but that breaks multiprocessing. See issue 35797.)
msg334277 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-01-24 02:17
FWIW, a symlink should be able to launch the Store Python - the problem with the old approach was it never tried to launch the original Python, but only load it's DLLs which won't work (maybe there's a way to enable the app context, but that would still require a new redirector and so wasn't any simpler than a direct launch).

Otherwise, I agree with this analysis and proposal. Just needs a patch!
msg334283 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-01-24 06:43
It's worth noting the consequences of changing the application directory (i.e. the "%__APPDIR__%" dynamic variable from GetEnvironmentVariableW).
This is the first directory checked in most cases when Windows searches for a file, including the CreateProcess[AsUser]W EXE search path, the default LoadLibrary[Ex]W DLL search path, and the default SearchPathW generic search path. In NT 6.2+, these search paths respectively come from the runtime library functions RtlGetExePath, LdrGetDllPath, and RtlGetSearchPath. The search itself is implemented by RtlDosSearchPath_Ustr, except LoadLibraryExW uses it only when loading a DLL as a data file. (Loading as a module is implemented by LdrLoadDll, which uses a private search function, LdrpSearchPath; resolves API-set names; and reserves known system DLLs.) This search supports file redirection [1], but it's unlikely that Python scripts are creating their own activation contexts.

In practice maybe no scripts will be affected by this change. I'm just noting potential problems.

Changing the application directory will break scripts that depend on the environment's "Scripts" directory being in one of the above search paths. It will also break scripts that depend on it being the first directory searched in order to shadow system files that aren't otherwise reserved.

We can lessen the impact of this change by prepending the "Scripts" directory to PATH. However, some directories will unavoidably have precedence over it now. Specifically, for the CreateProcessW EXE search, it's any of the following directories:

    * "%__APPDIR__%"
        This is now the installed location of python.exe.
    * "%__CD__%"
        The current directory is searched if a name has a 
        backslash in it, and also if the environment variable
        NoDefaultCurrentDirectoryInExePath is not defined.
    * "%SystemRoot%\System32" (GetSystemDirectory)
    * "%SystemRoot%\System"
    * "%SystemRoot%" (GetWindowsDirectory)

Also, applications that call SetDefaultDllDirectories get a secure DLL search path that can only reference "%__APPDIR__%", "%SystemRoot%\System32", and directories added via AddDllDirectory or SetDllDirectory. They no longer use "%__CD__%", "%SystemRoot%\System", "%SystemRoot%", or PATH. An affected script will have to manually add the "Scripts" directory to the DLL search path via AddDllDirectory or SetDllDirectory. 

Scripts that wish to restrict searches to PATH should use shutil.which(). It also works more predictably for relative filenames. In particular, a relative filename that has a path separator in it (e.g. r"eggs\spam.txt") is only searched relative to the process current directory. In contrast, searches based on RtlDosSearchPath_Ustr handle all relative filenames, whether or not they contain a path separator, by trying every directory in the search path (e.g. it will look for r"%__APPDIR__%\eggs\spam.txt", and so on).


[1] If enabled by the call flags, searching for a relative filename
    can be redirected by an activation context or (in the absence 
    of a manifest) an "<exename>.local" directory in the application 
    directory, via RtlDosApplyFileIsolationRedirection_Ustr. 
    CreateProcessW does not enable redirection. SearchPathW enables 
    it for the default search path. LoadLibraryExW / LdrLoadDll needs 
    to redirect even fully-qualified filenames, so it directly calls
    RtlDosApplyFileIsolationRedirection_Ustr.
msg334301 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2019-01-24 15:44
I'm not particularly happy with the statement that "However, the latter is out of scope here since the entry-point launchers are in distlib."

If Python is changing the semantics of __PYVENV_LAUNCHER__ in such a way that it breaks 3rd party code, it seems to me that this is a bug, which should be addressed by reverting or fixing the change so that the 3rd party code does not need to change. Either that, or the change is publicised as a backward compatibility break, so that tools are given a chance to update.

Having said all of that, I appreciate that __PYVENV_LAUNCHER__ is not exactly a public API, so there should be some leeway here. But I think that whatever solution we propose here needs to address how we manage the change required of third parties.

I still need to fully read and understand this discussion - I've not really followed any of the work around publishing Python as a Store application, taking the view that I won't use it myself, so "it won't affect me". But now I'm starting to worry about whether I'll see pip bug reports being filed because pip's installing things in the wrong places ("Similarly, pip.exe for an installed Python 3.7 will mistakenly install into the virtual environment.") If that sort of thing does happen, it harms our credibility and we need to be able to clearky explain what's going on, and how to fix it.

Can I clarify 2 things:

1. Is there a possibility that 3rd party applications like pip could see bugs with Python 3.7.2?
2. Is there a workaround or fix for those issues, and how are we making sure the relevant 3rd parties know what to do?

(Again, apologies if I've misunderstood something here - there's quite a lot of technical detail that I haven't had time to understand yet).
msg334304 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-01-24 16:06
It might be worth adding a summary of that to the porting notes, but I think the actual impact will be minimal. Launching processes by relative path has been a bad decision for long enough now that I'd expect it to be rare, and the Scripts directory layout being slightly different helps.

The launcher scripts used by pip also changed somewhat recently (between 12.x and 18.x IIRC) with something along these lines without causing significant issues.

Ultimately, there isn't much of an alternative. Those who have been following good practices and supported patterns in their code will be okay, and those who have not (e.g. Numpy putting DLLs on PATH) will be bitten sooner or later anyway. 

So let's suppress the variable in the launcher to deal with that particular issue. It's fairly minor compared to the multiprocessing one.

For Paul: Yes, there's a chance of pip issues. This is why I sent you all a heads-up email before anything was released. But pip is certainly the best tested package and I've even been doing pip development using the Store package and a venv without trouble. Everything here is severe edge cases.

Also, the __PYVENV_LAUNCHER__ variable is totally new - previously it only ever existed on macOS (since some of the same launch restrictions apply there). It's not the same as the VIRTUALENV variable that you added support to py.exe to detect. That one is fine.
msg334313 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2019-01-24 18:16
Steve - thanks for the clarification. If it's only affecting launching with relative paths, then agreed it's minor (although I do recall recent discussions somewhere about using launchers with relative paths - that use case may need checking, but we can deal with that when it comes up).

And I hadn't appreciated that __PYVENV_LAUNCHER__ was MACOS only before now. So again, thanks forthe reassurance there.
msg334314 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-01-24 18:44
> 1. Is there a possibility that 3rd party applications like pip 
> could see bugs with Python 3.7.2?

A bug occurs when running an entry-point script, such as pip.exe, for a system- or user-installed Python 3.7.2+ from the context of a virtual environment for 3.7.2+. The script executable redirects to run `"path\to\python.exe" "path\to\<script_name>.exe"`. This will inherit __PYVENV_LAUNCHER__ and thus run in the virtual environment. The environment doesn't have to be activated. People use inactive virtual environments for various purposes, so it doesn't defy my expectations that someone may have a problem. I think the distlib launchers should make the same change to unset __PYVENV_LAUNCHER__, which is why I nosied Vinay.
msg334315 - (view) Author: Paul Moore (paul.moore) * (Python committer) Date: 2019-01-24 18:49
> The script executable redirects to run `"path\to\python.exe" "path\to\<script_name>.exe"`.

But if I understood what Steve said, only if path\to\python.exe is a *relative* pathname?
msg334317 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-01-24 19:26
> But if I understood what Steve said, only if path\to\python.exe is 
> a *relative* pathname?

I believe Steve is referring to behavior changes for applications that relied on a virtual environment's "Scripts" directory always being in the EXE and default DLL and generic search paths because it was the application directory (i.e. "%__APPDIR__%"), which I had discussed in a follow-up note. I suggested that this could be mitigated by having the new venv launchers also prepend their directory to PATH. It wouldn't help in all cases, but it's the best we can do. 

The issue with __PYVENV_LAUNCHER__ and the py.exe and entry-point launchers is unrelated to relative paths. Perhaps it will clarify the situation to show how this variable is actually used in PC/getpathp.c. It's a pretty simple change:

    /* The launcher may need to force the executable path to a
     * different environment, so override it here. */
    pyvenv_launcher = _wgetenv(L"__PYVENV_LAUNCHER__");
    if (pyvenv_launcher && pyvenv_launcher[0]) {
        wcscpy_s(program_full_path, MAXPATHLEN+1, pyvenv_launcher);
    } else if (!GetModuleFileNameW(NULL, program_full_path, MAXPATHLEN)) {
        /* GetModuleFileName should never fail when passed NULL */
        return _Py_INIT_ERR("Cannot determine program path");
}

https://github.com/python/cpython/blob/v3.7.2/PC/getpathp.c#L535
msg334322 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-01-24 20:31
I think that snippet may have to change to accommodate the multiprocessing issue, where we need to stop replacing sys.executable with the venv launcher and have it point to the base python.exe (and then anyone launching sys.executable needs to preserve __PYVENV_LAUNCHER__, which is the default behavior).

Having done a code scan, there are quite a few places where we explicitly choose __PYVENV_LAUNCHER__ on macOS over sys.executable, so I guess we need to do those on Windows now as well. But that's a discussion for issue 35797
msg334374 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2019-01-25 23:00
New changeset adad9e68013aac166c84ffe4e23f3a5464f41840 by Steve Dower in branch 'master':
bpo-35811: Avoid propagating venv settings when launching via py.exe (GH-11677)
https://github.com/python/cpython/commit/adad9e68013aac166c84ffe4e23f3a5464f41840
msg334380 - (view) Author: miss-islington (miss-islington) Date: 2019-01-25 23:31
New changeset a6a8524bb1c78c7425346ec20ecffc02d1d02a79 by Miss Islington (bot) in branch '3.7':
bpo-35811: Avoid propagating venv settings when launching via py.exe (GH-11677)
https://github.com/python/cpython/commit/a6a8524bb1c78c7425346ec20ecffc02d1d02a79
History
Date User Action Args
2019-01-26 01:44:38steve.dowersetkeywords: patch, patch, patch
assignee: steve.dower
status: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-01-25 23:31:23miss-islingtonsetnosy: + miss-islington
messages: + msg334380
2019-01-25 23:00:42miss-islingtonsetpull_requests: + pull_request11508
2019-01-25 23:00:29miss-islingtonsetpull_requests: + pull_request11507
2019-01-25 23:00:15miss-islingtonsetpull_requests: + pull_request11506
2019-01-25 23:00:00steve.dowersetmessages: + msg334374
2019-01-25 20:47:40steve.dowersetkeywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request11502
2019-01-25 20:47:26steve.dowersetkeywords: + patch
stage: test needed -> test needed
pull_requests: + pull_request11501
2019-01-25 20:47:14steve.dowersetkeywords: + patch
stage: test needed -> test needed
pull_requests: + pull_request11500
2019-01-24 20:31:19steve.dowersetmessages: + msg334322
2019-01-24 19:26:27eryksunsetmessages: + msg334317
2019-01-24 18:49:27paul.mooresetmessages: + msg334315
2019-01-24 18:44:32eryksunsetmessages: + msg334314
2019-01-24 18:16:12paul.mooresetmessages: + msg334313
2019-01-24 16:06:05steve.dowersetmessages: + msg334304
2019-01-24 15:44:22paul.mooresetmessages: + msg334301
2019-01-24 06:43:33eryksunsetmessages: + msg334283
2019-01-24 02:17:12steve.dowersetmessages: + msg334277
2019-01-24 00:48:16eryksuncreate