classification
Title: "pip install --user numpy" fails on Python from the Windows Store
Type:
Stage: resolved
Components: Windows
Versions: Python 3.7

process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To:
Nosy List: brett.cannon, eryksun, mattip, paul.moore, stephtr, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2019-01-08 22:45 by mattip, last changed 2019-02-17 08:40 by mattip. This issue is now closed.

Messages (10)
msg333262 - Author: mattip (mattip) Date: 2019-01-08 22:45
After enabling Insider and installing Python3.7 from the Windows Store, I open a cmd window and do `pip install --user numpy` which runs to completion. But I cannot `import numpy`. 

The NumPy `multiarray` c-extension module in the `numpy/core` directory depends on an `OpenBLAS` DLL that is installed into the `numpy/.libs` directory. But even after adding that directory to the `PATH` before running python (and checking with `depends.exe` that the `multiarray` c-extension module is no longer missing any dependencies), I still cannot `import numpy`.

See also NumPy issue https://github.com/numpy/numpy/issues/12667
msg333552 - Author: mattip (mattip) Date: 2019-01-13 10:50
The difference in DLL search order between apps from the Windows Store and desktop applications may be relevant:

https://docs.microsoft.com/en-us/windows/desktop/Dlls/dynamic-link-library-search-order#alternate-search-order-for-windows-store-apps
msg333723 - Author: Steve Dower (steve.dower) (Python committer) Date: 2019-01-15 18:18
I posted on the numpy thread: Most likely the DLL is failing to load, which the importer returns as "not found" (as it falls back on other search mechanisms and doesn't retain the error). I suggested loading it directly with ctypes to see if there's a better error indicator.
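
A minimal sketch of that check, assuming the path of the extension module or DLL is known. `try_load` is a hypothetical helper written for illustration, not part of NumPy or CPython. `ctypes.CDLL` raises `OSError` carrying the operating system's own error when the file or one of its dependencies fails to load, which may be more informative than the importer's "not found".

```python
import ctypes


def try_load(path):
    """Try to load a shared library directly.

    Returns None on success, or the OS error text if loading fails.
    """
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # On Windows the message includes the WinError code reported
        # by the loader rather than a generic import failure.
        return str(exc)
    return None
```

For example, `print(try_load(r"path\to\extension.pyd"))` (a placeholder path) prints either `None` or the loader's error text.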
msg333732 - Author: mattip (mattip) Date: 2019-01-15 21:44
It seems changing os.environ['PATH'] is a security risk and is not allowed for Windows Store apps. The suggestion in the NumPy issue is to:

- use AddDllDirectory (which is as accessible as os.environ['PATH'] but is not considered a security risk so far), but this requires using SetDefaultDllDirectories, which breaks other things

- put any dlls required for the c-extension pyd in the same directory which means scipy and numpy will be using duplicate and potentially different OpenBLAS dlls, and whoever imports first wins

- load all the required dlls via LoadLibrary, meaning NumPy will have to export a windows-only API to SciPy so the latter can know where the DLL is.
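
The third option can be sketched roughly as follows. This is an illustration under assumptions, not NumPy's actual code: `preload_dlls` is a hypothetical helper, and it assumes that a DLL already loaded into the process by absolute path satisfies a later dependency on the same DLL name when the extension module is imported.

```python
import ctypes
import glob
import os
import sys


def preload_dlls(libs_dir):
    """Load every *.dll in libs_dir by absolute path and return the handles.

    Off Windows there is nothing to preload, so the list is empty.
    """
    handles = []
    if sys.platform != "win32":
        return handles
    pattern = os.path.join(os.path.abspath(libs_dir), "*.dll")
    for path in sorted(glob.glob(pattern)):
        # Loading by full path avoids any dependence on PATH.
        handles.append(ctypes.WinDLL(path))
    return handles
```

A package would call this on its own `.libs` directory before importing its c-extension modules.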

I am glad NumPy only has one DLL, and not a dozen like Qt or wxPython.

Is there a PEP that describes the overall design of the Windows directory layout, or a design guide for package authors with best practices for additional DLL dependencies?
msg333735 - Author: Steve Dower (steve.dower) (Python committer) Date: 2019-01-15 22:37
> use AddDllDirectory (which is as accessible as os.environ['PATH'] but is not considered a security risk so far)

The parenthetical is incorrect. The user-specified DLL search directory is separate from PATH, and both appear in the default DLL search order when resolving relative paths. In more secure configurations, PATH is not used for DLL resolution.

> but this requires using SetDefaultDllDirectories which breaks other things

Specifically, it breaks any extension relying on the implicit default search order by enabling one of the more secure configurations.

> put any dlls required for the c-extension pyd in the same directory which means scipy and numpy will be using duplicate and potentially different OpenBLAS dlls, and whoever imports first wins

Doesn't scipy import numpy? Which means numpy wins every time. Or alternatively, put "-numpy" in the name of numpy's one and "-scipy" in the name of scipy's one, and you can have both.

> load all the required dlls via LoadLibrary, meaning NumPy will have to export a windows-only API to SciPy so the latter can know where the DLL is.

Perhaps that API could be exported via normal module import, as it currently is? That way scipy can just "import numpy" to locate numpy?
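
One way to picture this, as a sketch only: a dependent package can locate another package's directory through the normal import machinery and derive the DLL directory from it. `package_libs_dir` is a hypothetical helper, and the `.libs` subdirectory name is taken from the layout described earlier in this issue. This version uses `importlib.util.find_spec`, which locates a top-level package without importing it.

```python
import importlib.util
import os


def package_libs_dir(package_name, subdir=".libs"):
    """Return the path of `subdir` inside an installed package's directory."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        raise ImportError("package %r not found" % package_name)
    # The first search location is the package directory itself.
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, subdir)
```

With this, scipy could call `package_libs_dir("numpy")` instead of relying on a Windows-only API exported by NumPy.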

Alternatively, if you do indeed need to have shared state with scipy, then you should come up with an API that they can depend on. This is how shared state normally works.

> Is there a PEP that describes the overall design of windows directory layout or a design guide for package authors with best practices for additional dll dependencies?

No, but there is a doc page that deserves an update: https://docs.python.org/3/extending/windows.html

If we make a dramatic change to CPython here, then there may be a PEP, but it should still defer to the documentation as that is what gets updated over time.

Currently, the best info comes from https://docs.microsoft.com/windows/desktop/Dlls/dynamic-link-library-search-order and awareness that only the LOAD_WITH_ALTERED_SEARCH_PATH flag is used when loading extension modules (see https://github.com/python/cpython/blob/master/Python/dynload_win.c#L221)


Since I just saw the confirmation at https://docs.microsoft.com/en-us/windows/desktop/Dlls/dynamic-link-library-search-order#search-order-using-load_library_search-flags, I don't think we can safely change the LoadLibraryEx option in CPython until we drop support for Windows 7 completely, as the update containing the new flags may not be installed. If/when we do that, it will break any extension relying on unsafe DLL search semantics (that is, anything appearing in the earlier section but not in this section).
msg335608 - Author: mattip (mattip) Date: 2019-02-15 13:11
Closing. It seems the days of modifying os.environ['PATH'] on Windows are over, and packages need to adapt to calling AddDllDirectory. As long as Python is built with ctypes, this is easy enough to adopt, even though there are some caveats. See the Anaconda issue https://github.com/ContinuumIO/anaconda-issues/issues/10628
msg335667 - Author: Eryk Sun (eryksun) (Python triager) Date: 2019-02-16 05:32
It may suit the needs of NumPy and SciPy to use an assembly for DLL dependencies. With an assembly it's possible for two DLLs with the same name to load in a process and possible for a DLL to extend the assembly search path with up to nine relative paths at load time. The target directory can be up to two levels above the DLL directory (e.g. "..\..\assembly_dir"). An assembly can thus be packaged as a common dependency for other packages, and packages can depend on different versions of the assembly.

For example, let's make a package that changes the _tkinter.pyd extension module to use a private assembly, which consists of the two DLL dependencies, "tcl86t.dll" and "tk86t.dll". 

Begin by copying "DLLs\_tkinter.pyd" to a package directory such as "Lib\site-packages\mytk". Modify the embedded #2 manifest in "_tkinter.pyd" (use mt.exe, or a GUI resource editor) to include a dependency on an assembly named "amd64_tcl_tk_8.6.6.0":

  <dependency>
    <dependentAssembly>
      <assemblyIdentity name="amd64_tcl_tk_8.6.6.0"
                        version="8.6.6.0"
                        type="win32"
                        processorArchitecture="amd64" />
    </dependentAssembly>
  </dependency>

Next, add the following component configuration file beside the extension module, named "_tkinter.pyd.2.config":

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <configuration>
      <windows>
        <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
          <probing privatePath="..\__winsxs__" />
        </assemblyBinding>
      </windows>
    </configuration>

This extends the assembly probing path that's used by the Fusion loader in the session server (csrss.exe). The Fusion loader probes for the assembly in four locations per directory. It checks for the assembly both as a DLL and as a manifest file, both in the directory and in a subdirectory that's named for the assembly. We'll be using a subdirectory with a manifest. 

"..\__winsxs__" resolves to "site-packages\__winsxs__". Create this directory and a subdirectory named "amd64_tcl_tk_8.6.6.0". To this, add the two DLL dependencies -- tcl86t.dll and tk86t.dll -- plus the following manifest file named "amd64_tcl_tk_8.6.6.0.manifest":

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
        <assemblyIdentity name="amd64_tcl_tk_8.6.6.0"
                          version="8.6.6.0"
                          type="win32"
                          processorArchitecture="amd64" />
        <file name="tcl86t.dll" />
        <file name="tk86t.dll" />
    </assembly>

That's all it takes. If configured properly, you should be able to import the extension module via `from mytk import _tkinter`.

This will work in a virtual environment. However, I haven't checked whether the loader handles private assemblies in the same way in a store app. That's off my radar.
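
The file layout described above can be generated with a short script. This is a sketch under assumptions: `write_assembly_layout` is a hypothetical helper, it writes only the two XML files shown above, and it does not copy `_tkinter.pyd` or the Tcl/Tk DLLs, nor does it edit the embedded manifest (that step still needs mt.exe or a resource editor).

```python
import os

ASSEMBLY = "amd64_tcl_tk_8.6.6.0"

# Component configuration file placed beside the extension module.
CONFIG_XML = r"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<configuration>
  <windows>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <probing privatePath="..\__winsxs__" />
    </assemblyBinding>
  </windows>
</configuration>
"""

# Assembly manifest placed in the assembly's own subdirectory.
MANIFEST_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
    <assemblyIdentity name="amd64_tcl_tk_8.6.6.0"
                      version="8.6.6.0"
                      type="win32"
                      processorArchitecture="amd64" />
    <file name="tcl86t.dll" />
    <file name="tk86t.dll" />
</assembly>
"""


def write_assembly_layout(site_packages, package="mytk"):
    """Create the package and assembly directories and write both XML files.

    Returns (config_path, manifest_path).
    """
    pkg_dir = os.path.join(site_packages, package)
    asm_dir = os.path.join(site_packages, "__winsxs__", ASSEMBLY)
    os.makedirs(pkg_dir, exist_ok=True)
    os.makedirs(asm_dir, exist_ok=True)

    config_path = os.path.join(pkg_dir, "_tkinter.pyd.2.config")
    manifest_path = os.path.join(asm_dir, ASSEMBLY + ".manifest")
    with open(config_path, "w", encoding="utf-8") as f:
        f.write(CONFIG_XML)
    with open(manifest_path, "w", encoding="utf-8") as f:
        f.write(MANIFEST_XML)
    return config_path, manifest_path
```

After running it against a site-packages directory, the remaining manual steps are copying the three binaries into place and updating the embedded manifest.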

> packages need to adapt to calling AddDllDirectory. As long as
> python is built with ctypes, this is easy enough to adopt, even
> though there are some caveats

Avoid using ctypes.windll in libraries. It caches the ctypes.WinDLL instance, which caches function pointers. Projects that use the same DLLs can thus interfere with each other by setting incompatible prototypes. It also doesn't allow us to enable use_last_error to get reliable error handling. Also, the DLL_DIRECTORY_COOKIE return value is a pointer type, not a 32-bit integer. Even if we're not using it to clean up afterwards (i.e. AddDllDirectory; LoadLibraryExW; RemoveDllDirectory), which we should be doing, we need the full 64-bit value to reliably check for failure (NULL). By some fluke, the low DWORD of the cookie could be 0.
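
The truncation risk can be illustrated with plain arithmetic. The cookie value below is made up for the example; `ctypes.c_int` stands in for the 32-bit return type that ctypes assumes when no restype is set.

```python
import ctypes

# Hypothetical 64-bit cookie whose low DWORD happens to be zero.
cookie = 0x00007FF600000000

# A 32-bit return type keeps only the low DWORD, so a valid cookie
# becomes indistinguishable from the NULL failure value.
truncated = ctypes.c_int(cookie).value

# A 64-bit type preserves the value, so the NULL check stays reliable.
full = ctypes.c_uint64(cookie).value

print(hex(truncated), hex(full))
```

Declaring the restype as a pointer type, as in the definitions in this message, avoids the problem.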

Here are the ctypes definitions using a private WinDLL instance and an errcheck function:

    import ctypes
    from ctypes import wintypes

    # Private WinDLL instance: prototypes are not shared with other
    # libraries, and use_last_error=True makes get_last_error() reliable.
    kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

    LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000

    # The cookie is pointer-sized, so declare it as a pointer type.
    DLL_DIRECTORY_COOKIE = wintypes.LPVOID

    def _errcheck_zero(result, func, args):
        # Raise OSError when the call returns 0 or NULL. Returning args
        # unchanged tells ctypes to return the original result.
        if not result:
            raise ctypes.WinError(ctypes.get_last_error())
        return args

    kernel32.AddDllDirectory.errcheck = _errcheck_zero
    kernel32.AddDllDirectory.restype = DLL_DIRECTORY_COOKIE
    kernel32.AddDllDirectory.argtypes = (wintypes.LPCWSTR,)

    kernel32.RemoveDllDirectory.errcheck = _errcheck_zero
    kernel32.RemoveDllDirectory.restype = wintypes.BOOL
    kernel32.RemoveDllDirectory.argtypes = (DLL_DIRECTORY_COOKIE,)

    kernel32.LoadLibraryExW.errcheck = _errcheck_zero
    kernel32.LoadLibraryExW.restype = wintypes.HMODULE
    kernel32.LoadLibraryExW.argtypes = (
        wintypes.LPCWSTR, wintypes.HANDLE, wintypes.DWORD)

Don't call SetDefaultDllDirectories. Use LoadLibraryExW(path, None, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS).
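
The add, load, remove sequence mentioned above might be wrapped as follows. This is a sketch, not a tested recipe: `added_dll_directory` and `load_with_directory` are hypothetical helpers, and the `kernel32` argument is assumed to be a WinDLL instance configured with the definitions above, so failures surface as exceptions from the errcheck function.

```python
import contextlib

LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000


@contextlib.contextmanager
def added_dll_directory(kernel32, path):
    """Add `path` to the DLL search path and remove it again on exit."""
    cookie = kernel32.AddDllDirectory(path)
    try:
        yield cookie
    finally:
        # Always undo the change, even if loading fails.
        kernel32.RemoveDllDirectory(cookie)


def load_with_directory(kernel32, dll_path, extra_dir):
    """Load `dll_path` while `extra_dir` is temporarily searchable."""
    with added_dll_directory(kernel32, extra_dir):
        return kernel32.LoadLibraryExW(
            dll_path, None, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS)
```

Passing `kernel32` in explicitly keeps the helpers free of global state and lets them be exercised with a stand-in object on other platforms.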
msg335699 - Author: mattip (mattip) Date: 2019-02-16 17:17
@eryksun thanks for the details. I didn't realize assemblyBinding was supported by the generic runtime framework, the documentation https://docs.microsoft.com/en-us/dotnet/framework/deployment/how-the-runtime-locates-assemblies seems to suggest it is a dotnet feature. How can we confirm it is supported for older Windows versions (XP) as well as for apps installed from the Store?
msg335740 - Author: Eryk Sun (eryksun) (Python triager) Date: 2019-02-17 01:21
> I didn't realize assemblyBinding was supported by the generic runtime
> framework, the documentation
> https://docs.microsoft.com/en-us/dotnet/framework/deployment/how-the-runtime-locates-assemblies
> seems to suggest it is a dotnet feature.

Native private assemblies are supported in XP, but the probing element [1] of an assemblyBinding isn't available until Windows 7, which restricts this to Python 3.6+.

Alternatively, we could implement our own $ORIGIN-like support for Windows. Embed relative paths as a resource in the PYD file. If found, Python would internally call AddDllDirectory for each directory, load the extension with the flag LOAD_LIBRARY_SEARCH_DEFAULT_DIRS, and then remove the directories via RemoveDllDirectory. This wouldn't solve the problem of DLL name and version conflicts, however. For that, the import tables would have to be modified in every DLL. I think there's an existing project that implements something like that for Unix.
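
The path-resolution half of that idea is simple to sketch. Nothing like this exists in CPython; `resolve_relative_dirs` is a hypothetical helper that turns relative entries, as they might be embedded in a PYD, into absolute directories based on the extension's own location. It uses `ntpath` so that Windows path rules apply on any platform, and it assumes `pyd_path` is already absolute.

```python
import ntpath


def resolve_relative_dirs(pyd_path, relative_dirs):
    """Resolve each relative directory against the directory of pyd_path."""
    origin = ntpath.dirname(pyd_path)
    return [
        # normpath collapses ".." components after joining.
        ntpath.normpath(ntpath.join(origin, rel))
        for rel in relative_dirs
    ]
```

Each resulting directory would then be passed to AddDllDirectory before loading the extension and removed again afterwards.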

Even if we're able to load two versions of a DLL in a process, they may still conflict with each other over resources (e.g. a named pipe). DLLs have to be designed to support multiple instances per process. See Authoring a DLL for a Side-by-Side Assembly [2]. 

[1]: https://docs.microsoft.com/en-us/windows/desktop/SbsCs/application-configuration-files
[2]: https://docs.microsoft.com/en-us/windows/desktop/SbsCs/authoring-a-dll-for-a-side-by-side-assembly
msg335762 - Author: mattip (mattip) Date: 2019-02-17 08:40
I think the original problem we had with the AddDllDirectory approach was that, once set, it seems to prevent searching os.environ['PATH'] when loading DLLs. Is that accurate? Would RemoveDllDirectory restore the ability to find DLLs along the system PATH?

> This wouldn't solve the problem of DLL name and version conflicts, however

Right, so far we are discussing the easier problem of adding search paths, not their order. This does become an issue for users who modify their system path in order to overcome the first problem, and end up pulling in the wrong version of a support DLL.
History
Date                 User         Action  Args
2019-02-17 08:40:05  mattip       set     messages: + msg335762
2019-02-17 01:21:36  eryksun      set     messages: + msg335740
2019-02-16 17:17:53  mattip       set     messages: + msg335699
2019-02-16 05:32:31  eryksun      set     nosy: + eryksun; messages: + msg335667
2019-02-15 13:11:16  mattip       set     status: open -> closed; resolution: rejected; messages: + msg335608; stage: resolved
2019-01-15 22:37:35  steve.dower  set     messages: + msg333735
2019-01-15 21:44:42  mattip       set     messages: + msg333732
2019-01-15 18:18:41  steve.dower  set     messages: + msg333723
2019-01-13 10:50:06  mattip       set     messages: + msg333552
2019-01-10 13:14:28  stephtr      set     nosy: + stephtr
2019-01-09 04:46:51  xtreak       set     title: "pip install --user numpy" fails on Python from the Windos Store -> "pip install --user numpy" fails on Python from the Windows Store
2019-01-08 22:45:20  mattip       create