msvcrt error when embedded #68617

Closed
erikflister mannequin opened this issue Jun 11, 2015 · 23 comments
Labels
OS-windows stdlib Python modules in the Lib dir topic-ctypes type-bug An unexpected behavior, bug, or error

Comments

@erikflister
Mannequin

erikflister mannequin commented Jun 11, 2015

BPO 24429
Nosy @pfmoore, @tjguk, @zware, @eryksun, @zooba, @iritkatriel

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2020-12-01.09:44:41.525>
created_at = <Date 2015-06-11.10:59:30.000>
labels = ['ctypes', 'type-bug', 'library', 'OS-windows']
title = 'msvcrt error when embedded'
updated_at = <Date 2020-12-01.09:44:41.524>
user = 'https://bugs.python.org/erikflister'

bugs.python.org fields:

activity = <Date 2020-12-01.09:44:41.524>
actor = 'iritkatriel'
assignee = 'none'
closed = True
closed_date = <Date 2020-12-01.09:44:41.525>
closer = 'iritkatriel'
components = ['Library (Lib)', 'Windows', 'ctypes']
creation = <Date 2015-06-11.10:59:30.000>
creator = 'erik flister'
dependencies = []
files = []
hgrepos = []
issue_num = 24429
keywords = []
message_count = 23.0
messages = ['245162', '245175', '245186', '245188', '245189', '245193', '245194', '245195', '245197', '245201', '245282', '245321', '245322', '245330', '245335', '245406', '245409', '245423', '245427', '245476', '245481', '246754', '382225']
nosy_count = 8.0
nosy_names = ['paul.moore', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'carlkl', 'erik flister', 'iritkatriel']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue24429'
versions = ['Python 2.7']

@erikflister
Mannequin Author

erikflister mannequin commented Jun 11, 2015

normally, CDLL(find_library('c')) is fine. but when running embedded in a context that uses a different runtime version, this will cause an error (example: http://stackoverflow.com/questions/30771380/how-use-ctypes-with-msvc-dll-from-within-matlab-on-windows/).

using ctypes.cdll.msvcrt apparently finds the correct runtime. i was surprised by this, i thought this was supposed to be identical to find_library('c').

in any case, some libraries (uuid.py) use the one that breaks. can you either switch everything to ctypes.cdll.msvcrt, or have find_library('c') change to be identical to it?
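For anyone reproducing this, here is a minimal sketch of the two lookups being compared. It only reports what each call resolves to. `describe_crt_lookup` is a hypothetical helper name, and the Windows branch assumes `msvcrt.dll` is present on the system.

```python
import sys
import ctypes
from ctypes.util import find_library


def describe_crt_lookup():
    """Report what the two lookups from this report resolve to (hypothetical helper)."""
    info = {"find_library": find_library("c"), "cdll_msvcrt": None}
    if sys.platform == "win32":
        # Attribute access on ctypes.cdll loads a DLL by that name,
        # here the system msvcrt.dll rather than Python's own CRT.
        info["cdll_msvcrt"] = ctypes.cdll.msvcrt._name
    return info


print(describe_crt_lookup())
```

The result of `find_library('c')` depends on the platform and on the CRT the interpreter was built with, so the two entries are not guaranteed to name the same library.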

@erikflister erikflister mannequin added stdlib Python modules in the Lib dir OS-windows topic-ctypes type-bug An unexpected behavior, bug, or error labels Jun 11, 2015
@zooba
Member

zooba commented Jun 11, 2015

msvcrt isn't the right version, it just happens to load. It's actually an old, basically unsupported version.

The problem would seem to be that Python 2.7 does not activate its activation context before loading msvcrt90 via ctypes. Eryksun (nosied - hope you're the same one :) ) posted a comment on the SO post with a link to a separate answer that shows how to do it, but it would be better for MATLAB to embed the manifest in their host executable if they're going to load the DLL directly.

We could probably also condition uuid to not do that check on Windows, since I don't think those functions will ever exist, at least against 2.7 they won't.

@erikflister
Mannequin Author

erikflister mannequin commented Jun 11, 2015

> it would be better for MATLAB to embed the manifest in their host executable if they're going to load the DLL directly.

can you help me understand? as far as i could tell, we need python to use the msvcr*.dll that comes with matlab, not v/v.

it's hard (as a customer) to get mathworks (matlab's author) to do anything, but if the fix would best be done by them, they might listen to "official python muckity-mucks," especially since their python integration is relatively new... no idea how to find out who to contact there though.

@zooba
Member

zooba commented Jun 11, 2015

Python needs to be recompiled to use a different CRT, and that will break all existing extension modules (.pyd's). That said, in some situations it is the right answer, typically because existing extension modules would be broken anyway, but I don't think that applies here.

To load msvcr90.dll, you need to declare in your executable which version you want to use using a manifest. This enables side-by-side use of the CRT, so different programs can use different versions and they are all managed by the operating system (for security fixes, etc.). Otherwise, you get hundreds of copies of the CRT and they are likely to be lacking the latest patches.

A way to hack in the manifest is to put it alongside the executable. You could take the file from http://stackoverflow.com/questions/27389227/how-do-i-load-a-c-dll-from-the-sxs-in-python/27392347#27392347 and put it alongside the MATLAB executable as "matlab.exe.manifest" or whatever, which avoids having to get Mathworks involved, and that might allow you to load msvcr90.dll. If they've already embedded a manifest into the executable (which is very likely), then I don't know which one will win or what effects may occur if the original one is ignored.

@erikflister
Mannequin Author

erikflister mannequin commented Jun 11, 2015

thanks - i still don't understand tho. if python would have to be recompiled to use a different crt, why wouldn't matlab? if a manifest could fix matlab, why couldn't one fix python?

i ran into all this trying to get shapely to load in matlab, and using msvcrt instead of find_library('c') solved it there:
shapely/shapely#104 (comment)

that solution seems so much easier than any of this manifest/sxs stuff -- but you're saying it's wrong?

sorry i'm slow, never dealt with any of this stuff before...

@zooba
Member

zooba commented Jun 11, 2015

python.exe already has the manifest it needs, but it can't be embedded into python27.dll - it has to go into the exe file. That's why Python can't make it so that msvcr90.dll is loaded.

Depending on what you're using it for, the C Runtime may keep some state in between function calls. For things like string copying (with no locale) you'll be okay, but most of the complicated stuff assumes that every call into the CRT is calling into the *same* CRT. When you load different CRTs at the same time (as is happening here already, or when you load msvcrt.dll directly), you have to be very careful not to intermix them at all.

The most obvious example is open file handles. If you open a file with CRT 9.0 (msvcr90.dll) and then try and read from it with CRT 6.0 (msvcrt.dll), you'll probably crash or at least corrupt something. The same goes for memory allocations - if CRT 9.0 does a malloc() and then CRT 10.0 does the free(), you're almost certainly going to corrupt something because they are not compatible.
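As a rough illustration of the safe pattern, the sketch below takes `malloc` and `free` from one and the same runtime handle. `load_one_crt` is a hypothetical helper, and using the system `msvcrt.dll` on Windows is an assumption made only so the example is self-contained.

```python
import sys
import ctypes


def load_one_crt():
    """Load a single C runtime handle to use for both malloc and free."""
    if sys.platform == "win32":
        # Assumption for illustration: the system msvcrt.dll. Memory from
        # this handle must never be passed to another CRT's free().
        return ctypes.CDLL("msvcrt")
    # On POSIX, None means "symbols already loaded into this process".
    return ctypes.CDLL(None)


crt = load_one_crt()
crt.malloc.argtypes = [ctypes.c_size_t]
crt.malloc.restype = ctypes.c_void_p
crt.free.argtypes = [ctypes.c_void_p]
crt.free.restype = None

ptr = crt.malloc(16)                    # allocated by this runtime...
ctypes.memset(ptr, 0, 16)
first_byte = ctypes.string_at(ptr, 1)   # read back one zeroed byte
crt.free(ptr)                           # ...and released by the same runtime
```

The point is only that the pointer never crosses from one CRT to another; which runtime is the right one for a given DLL is a separate question.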

I suspect Mathworks is relying on people installing Python themselves so they don't have to redistribute it as part of MATLAB, which is totally fine, but you have to be prepared to deal with this situation. If they make their own build, they need to distribute it themselves (easy) and explain to people why numpy doesn't work anymore unless you use their special build of numpy too (ie. because it uses a different CRT).

Like I said initially, we would probably accept a patch for uuid.py to skip the CRT scan on Windows, and similar changes like that where appropriate. If you need to be able to load the DLL yourself, you either need to carefully consider how the functions you call may interact with other implementations/versions that may be loaded, or set up the manifest so you can load msvcr90.dll.

@erikflister
Mannequin Author

erikflister mannequin commented Jun 11, 2015

am i reading this wrong, that you can put the manifest into the .dll?
https://msdn.microsoft.com/en-us/library/ms235560(v=vs.90).aspx

@erikflister
Mannequin Author

erikflister mannequin commented Jun 11, 2015

if it can't go into your .dll, what are libraries like shapely supposed to do? tell their users to do all this manifest stuff if they're running embedded?

@zooba
Member

zooba commented Jun 11, 2015

Ah, it can go into the DLL, and it's already there. The problem may be that there is conflicting information about which resource ID - https://msdn.microsoft.com/en-us/library/aa374224(v=vs.90).aspx says it should be 1 while your link says 2.

python27.dll has the manifest as resource 2, so if that is incorrect, then that could be a reason why it's not working. (Looking at the search paths in that link above, there are other potential reasons if it's finding a compatible assembly in the MATLAB folder, but it sounds like that's not the case.)

I guess we need someone with the patience to go through and figure out exactly whether it should be 1 or 2. That person is not me, sorry.

@erikflister
Mannequin Author

erikflister mannequin commented Jun 11, 2015

@erikflister
Mannequin Author

erikflister mannequin commented Jun 13, 2015

well i can confirm @eryksun's method works, so it's not a problem with how the manifest is included in the dll. to me, the real issue is that ctypes.cdll.msvcrt and find_library('c') aren't correct. the first returns something "old and unsupported," or "officially off-limits" (depending who you ask), and the second doesn't work when running embedded. imho, both of them should use @eryksun's method to activate the dll's context, look in the included manifest, and return the msvcr* found there. why isn't this the correct design? why should every library have to reimplement the method just to allow running embedded, which they can't be responsible for knowing about?

@zooba
Member

zooba commented Jun 13, 2015

About the only possible solution here would be to special case ctypes to detect msvcr90 as a parameter (later versions of the CRT don't need it) and also whether another activation context already exists. We could also document the need for a complete manifest in the embedding docs. All of this really only affects 2.7, as later versions of Python don't necessarily suffer the same limitation (unless someone wants to load msvcr90 explicitly).

What functionality do you need that you can't get some other way (such as the msvcrt module)? Or is it just the uuid issue?

@erikflister
Mannequin Author

erikflister mannequin commented Jun 13, 2015

> About the only possible solution here would be to special case ctypes to detect msvcr90 as a parameter (later versions of the CRT don't need it) and also whether another activation context already exists. We could also document the need for a complete manifest in the embedding docs.

i'm not following why it's a special case, or why later versions wouldn't have the same problem? isn't this a problem for any DLLs that the embedding context may have loaded that would conflict with DLLs that python depends on? python's DLL already has the necessary "complete manifest," right? as long as CDLL does the proper context manipulations, client code shouldn't have to worry about whether it's running embedded, right?

what is the purpose of ctypes.cdll.msvcrt if no one is supposed to use it? there is also "import msvcrt" which is apparently a subset of what you get from find_library('c'), so would need the same fix?

> All of this really only affects 2.7, as later versions of Python don't necessarily suffer the same limitation (unless someone wants to load msvcr90 explicitly).

what changed that avoids the problem? perhaps that fix can be applied to 2.7?

> What functionality do you need that you can't get some other way (such as the msvcrt module)? Or is it just the uuid issue?

innocent ol' me was just trying to import shapely from matlab - they call find_library('c') and need the 'free' function. i don't think they ever malloc -- they depend on a geos_c.dll, which must do the allocations and is built on whatever msvcrt was used for python? probably a better design would be for geos_c.dll to export its own free function? but afaiu, geos_c.dll comes from a totally different (more legacy?) project, not python related... shapely is a dependency of the library i actually need, which also uses uuid. uuid is the only case i can find in the standard libraries that also calls find_library('c'). manually changing those calls allowed me to successfully import everything from matlab. ctypes and distutils also mention msvc* a lot, obviously... getpass.py and multiprocessing and subprocess use "import msvcrt".

@zooba
Member

zooba commented Jun 13, 2015

> i'm not following why it's a special case, or why later versions wouldn't have the same problem?

The Microsoft C Runtime 9.0 required an activation context to allow multiple versions to load side by side. This turned out to be more trouble than it was worth, and so version 10.0 removed that requirement.

> isn't this a problem for any DLLs that the embedding context may have loaded that would conflict with DLLs that python depends on?

For any DLL that requires a version specification in the current activation context, yes. These are fairly rare, but if the DLL checks, then the context needs to be created for it. (MSVCRT 9.0 requires it and checks - hence the error when it isn't set up.)

> python's DLL already has the necessary "complete manifest," right?

In theory yes, but apparently it isn't working in this case. It needs more investigation to figure out why.

> what is the purpose of ctypes.cdll.msvcrt if no one is supposed to use it?

ctypes.cdll.msvcrt doesn't really exist - ctypes.cdll turns it into a LoadLibrary("msvcrt") call that works because Windows keeps shipping msvcrt.dll for backwards compatibility (for applications that rely on msvcrt.dll entirely - not piecemeal).
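To make the mechanism concrete, here is a simplified sketch of how an attribute access can turn into a library load. `ToyLoader` and `fake_dll` are stand-ins invented for this example, not the actual ctypes classes, although the real loader is understood to follow the same general shape.

```python
class ToyLoader:
    """Simplified stand-in for ctypes.LibraryLoader (not the real class)."""

    def __init__(self, dlltype):
        self._dlltype = dlltype

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        dll = self._dlltype(name)   # for ctypes.CDLL this ends in a LoadLibrary-style call
        setattr(self, name, dll)    # cache so later accesses skip the load
        return dll


loads = []


def fake_dll(name):
    """Record the requested name instead of loading a real library."""
    loads.append(name)
    return "<handle for %s>" % name


toy = ToyLoader(fake_dll)
toy.msvcrt   # first access triggers the "load"
toy.msvcrt   # second access hits the cached attribute
print(loads)
```

Nothing named `msvcrt` is defined on the loader ahead of time. Whatever name is requested is simply handed to the operating system's loader.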

> there is also "import msvcrt" which is apparently a subset of what you get from find_library('c'), so would need the same fix?

No, because this module is built into Python's DLL (and does proper conversion from Python types to C types, which occasionally differ from the ctypes conversions). If you've been able to load Python, these functions will be fine.

> what changed that avoids the problem? perhaps that fix can be applied to 2.7?

Python 3.0-3.2 are also affected, but Python 3.3 and later use newer versions of the CRT that do not have the manifest requirement. It's been discussed in the past and has been decided that the official builds of Python will not change compiler version without a version bump (in this case, 2.7->2.8 would be required, but has been ruled out).

> innocent ol' me was just trying to import shapely from matlab - they call find_library('c') and need the 'free' function. i don't think they ever malloc -- they depend on a geos_c.dll, which must do the allocations and is built on whatever msvcrt was used for python? probably a better design would be for geos_c.dll to export its own free function? but afaiu, geos_c.dll comes from a totally different (more legacy?) project, not python related...

Yeah, geos_c.dll really should have exported its own free() function. find_library('c') is probably the wrong approach here - if geos_c.dll is being rebuilt with different CRTs at all then the free() function should be added to it, and if it's truly legacy and is no longer being rebuilt then the version of the CRT it uses should be loaded explicitly. It isn't automatically getting the same version as whatever version of Python is running, that's for sure.

> uuid is the only case i can find in the standard libraries that also calls find_library('c').

As I said earlier, I'm sure we'd accept a patch to uuid.py to avoid that call on Windows (or completely remove it - I was sure at one point that ctypes was considered off-limits for the stdlib). Everything ought to be going through "import msvcrt" or their own extension modules, and it sounds like they mostly are.
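A rough sketch of what such a patch might look like, assuming the scan iterates over library names such as `'uuid'` and `'c'`. `crt_scan_candidates` is a hypothetical helper, not the actual code in uuid.py.

```python
import sys


def crt_scan_candidates(platform=sys.platform):
    """Library names worth probing for uuid_generate-style functions."""
    if platform == "win32":
        # The functions are not expected to exist in the Windows CRT,
        # so skip the scan and avoid find_library('c') entirely.
        return []
    return ["uuid", "c"]


print(crt_scan_candidates())
```

Returning an empty list on Windows means the loop that calls `find_library` never runs there, which sidesteps the msvcr90 activation context problem for this module.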

@eryksun
Contributor

eryksun commented Jun 14, 2015

> python's DLL already has the necessary "complete manifest," right?

> In theory yes, but apparently it isn't working in this case. It
> needs more investigation to figure out why.

The manifest in the DLL is stored as resource ID 2. This secondary manifest is used by the loader to create a temporary activation context while python27.dll is loaded. This allows it to load msvcr90.dll.

python27!DllMain stores the current activation context. This gets reactivated when loading extension modules. Thus when Python 2.7 is embedded, there's no problem loading extensions that depend on msvcr90.dll, such as _ctypes.pyd.

If _Py_ActivateActCtx and _Py_DeactivateActCtx were exported, they could be called in _ctypes!load_library. That should solve this problem with using ctypes.CDLL('msvcr90') in embedded Python.

> Windows keeps shipping msvcrt.dll for backwards compatibility (for
> applications that rely on msvcrt.dll entirely - not piecemeal).

Windows itself is the primary user of msvcrt.dll. A Windows 7 installation has over 1500 DLLs and over 350 executables in System32 that depend on msvcrt.dll. Windows developers such as Raymond Chen get a bit annoyed when projects link directly with msvcrt.dll. See "Windows is not a Microsoft Visual C/C++ Run-Time delivery channel".

> Yeah, geos_c.dll really should have exported its own free()
> function.

Each CRT uses a private heap, so mismatching free() and malloc() from different CRTs is wrong. geos_c really should export a free() function. Actually, it really should have the user allocate data.
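A small sketch of the "user allocates" style from the ctypes side. `fill_greeting` is a hypothetical stand-in for a DLL function that writes into a buffer supplied by the caller. It is implemented in Python here only so the example runs anywhere.

```python
import ctypes


def fill_greeting(buf, size):
    """Hypothetical stand-in for a DLL function that writes into caller-owned memory."""
    # Leave room for the terminating NUL that the zeroed buffer already holds.
    data = b"hello"[: max(size - 1, 0)]
    ctypes.memmove(buf, data, len(data))
    return len(data)


buf = ctypes.create_string_buffer(16)   # caller owns and releases this memory
written = fill_greeting(buf, len(buf))
print(buf.value)
```

Because the buffer is created and released on the Python side, no `free()` from any CRT is involved, so there is no heap to mismatch.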

@carlkl
Mannequin

carlkl mannequin commented Jun 15, 2015

> Windows itself is the primary user of msvcrt.dll.
> A Windows 7 installation has over 1500 DLLs and over
> 350 executables in System32 that depend on msvcrt.dll.
> Windows developers such as Raymond Chen get a bit annoyed
> when projects link directly with msvcrt.dll.

In the case of mingw32 or mingw-w64, msvcrt linkage is the usual standard due to licensing reasons. The CRT has to qualify as a 'System' library, see http://www.gnu.org/licenses/gpl-faq.html#WindowsRuntimeAndGPL. This is the case for msvcrt.dll only. VC runtimes can be linked as well, but these runtime DLLs shouldn't be deployed alongside the application in this case.

As described above, Python binary extensions have to be linked against the very same VC runtime that is used for Python itself to avoid mixing runtimes in one application. Mixing is considered evil, see http://siomsystems.com/mixing-visual-studio-versions

An important question for Steve concerning python-3.5:

python-3.5b2 is linked against the newly introduced 'universal CRT', which is without any doubt a SYSTEM LIBRARY. However, heap memory management functions and other functions are linked against VCRUNTIME140.dll instead of the ucrtbase.dll. Is this the intended behavior?

The symbol memset: this symbol is exposed from ucrtbase.dll as well as vcruntime140.dll. Is it necessary to link python binaries against vcruntime140.dll as well, or is linkage against ucrtbase.dll sufficient?

@zooba
Member

zooba commented Jun 15, 2015

> python-3.5b2 is linked against the newly introduced 'universal CRT', that is without any doubt a SYSTEM LIBRARY. However, heap memory management functions and other functions are linked against VCRUNTIME140.dll instead of the ucrtbase.dll. Is this the intended behavior?

AFAICT, all of the "public" functions exported from vcruntime140.dll are also exported from api-ms-win-crt-string-l1-1-0.dll (which forwards to ucrtbase.dll), which would make them available as part of the stable ABI. I'm not sure why vcruntime140.dll has its own versions or why they are used in preference, but it may be to do with inlining or intrinsics.

vcruntime140.dll exists and is not guaranteed stable because it provides functionality that needs intimate knowledge of the compiler (stack unwinding, etc.). Those string APIs don't make much sense here, so I'd guess they're dependencies that had to be pulled in, and the linker may just be prioritizing those ones by accident.

I would not be at all surprised if MinGW had to replace vcruntime140.dll entirely. Nothing from ucrtbase.dll can depend on it, so replacing it is probably for the best. Then just link against either the ucrtbase.dll or the api-ms-win-crt-*.dll libraries.

@erikflister
Mannequin Author

erikflister mannequin commented Jun 16, 2015

thanks a lot for the detailed info steve, very clearly stated!

> Yeah, geos_c.dll really should have exported its own free() function. find_library('c') is probably the wrong approach here - if geos_c.dll is being rebuilt with different CRTs at all then the free() function should be added to it, and if it's truly legacy and is no longer being rebuilt then the version of the CRT it uses should be loaded explicitly. It isn't automatically getting the same version as whatever version of Python is running, that's for sure.

well, shapely's installation instructions from windows are to use chris gohlke's prebuilt binaries from here: http://www.lfd.uci.edu/~gohlke/pythonlibs/

i assume he's coordinating the crt versions? apparently a lot of people use these.

i'm not clear on why gohlke's stuff is necessary, and why pypi/pip/distutils is not adequate -- shapely is the only library i've run into that needed gohlke's binaries. of course, i didn't try to install numpy/scipy manually, the internet said that this is hard on windows, and to just use something like winpython/pythonxy. are these problems all related to this crt issue?

@zooba
Member

zooba commented Jun 16, 2015

> i assume he's coordinating the crt versions? apparently a lot of people use these.

So do I :) He's definitely got access to the correct compiler versions, so I'm sure he's using them (via distutils/setuptools, which will always try to use the correct one).

> i'm not clear on why gohlke's stuff is necessary, and why pypi/pip/distutils is not adequate

It's not necessarily easy to get exactly the right compiler, and since Python generally relies on old and outdated ones (because 2.7 lives forever and cannot change) people often need multiple versions installed at the same time.

pip+wheel is adequate once library developers publish wheels (or republish Gohlke's wheel of their library). pip+distutils is very fiddly.

> shapely is the only library i've run into that needed gohlke's binaries. of course, i didn't try to install numpy/scipy manually, the internet said that this is hard on windows, and to just use something like winpython/pythonxy. are these problems all related to this crt issue?

numpy and scipy are due to requiring a Fortran compiler. The Intel compiler is compatible with MSVC, but does not have a Free(tm) license, while gfortran (gcc) does and is not strictly compatible with MSVC (there are some MinGW forks that are very close though).

So in effect, yes, the fact that the CRT has to match in every pre-built binary is the problem (less of a problem on Linux because nobody ever imagines that the C runtime might be compatible, and so everyone needs a compiler all the time - therefore, compiling is easier than distribution, whereas on Windows distribution is easier than compiling </oversimplification>).

@eryksun
Contributor

eryksun commented Jun 18, 2015

> shapely's installation instructions from windows are to use
> chris gohlke's prebuilt binaries from here:
> http://www.lfd.uci.edu/~gohlke/pythonlibs/

Christoph Gohlke's Shapely‑1.5.9‑cp27‑none‑win_amd64.whl includes a version of geos_c.dll that has the VC90 manifest embedded as resource 2, just like python27.dll. The DLL also exports a GEOSFree function, which is what shapely actually uses. That said, the geos.py module still defines a global free() using cdll.msvcrt.free. As far as I can see, it never actually calls it. Otherwise it would surely crash the process due to a heap mismatch.

Steve, since you haven't closed this issue, have you considered my suggestion to export _Py_ActivateActCtx and _Py_DeactivateActCtx for use by C extensions such as _ctypes.pyd? These functions are better than manually creating a context from the manifest that's embedded in python27.dll because they use the context that was active when python27.dll was initially loaded.

@zooba
Member

zooba commented Jun 18, 2015

> Steve, since you haven't closed this issue, have you considered my suggestion to export _Py_ActivateActCtx and _Py_DeactivateActCtx for use by C extensions such as _ctypes.pyd? These functions are better than manually creating a context from the manifest that's embedded in python27.dll because they use the context that was active when python27.dll was initially loaded.

I'm always fairly slow to close issues - don't read too much into that :)

I don't see any value in exporting them for other C extensions, since they can also capture the initial context when they are loaded. They really need exports for Python. windll.python27._Py_ActivateActCtx would suffice and I wouldn't want to go any further than that - this is *very* advanced functionality that I would expect most people to get wrong.

Someone prepare a patch. I'm not -1 yet (and of course, I'm not the sole gatekeeper here, so one of the core devs who's still working on 2.7 can fix it too).

@eryksun
Contributor

eryksun commented Jul 15, 2015

> windll.python27._Py_ActivateActCtx would suffice

It would instead be ctypes.pythonapi._Py_ActivateActCtx -- if the DLL exported a function with this name. ctypes.pythonapi is a PyDLL instance that wraps sys.dllhandle.
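A quick way to check whether the running interpreter's library exports a given symbol through `ctypes.pythonapi` is sketched below. `pythonapi_exports` is a hypothetical helper name, and whether `_Py_ActivateActCtx` is found depends entirely on the build being inspected.

```python
import ctypes


def pythonapi_exports(name):
    """Return True if the running Python library exports `name`."""
    # hasattr triggers a symbol lookup; a missing export raises AttributeError.
    return hasattr(ctypes.pythonapi, name)


print(pythonapi_exports("Py_IsInitialized"))
print(pythonapi_exports("_Py_ActivateActCtx"))
```

If the second lookup fails, the function is simply not exported by that build and cannot be reached from ctypes this way.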

I think it would be more useful in general to add an "actctx" parameter to CDLL. Then make PyWin_DLLhActivationContext public in PC/dl_nt.c, and add it as sys.dllactctx. Example usage:

    libc = CDLL('msvcr90', actctx=sys.dllactctx)

Along the lines of changing CDLL, it would also be nice to add a "flags" parameter and switch to using LoadLibraryEx. In comparison, POSIX users have easy access to the "mode" parameter (i.e. RTLD_LOCAL, RTLD_GLOBAL).
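For comparison, this is roughly what the existing "mode" parameter looks like in use. `load_process_symbols` is a hypothetical helper, and the use of `msvcrt` on Windows is an assumption made only so the sketch runs there. The ctypes documentation describes `mode` as ignored on Windows.

```python
import sys
import ctypes


def load_process_symbols(mode=ctypes.RTLD_GLOBAL):
    """Load a handle with an explicit dlopen mode (ignored on Windows)."""
    if sys.platform == "win32":
        # msvcrt is only a convenient always-present DLL for this illustration.
        return ctypes.CDLL("msvcrt", mode=mode)
    # On POSIX, None opens the symbols already loaded into this process.
    return ctypes.CDLL(None, mode=mode)


lib_global = load_process_symbols(ctypes.RTLD_GLOBAL)
lib_local = load_process_symbols(ctypes.RTLD_LOCAL)
```

A "flags" parameter for Windows, as proposed above, would fill the same role for LoadLibraryEx that `mode` fills for dlopen.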

@iritkatriel
Member

This is a Python 2.7-only issue.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022