msg365887 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2020-04-07 03:27 |
In several recent PRs, test_ctypes.test_load_dll_with_flags is failing for the Azure Pipelines "Windows PR tests win32" and "Windows PR tests win64" with the following error message:
```
======================================================================
ERROR: test_load_dll_with_flags (ctypes.test.test_loading.LoaderTest) [WinDLL('_sqlite3.dll', winmode=0)]
----------------------------------------------------------------------
Traceback (most recent call last):
File "d:\a\1\s\lib\ctypes\test\test_loading.py", line 140, in should_pass
subprocess.check_output(
File "d:\a\1\s\lib\subprocess.py", line 419, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "d:\a\1\s\lib\subprocess.py", line 533, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['d:\\a\\1\\s\\PCbuild\\win32\\python.exe', '-c', "from ctypes import *; import nt;WinDLL('_sqlite3.dll', winmode=0)"]' returned non-zero exit status 1.
======================================================================
ERROR: test_load_dll_with_flags (ctypes.test.test_loading.LoaderTest) [WinDLL('_sqlite3.dll', winmode=0)]
----------------------------------------------------------------------
Traceback (most recent call last):
File "d:\a\1\s\lib\ctypes\test\test_loading.py", line 140, in should_pass
subprocess.check_output(
File "d:\a\1\s\lib\subprocess.py", line 419, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "d:\a\1\s\lib\subprocess.py", line 533, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['d:\\a\\1\\s\\PCbuild\\win32\\python.exe', '-c', "from ctypes import *; import nt;WinDLL('_sqlite3.dll', winmode=0)"]' returned non-zero exit status 1.
```
Since this only started occurring recently in several unrelated PRs, I suspect it was most likely an intermittent regression introduced in master. Here are the PRs I have seen the same exact error occur in so far:
https://github.com/python/cpython/pull/18239
https://github.com/python/cpython/pull/19403
https://github.com/python/cpython/pull/19402
https://github.com/python/cpython/pull/19399
I was unable to reproduce it locally on my secondary boot of Windows 10.
|
msg365890 - (view) |
Author: Zachary Ware (zach.ware) * |
Date: 2020-04-07 06:09 |
My best guess at the moment is that something changed in the underlying Windows image, particularly given the comment above the failing line that "insecure load flags should succeed" (though that could easily be a red herring). Until someone who actually does understand what's going on here (Steve? :)) can deal with it properly, I have a PR to disable the problematic part of the test that I'll merge shortly.
|
msg365892 - (view) |
Author: Zachary Ware (zach.ware) * |
Date: 2020-04-07 06:40 |
New changeset f407e209c1e35b64835f73e7e7ca23e33817e9fe by Zachary Ware in branch 'master':
bpo-40214: Temporarily disable a ctypes test (GH-19404)
https://github.com/python/cpython/commit/f407e209c1e35b64835f73e7e7ca23e33817e9fe
|
msg365900 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2020-04-07 10:04 |
Zachary Ware wrote:
> My best guess at the moment is that something changed in the underlying Windows image, particularly given the comment above the failing line that "insecure load flags should succeed" (though that could easily be a red herring).
Also, now that I think about it, the the test failure has only specifically occurred on Azure Pipelines and not on the GitHub Actions Windows PR tests (as far as I have seen), which does further increase the likelihood that it's an issue with the Windows image being used in Azure for the PR tests.
Regardless of the cause though, thanks for applying a temporary fix/skip in the meantime, these types of issues can be troublesome for CPython's workflow when it occurs on required CI checks. Hopefully Steve Dower or someone similarly knowledgeable in the area can definitively figure out the cause.
|
msg365902 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-07 10:29 |
Maybe something else was installed that put an incompatible _sqlite3.dll on PATH? Thereby proving the inherent risk of using unsafe DLL load settings :)
If we're not already overriding PATH for the subprocess in this test (should only need "%SystemRoot%;%SystemRoot%\System32"), we may have to. Though if someone is installing an incompatible _sqlite3.dll into a system directory then we're in a bit of trouble.
Disabling this part of the test as inherently untestable because it's an inherently unsafe operation is fine by me :)
|
msg365910 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2020-04-07 15:42 |
> Maybe something else was installed that put an incompatible
> _sqlite3.dll on PATH? Thereby proving the inherent risk of
> using unsafe DLL load settings
should_pass() sets the working directory of the test process to the `tmp` directory. The loader normally checks the working directory before PATH. But maybe the system itself is configured to disallow loading DLLs from the working directory. There's a registry setting for that, but it's little known and rarely used because it's disruptive in general.
|
msg365919 - (view) |
Author: Zachary Ware (zach.ware) * |
Date: 2020-04-07 17:03 |
Assuming I implemented my checks correctly (see PR19413), I think I've just debunked both of our leading theories.
Results here: https://dev.azure.com/Python/cpython/_build/results?buildId=60764&view=logs&j=d554cd63-f8f4-5b2d-871b-33e4ea76e915&t=5a14d0eb-dbd4-5b80-f5d0-7909f950a1cc
|
msg365953 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-07 22:14 |
It's probably the sqlite3.dll dependency that's failing, not the _sqlite3.dll (originally _sqlite3.pyd) one.
The test is verifying that dependent DLLs are loaded correctly. So I assume in this case the _sqlite3.dll is being loaded, but it's finding the wrong sqlite3.dll.
Since I could, I updated Zach's PR to check for other sqlite3.dll files. The new build should be at https://dev.azure.com/Python/cpython/_build/results?buildId=60785&view=logs&j=c83831cd-3752-5cc7-2f01-8276919eb334 in a few minutes.
|
msg365954 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-07 22:16 |
It might also be a "sqlite3_d.dll". Updated build is https://dev.azure.com/Python/cpython/_build/results?buildId=60787&view=results
|
msg365955 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-07 22:34 |
Found it (and it's kind-of us):
Checking C:\Program Files\Amazon\AWSCLIV2\
*** Found at C:\Program Files\Amazon\AWSCLIV2\sqlite3.dll
*** Found at C:\Program Files\Amazon\AWSCLIV2\_sqlite3.pyd
But I'm not sure why that is getting loaded earlier than the current directory. Is that the behaviour we went for here?
(FWIW, we build and test a release build, not a debug build, which is why we're not looking for sqlite3_d.dll... but perhaps we should be using a debug build? That might be slower, but the extra validation is probably worthwhile...)
|
msg365970 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2020-04-08 04:05 |
Steve Dower wrote:
> (FWIW, we build and test a release build, not a debug build, which is why we're not looking for sqlite3_d.dll... but perhaps we should be using a debug build? That might be slower, but the extra validation is probably worthwhile...)
In general, I think a debug build instead of a release would make more sense for the PR tests. Do you have a general estimate or rough idea as to how much slower it would be in comparison?
|
msg365971 - (view) |
Author: Karthikeyan Singaravelan (xtreak) * |
Date: 2020-04-08 04:28 |
Is this going to be backported? It seems backports also use the same build and have this error. Sample 3.8 backport build that seems to be related to this issue : https://dev.azure.com/Python/cpython/_build/results?buildId=60753&view=logs&j=d554cd63-f8f4-5b2d-871b-33e4ea76e915&t=5a14d0eb-dbd4-5b80-f5d0-7909f950a1cc&l=1570
|
msg365993 - (view) |
Author: Zachary Ware (zach.ware) * |
Date: 2020-04-08 16:14 |
Feel free to backport PR19404 as needed, but mark versions here appropriately to make sure the *real* fix makes it where it needs to go.
|
msg365999 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-08 18:31 |
> Do you have a general estimate or rough idea as to how much slower it would be in comparison?
It's one sample point, but compare https://buildbot.python.org/all/#/builders/129/builds/708 to https://github.com/python/cpython/runs/571497886
Compile time: 3:32 (release) -> 1:10 (debug)
Test time: 12:28 (release) -> 15:31 (debug)
Though the test timing vary wildly, as some tests cause more contention than others and they run in a random order.
|
msg366003 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2020-04-08 20:03 |
Steve Dower wrote:
> It's one sample point, but compare https://buildbot.python.org/all/#/builders/129/builds/708 to https://github.com/python/cpython/runs/571497886
FWIW, I'd be +1 in favor for using the debug build then. A few additional minutes would be well worth having more thorough PR tests IMO. If it becomes an issue, we can always revert it back, but it seems that there's little to no harm in trying it out.
|
msg366044 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2020-04-09 12:27 |
Python 3.8 is also affected. Example:
https://github.com/python/cpython/pull/19444#issuecomment-611499825
I backported the change to skip the test until it's fixed: PR 19449.
|
msg366045 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-04-09 12:31 |
New changeset c83f003ee5398e9c27a0c634329420003d607d46 by Miss Islington (bot) in branch '3.8':
bpo-40214: Temporarily disable a ctypes test (GH-19404)
https://github.com/python/cpython/commit/c83f003ee5398e9c27a0c634329420003d607d46
|
msg366307 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2020-04-13 12:02 |
> But I'm not sure why that is getting loaded earlier than the
> current directory. Is that the behaviour we went for here?
I don't understand what's going on here if %PATH% is interfering. The current directory (%__CD__%) should be checked before %PATH% with the standard DLL search order for desktop applcations. That's what I've observed in practice and how it's documented [1]:
SafeDllSearchMode enabled:
1. %__APPDIR__%
2. %SystemRoot%\System32
3. %SystemRoot%\System
4. %SystemRoot%
5. %__CD__%
6. %PATH%
SafeDllSearchMode disabled:
1. %__APPDIR__%
2. %__CD__%
3. %SystemRoot%\System32
4. %SystemRoot%\System
5. %SystemRoot%
6. %PATH%
[1]: https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order#search-order-for-desktop-applications
|
msg366316 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-13 14:47 |
> I don't understand what's going on here if %PATH% is interfering.
Yeah, as shown by the fact my fix didn't fix it :)
I'm going to try adding some more diagnostics around this to find out exactly where it's failing. I suspect it's a dependency, rather than the direct file.
|
msg366319 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-13 17:41 |
So I collected more info, and it seems the search order we're getting for _sqlite3.dll (.pyd) is:
* app directory
* SQL server install directory (?)
* Windows/System32
* Windows
* %PATH% (which is truncated at 2190 chars)
And then we never look in CWD. I wonder if it's being excluded because it's under %TEMP%?
I can't reproduce it either with a long PATH or with it in TEMP, so there may be a policy setting involved. Trying again with the test not using TEMP (it shouldn't be using the real TEMP anyway...)
|
msg366320 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-13 18:07 |
Okay, I've wasted enough time on this. Let's just disable the test permanently and say that the winmode=0 behaviour is system-specific and is not tested.
|
msg366322 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2020-04-13 18:44 |
> it seems the search order we're getting for _sqlite3.dll (.pyd) is:
>
> * app directory
> * SQL server install directory (?)
> * Windows/System32
> * Windows
> * %PATH% (which is truncated at 2190 chars)
Thank you. It finally makes sense. The parent process is calling SetDllDirectoryW [1], which also replaces the current working directory in the DLL search path of child processes (inherited via the PEB ProcessParameters->DllPath) [2]:
Note: The standard search order of the process will also be
affected by calling the SetDllDirectory function in the parent
process before start of the current process.
In that case, all we have to do is restore the original search path by calling SetDllDirectoryW(NULL). For example:
import ctypes
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
if not kernel32.SetDllDirectoryW(None):
raise ctypes.WinError(ctypes.get_last_error())
[1]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-setdlldirectoryw
[2]: https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-search-order#alternate-search-order-for-desktop-applications
|
msg366324 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-13 19:02 |
You mean the CI agent is doing it? Because there's nowhere in Python that should be doing it.
I'd rather not blanket change this option in the test suite (except to turn it on :) ), so that code snippet probably needs to shrink down into a one-liner that can go with the "-c" just for these tests.
|
msg366327 - (view) |
Author: Eryk Sun (eryksun) * |
Date: 2020-04-13 19:35 |
> You mean the CI agent is doing it? Because there's nowhere in
> Python that should be doing it.
The ProcessParameters->DllPath value is inherited until a process in the tree reverts to the original search path via SetDllDirectoryW(NULL), so it can be any ancestor process, not just the immediate parent process.
> I'd rather not blanket change this option in the test suite (except to
> turn it on :) ), so that code snippet probably needs to shrink down
> into a one-liner that can go with the "-c" just for these tests.
The test could call GetDllDirectoryW [1] to save and restore the previous value. Unfortunately, GetDllDirectoryW returns 0 for both the default case (NULL) and when the working directory is disabled by setting an empty string (""). It would have to assume one or the other.
Otherwise if you just want a one-liner: "import ctypes; ctypes.windll.kernel32.SetDllDirectoryW(None)".
[1]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getdlldirectoryw
|
msg367013 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-22 16:04 |
New changeset 9b498939009f49b8c772c89e8fc80efbfd8afcb5 by Steve Dower in branch 'master':
bpo-40214: Fix ctypes WinDLL test with insecure flags (GH-19652)
https://github.com/python/cpython/commit/9b498939009f49b8c772c89e8fc80efbfd8afcb5
|
msg367014 - (view) |
Author: Steve Dower (steve.dower) * |
Date: 2020-04-22 16:05 |
Looks like that fix has done it. Thanks, Eryk!
|
msg369661 - (view) |
Author: miss-islington (miss-islington) |
Date: 2020-05-22 22:24 |
New changeset 0cc7becde0bfe896fd23b5cb14fedfb8f2066fca by Miss Islington (bot) in branch '3.8':
bpo-40214: Fix ctypes WinDLL test with insecure flags (GH-19652)
https://github.com/python/cpython/commit/0cc7becde0bfe896fd23b5cb14fedfb8f2066fca
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:29 | admin | set | github: 84395 |
2020-05-22 22:24:49 | miss-islington | set | messages:
+ msg369661 |
2020-04-22 16:05:54 | steve.dower | set | status: open -> closed resolution: fixed messages:
+ msg367014
stage: patch review -> resolved |
2020-04-22 16:05:00 | miss-islington | set | pull_requests:
+ pull_request18982 |
2020-04-22 16:04:53 | steve.dower | set | messages:
+ msg367013 |
2020-04-22 10:43:38 | steve.dower | set | pull_requests:
+ pull_request18978 |
2020-04-13 22:30:35 | vstinner | set | nosy:
- vstinner
|
2020-04-13 19:35:07 | eryksun | set | messages:
+ msg366327 |
2020-04-13 19:02:14 | steve.dower | set | messages:
+ msg366324 |
2020-04-13 18:44:55 | eryksun | set | messages:
+ msg366322 |
2020-04-13 18:07:12 | steve.dower | set | messages:
+ msg366320 |
2020-04-13 17:41:34 | steve.dower | set | messages:
+ msg366319 |
2020-04-13 14:47:03 | steve.dower | set | messages:
+ msg366316 |
2020-04-13 12:02:10 | eryksun | set | messages:
+ msg366307 |
2020-04-13 11:39:24 | steve.dower | set | pull_requests:
+ pull_request18851 |
2020-04-09 12:31:56 | miss-islington | set | messages:
+ msg366045 |
2020-04-09 12:27:15 | vstinner | set | nosy:
+ vstinner
messages:
+ msg366044 versions:
+ Python 3.8 |
2020-04-09 12:13:27 | miss-islington | set | nosy:
+ miss-islington
pull_requests:
+ pull_request18804 stage: needs patch -> patch review |
2020-04-08 20:03:27 | aeros | set | messages:
+ msg366003 |
2020-04-08 18:31:23 | steve.dower | set | messages:
+ msg365999 |
2020-04-08 16:14:02 | zach.ware | set | messages:
+ msg365993 |
2020-04-08 04:28:54 | xtreak | set | nosy:
+ xtreak messages:
+ msg365971
|
2020-04-08 04:05:49 | aeros | set | messages:
+ msg365970 |
2020-04-07 22:34:32 | steve.dower | set | messages:
+ msg365955 |
2020-04-07 22:16:57 | steve.dower | set | messages:
+ msg365954 |
2020-04-07 22:14:47 | steve.dower | set | messages:
+ msg365953 |
2020-04-07 17:03:29 | zach.ware | set | messages:
+ msg365919 stage: patch review -> needs patch |
2020-04-07 16:30:03 | zach.ware | set | stage: needs patch -> patch review pull_requests:
+ pull_request18773 |
2020-04-07 15:42:56 | eryksun | set | nosy:
+ eryksun messages:
+ msg365910
|
2020-04-07 10:29:24 | steve.dower | set | messages:
+ msg365902 |
2020-04-07 10:04:51 | aeros | set | messages:
+ msg365900 |
2020-04-07 06:40:23 | zach.ware | set | stage: patch review -> needs patch |
2020-04-07 06:40:04 | zach.ware | set | messages:
+ msg365892 |
2020-04-07 06:09:53 | zach.ware | set | type: behavior stage: patch review |
2020-04-07 06:09:28 | zach.ware | set | nosy:
+ paul.moore, tim.golden, steve.dower messages:
+ msg365890
components:
+ Tests, Windows, ctypes, - Library (Lib) keywords:
- 3.9regression stage: patch review -> (no value) |
2020-04-07 06:07:37 | zach.ware | set | keywords:
+ patch nosy:
+ zach.ware
pull_requests:
+ pull_request18768 stage: patch review |
2020-04-07 03:27:46 | aeros | create | |