Issue34725
Created on 2018-09-18 17:26 by mariofutire, last changed 2022-04-11 14:59 by admin.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
poc.c | mariofutire, 2018-09-18 17:26 | Example of odd behaviour |
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 9860 | merged | steve.dower, 2018-10-13 23:57 | |
PR 9861 | merged | steve.dower, 2018-10-13 23:58 |
Messages (23) | |||
---|---|---|---|
msg325666 - (view) | Author: Mario (mariofutire) | Date: 2018-09-18 17:26 | |
According to the docs, Py_GetProgramFullPath() should return the full path of the program name as set by Py_SetProgramName(). https://docs.python.org/3/c-api/init.html#c.Py_GetProgramFullPath This works well on Linux, but on Windows it is always the name of the current executable (from GetModuleFileNameW). This is because the two files Modules/getpath.c and PC/getpathp.c have completely different logic in calculate_program_full_path() vs get_program_full_path(). This difference is harmless when running the normal interpreter (python.exe), but can be quite dramatic when embedding Python into a C application. The value returned by Py_GetProgramFullPath() is the same as sys.executable in Python. Why does this matter? For instance, on Linux virtual environments work out of the box for embedded applications, while they are completely ignored on Windows. If I run python -m venv abcd and then run my app inside the (activated) abcd environment, on Linux I can access the same modules as if I were executing python, while on Windows I still get the system module search path. If you execute the attached program on Linux you get: EXECUTABLE /tmp/abcd/bin/python3 PATH ['/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/tmp/abcd/lib/python3.7/site-packages'] On Windows you get: EXECUTABLE c:\TEMP\vsprojects\ConsoleApplication1\x64\Release\ConsoleApplication1.exe PATH ['C:\\TEMP\\venv\\abcd\\Scripts\\python37.zip', 'C:\\Python37\\Lib', 'C:\\Python37\\DLLs', 'c:\\TEMP\\vsprojects\\ConsoleApplication1\\x64\\Release', 'C:\\Python37', 'C:\\Python37\\lib\\site-packages'] That is a mixture of paths from the venv, the system and my app folder. More importantly, site-packages comes from the system (bad!). This is because site.py at line 454 uses the path of the interpreter to locate the venv configuration file. So in the end, virtual environments work out of the box on Linux even for an embedded Python, but not on Windows. |
|||
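The dependency on sys.executable described in the report can be illustrated with a simplified sketch of the lookup that site.py performs. This is not the actual site.py code: the name find_venv_config is made up for illustration, and the real module goes on to read the file and adjust sys.prefix and sys.path. The sketch only shows why an embedding executable that lives outside the venv's bin or Scripts directory never finds pyvenv.cfg.

```python
import os


def find_venv_config(executable):
    """Return the path of the pyvenv.cfg that applies to `executable`, or None.

    Simplified illustration (hypothetical helper, not part of site.py):
    look next to the executable first, then one directory above it.
    """
    exe_dir = os.path.dirname(os.path.abspath(executable))
    site_prefix = os.path.dirname(exe_dir)
    for directory in (exe_dir, site_prefix):
        candidate = os.path.join(directory, "pyvenv.cfg")
        if os.path.isfile(candidate):
            return candidate
    return None
```

Called with a path such as /tmp/abcd/bin/python3 it would find /tmp/abcd/pyvenv.cfg if that file exists; called with the path of an application built elsewhere it returns None, which matches the Windows result reported above.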
msg325668 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-09-18 18:24 | |
That executable doesn't appear to be in a virtual environment - you should be running C:\TEMP\venv\abcd\Scripts\python.exe Does that resolve your problem? |
|||
msg325669 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-09-18 18:26 | |
(Also, the behavior of Py_GetProgramFullPath is intentional, but we do have another bug somewhere to be able to override it for embedding purposes. sys.executable should be None when it does not contain a suitable path for running the normal Python interpreter again. I haven't searched for that bug just now, but we should find it and track the issue there, rather than creating a different issue.) |
|||
msg325674 - (view) | Author: Mario (mariofutire) | Date: 2018-09-18 19:42 | |
Nope, I am *not* running python, I am running a C app which embeds the Python interpreter. I am running exactly c:\TEMP\vsprojects\ConsoleApplication1\x64\Release\ConsoleApplication1.exe In a later comment you say the behaviour of Py_GetProgramFullPath is intentional: which behaviour? Windows? Linux? Or the fact that they behave differently? I guess that if there were a way to force Py_GetProgramFullPath() it would solve my problem, because I could direct site.py towards the correct virtual environment. If sys.executable becomes None for embedded Python (without the ability to set it), then virtual environments won't work at all, which would be sad. |
|||
msg326035 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-09-21 20:44 | |
I meant returning the full name of the process is intentional. But you're right that overriding it should actually override it. I found the prior bug at issue33180, but I'm closing it in favour of this one. I don't have fully fleshed out semantics in my mind for all the cases to handle here, but I hope that we soon reach a point of drastically simplifying getpath and can align the platforms better at that point. Meanwhile I'll leave this open in case anyone wants to work on a targeted fix. |
|||
msg326042 - (view) | Author: Mario (mariofutire) | Date: 2018-09-21 21:02 | |
So you are saying that the Windows behaviour (plus the ability to override it) is intentional. This seems to me to contradict what the docs say under https://docs.python.org/3/c-api/init.html#c.Py_GetProgramFullPath. Moreover, I am not sure what Py_SetProgramName() is meant to do then. The problem in my opinion is that we are trying to fit two things in the same field: the real executable name and the root of the Python installation (which could be a virtual environment as well). In python.exe the two are the same (or linked), but for embedded applications they are not. Remember that site.py uses sys.executable as the "root of the python installation" to derive the path and handle virtual environments. I think that if these two concepts were separated, it would be much easier to explain the desired behaviour and find a valid implementation on Windows and Linux. Let's say sys.executable is the full name of the process and sys.python_root is the folder from which to derive all the paths. It is probably too big of a change, but it might be useful to write down the ideal behaviour before thinking of a pragmatic solution. Andrea |
|||
msg326970 - (view) | Author: Mario (mariofutire) | Date: 2018-10-03 14:05 | |
Is there any agreement on what is wrong with the current code? The key in my opinion is the double purpose of sys.executable, and that on Linux and Windows people have taken the two different points of view, so they are both right and wrong at the same time. |
|||
msg327000 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-10-03 18:26 | |
I don't think anything has been agreed upon. Currently, the launched program name is used for some things other than setting sys.executable, and I believe it should continue to be used for those. But there are also needs for overriding sys.executable to be something other than the current process (e.g. a launcher that simply loads Python into its own process, but needs a different process to be used for multiprocessing support). Victor has been looking at the initialization process, so I'm not sure if something has already changed here yet. I'd be keen to see the getpath part of initialization be written in (frozen or limited) Python code that can be easily overridden by embedders to initialize all of these members however they like. That way everyone can equally lie about argv0/GetModuleFullPath and sys.prefix/sys.executable/etc. Until we get there, we may just need a couple more configuration fields, and perhaps some that default to one of the others when unspecified. |
|||
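For the multiprocessing case mentioned above, the standard library already has a narrower knob: multiprocessing.set_executable() tells multiprocessing which interpreter to launch for child processes that are started as new interpreter processes, independently of what the host process is. The sketch below assumes a hypothetical embedder that knows a list of candidate interpreter paths; the helper name and the candidates are illustrative only, and this does not change sys.executable itself.

```python
import multiprocessing
import os


def use_interpreter_for_children(candidates):
    """Point multiprocessing at the first candidate path that exists.

    `candidates` is a hypothetical list of interpreter paths supplied by
    the embedding application. Returns the chosen path, or None if no
    candidate exists (in which case nothing is changed).
    """
    for path in candidates:
        if os.path.isfile(path):
            multiprocessing.set_executable(path)
            return path
    return None
```

This only covers multiprocessing; other code that re-launches sys.executable would still see whatever value the path configuration produced.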
msg327081 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-10-04 20:06 | |
Reading the docs, I'm pretty sure we need a new Py_SetProgramFullPath() function. Py_SetProgramName explicitly is only providing a hint to figure out the file containing the executable, and I really want this to make my new launcher feasible: https://github.com/zooba/cpython/blob/msix/Programs/launch.c Victor - I've tried for an hour now and I can't figure out where to put this value in all the new configuration stuff. I'm finding it *very* convoluted, with so much copying of config structs and then back-and-forth copying certain values around. Some guidance would be great. |
|||
msg327249 - (view) | Author: Alyssa Coghlan (ncoghlan) * | Date: 2018-10-06 14:42 | |
Directly addressing the topic of the bug: Py_SetProgramName() should be a relative or absolute path that can be used to set sys.executable and other values appropriately. This is used in Programs/_testembed.c for example. I didn't know it didn't work the same way on Windows as it does on other platforms, and I have no idea why it's different there. (The divergence between the Windows and *nix implementations of getpath predates my own involvement in startup sequence modifications, and I've never even read the Windows version of the code.) On the startup sequence refactoring in general: Yeah, eventually being able to eliminate getpath.c in favour of a frozen _getpath.py module has been one of my long term hopes for the PEP 432 startup sequence refactoring. The underlying issue making that difficult is that it's always been murky as to exactly what Python code could safely execute at the point where that path information needs to be calculated, and the tests of path configuration are weak enough that it's easy to introduce regressions even with small changes, let alone a wholesale rewrite. If a new setting is genuinely needed, then where to put things in the new config is still open for discussion - at the moment, it's pretty much just a straight transcription of the way CPython has historically done things, and is hence heavy on the use of low level C data types (especially wchar* where paths are concerned). This means that the CoreConfig struct currently still contains a lot of things that aren't actually needed if all you want is a running Python interpreter and can live without a fully populated sys module. The *advantage* of that approach is that it means it still maps pretty easily to the existing Py_Initialize approach: the Py_Set* API writes to a global copy of the CoreConfig struct, and then Py_Initialize reads that in to the active runtime state. |
|||
msg327364 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-10-08 16:54 | |
> Py_SetProgramName() should be a relative or absolute path that can be used to set sys.executable and other values appropriately. Key point here is *can be*, but it doesn't have to be. Given it has fallbacks all the way to "python"/"python3", we can't realistically use it as sys.executable just because it has a value. And right now, it's used to locate the current executable (which is unnecessary on Windows), which is then assumed to be correct for sys.executable. Most embedding cases require *this* assumption to be overridden, not the previous assumption. |
|||
msg327370 - (view) | Author: Mario (mariofutire) | Date: 2018-10-08 20:19 | |
I would still like my use case to be acknowledged. site.py uses the value of sys.executable to set up a virtual environment, which is a very valuable thing even in embedded cases. This constraint is strong enough to force it to point to python.exe or python3, as it would normally do in a scripted (non-embedded) case. I still believe the two concepts should be decoupled to avoid them clashing and having supporters of one disagreeing with supporters of the other. Andrea |
|||
msg327447 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-10-10 00:11 | |
We'll need to bring in venv specialists to check whether using it outside of Py_Main() is valid. Or perhaps you could explain what you are actually trying to do? I don't believe it is necessary when you are calling Py_SetPath yourself, and only the "launch normally with alternate args" case for scripts that use sys.executable is affected. But I'm happy to be set right here (with example scenarios, preferably). |
|||
msg327489 - (view) | Author: Mario (mariofutire) | Date: 2018-10-10 19:28 | |
Sure: 1) Create a virtual environment ("python -m venv"). 2) Activate it. 3) Pip install some modules. 4) Try to use them from inside an embedded application (e.g. the one I attached). 5) Do it on Linux and Windows. Result: works on Linux, fails on Windows. The reason is in site.py https://github.com/python/cpython/blob/73870bfeb9cf350d84ee88bd25430c104b3c6191/Lib/site.py#L462 where sys.executable is used to construct the correct search path. Looking at sys.path from inside an embedded application is very instructive, and the first post shows why it fails on Windows. Andrea |
|||
msg327659 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-10-13 16:37 | |
I meant why are you using an embedded application with a virtual environment? What sort of application do you have that requires users to configure a virtual environment, rather than providing its own set of libraries? The embedding scenarios I'm aware of almost always want privacy/isolation from whatever a user has installed/configured, so that they can work reliably even when users modify other parts of their own system. I'm trying to understand what scenario (other than "I am an interactive Python shell") would want to automatically pick up the configuration rather than having its own configuration files/settings. |
|||
msg327701 - (view) | Author: Mario (mariofutire) | Date: 2018-10-14 10:23 | |
Does it really matter who owns main(), whether it is in python.exe or in some other C app? This is exactly what you described: users want to use some C application which will call into Python, using some (user defined) Python modules to execute some tasks which are scriptable. And they want to be able to do so in a confined environment where they can install the exact set of packages they require. It is also possible to set up multiple environments at the same time, where different versions are tested independently. There is also the totally independent scenario where the app ships exactly what it needs, but there are cases in between where one can script an app and in doing so might need packages that the app itself knew nothing about. For another example have a look at JEP https://github.com/ninia/jep/search?q=virtual&unscoped_q=virtual This is a way to call Python from Java: same problem as above, people might want to run it in a virtual environment and the only way to do this now is to manually set up PYTHONHOME, but it is pretty weak and does not replicate exactly what happens with virtual environments (e.g. inherit system's site-packages).
Again, on Linux, JEP works out of the box with no need to tell it about virtual environments: Py_Initialize() finds them (if they are indeed present) with absolutely no extra configuration (no need to change PYTHONPATH). Andrea |
|||
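To make the "inherit system's site-packages" point concrete: a venv records that choice in its pyvenv.cfg, which is a plain list of key = value lines. The sketch below reads such a file in roughly the way site.py does. The helper names are hypothetical, treating a missing key as "true" is an assumption of this sketch, and the real code goes on to modify sys.prefix and sys.path, which is omitted here.

```python
def read_venv_config(path):
    """Parse a pyvenv.cfg file into a dict of lower-cased keys to values.

    Hypothetical helper for illustration; lines without '=' are skipped.
    """
    settings = {}
    with open(path, encoding="utf-8") as config_file:
        for line in config_file:
            if "=" in line:
                key, _, value = line.partition("=")
                settings[key.strip().lower()] = value.strip()
    return settings


def inherits_system_site_packages(settings):
    """Return True if the venv also wants the system site-packages.

    Assumption of this sketch: a missing key counts as "true".
    """
    value = settings.get("include-system-site-packages", "true")
    return value.lower() == "true"
```

Setting PYTHONHOME by hand bypasses this file entirely, which is one reason it does not reproduce what a real virtual environment does.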
msg328054 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-10-19 17:04 | |
I requested Victor's review on my PR, but if anyone else is able to review it, please feel free. |
|||
msg330036 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-11-18 04:42 | |
New changeset 177a41a07b7d13c70d068ea0962f07e625ae171e by Steve Dower in branch 'master': bpo-34725: Adds _Py_SetProgramFullPath so embedders may override sys.executable (GH-9860) https://github.com/python/cpython/commit/177a41a07b7d13c70d068ea0962f07e625ae171e |
|||
msg330037 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-11-18 04:42 | |
New changeset e851049e0e045b5e0f9d5c6b8a64d7f6b8ecc9c7 by Steve Dower in branch '3.7': bpo-34725: Adds _Py_SetProgramFullPath so embedders may override sys.executable (GH-9861) https://github.com/python/cpython/commit/e851049e0e045b5e0f9d5c6b8a64d7f6b8ecc9c7 |
|||
msg330038 - (view) | Author: Steve Dower (steve.dower) * | Date: 2018-11-18 04:44 | |
The next releases of 3.7 and 3.8 will include a _Py_SetProgramFullPath() function for embedders to set the eventual value of sys.executable before calling Py_Initialize(). It's undocumented and not guaranteed stable (and indeed, it looks like Victor is already working on another patch that may see it removed before it's ever released), but it's in now as a workaround for the cases that need it. |
|||
msg343644 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-05-27 15:21 | |
I understand that this issue is now fixed in bpo-36763 by the implementation of PEP 587, which adds a new public API for the "Python Initialization Configuration". It provides a finer API to configure the "Path Configuration". For example, PyConfig.executable can be used to replace _Py_SetProgramFullPath(). In Python 3.7, the private _Py_SetProgramFullPath() function can be used as a workaround. I close the issue. If I misunderstood the issue, please comment/reopen it ;-) |
|||
msg343681 - (view) | Author: Mario (mariofutire) | Date: 2019-05-27 20:02 | |
Unfortunately the underlying cause of this issue has been neither addressed nor discussed. There is now a way to work around the different behaviour on Windows and Linux, and it is possible to use the new call to make virtual environments work on Windows as they already do on Linux. The problem is that applications will have to be changed to actually implement the workaround. I still think this difference should be addressed directly. |
|||
msg343691 - (view) | Author: STINNER Victor (vstinner) * | Date: 2019-05-27 21:30 | |
I read the issue again. In short, the Path Configuration is a mess and unusable in some cases :-) I reopen the issue. venv shouldn't be handled by the site module, which is optional, but earlier. I guess that venv support was added to site because it is way easier to write Python than to touch getpath.c, which is written in C.
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:06 | admin | set | github: 78906 |
2019-05-27 21:30:46 | vstinner | set | status: closed -> open resolution: fixed -> messages: + msg343691 |
2019-05-27 20:02:02 | mariofutire | set | messages: + msg343681 |
2019-05-27 15:21:20 | vstinner | set | status: open -> closed resolution: fixed messages: + msg343644 stage: patch review -> resolved |
2018-11-18 04:44:39 | steve.dower | set | messages: + msg330038 |
2018-11-18 04:42:12 | steve.dower | set | messages: + msg330037 |
2018-11-18 04:42:01 | steve.dower | set | messages: + msg330036 |
2018-10-19 17:04:38 | steve.dower | set | messages: + msg328054 |
2018-10-14 10:23:29 | mariofutire | set | messages: + msg327701 |
2018-10-13 23:59:25 | steve.dower | set | versions: + Python 3.8, - Python 3.6 |
2018-10-13 23:58:40 | steve.dower | set | pull_requests: + pull_request9230 |
2018-10-13 23:57:11 | steve.dower | set | keywords: + patch stage: patch review pull_requests: + pull_request9229 |
2018-10-13 16:37:16 | steve.dower | set | messages: + msg327659 |
2018-10-10 19:28:14 | mariofutire | set | messages: + msg327489 |
2018-10-10 00:11:29 | steve.dower | set | messages: + msg327447 |
2018-10-08 20:19:22 | mariofutire | set | messages: + msg327370 |
2018-10-08 16:54:26 | steve.dower | set | messages: + msg327364 |
2018-10-06 14:42:17 | ncoghlan | set | messages: + msg327249 |
2018-10-05 17:09:45 | steve.dower | set | nosy: + ncoghlan, eric.snow |
2018-10-04 20:06:57 | steve.dower | set | messages: + msg327081 |
2018-10-03 18:26:29 | steve.dower | set | nosy: + vstinner messages: + msg327000 |
2018-10-03 14:05:04 | mariofutire | set | messages: + msg326970 |
2018-09-21 21:02:13 | mariofutire | set | messages: + msg326042 |
2018-09-21 20:44:31 | steve.dower | set | messages: + msg326035 |
2018-09-18 19:42:11 | mariofutire | set | messages: + msg325674 |
2018-09-18 18:26:18 | steve.dower | set | messages: + msg325669 |
2018-09-18 18:24:38 | steve.dower | set | messages: + msg325668 |
2018-09-18 17:26:47 | mariofutire | create |