classification
Title: Make python slightly more relocatable
Type: Stage: resolved
Components: Interpreter Core Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, mathias, ncoghlan, ronaldoussoren, vstinner
Priority: normal Keywords: patch

Created on 2013-06-26 16:00 by mathias, last changed 2019-10-02 05:47 by mathias.

Files
File name Uploaded Description
python-relative-path-lookup.diff mathias, 2013-06-26 16:00 Proposed change
python-relative-path-lookup-v2.diff mathias, 2013-07-04 07:56
Messages (12)
msg191909 - (view) Author: Mathias Fröhlich (mathias) Date: 2013-06-26 16:00
Hi all,

I want to move Python a bit closer to being relocatable.
One problem to solve is where Python finds its modules.
The usual lookup mechanism compiles in a prefix determined at configure
time, which is used as a last-resort path if the paths are not set
otherwise during application/interpreter startup.
The best-known ways to change the module path at startup time
are probably the environment variables PYTHONPATH and PYTHONHOME.
The Python interpreter itself already tries to interpret argv[0] to find its modules, but it would be nice if an interpreter embedded in an application could also find its module path without the application passing argv[0] to the Python library. This should work even if the application is moved or installed at a path different from the configure-time prefix.

The proposal is to add one more attempt to find the Python modules
just before we resort to the compiled-in prefix, by looking at the path
to the python27.{so,dll}. The Python modules are searched for in the
usual way relative to this shared library file. If no Python modules
are found relative to the Python library file, the compiled-in prefix
is used as the very last resort, as usual.

For architectures where we cannot determine the path of the shared
library file, nothing changes.
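
The lookup order described above could be sketched in Python roughly as follows. This is only an illustration and not the attached patch: the names `has_stdlib` and `choose_prefix`, the `lib/pythonX.Y/os.py` landmark and the default version string are assumptions made for this example.

```python
import os

def has_stdlib(prefix, version="3.4"):
    """Return True if <prefix>/lib/python<version>/os.py exists.

    The os.py landmark and the directory layout are assumptions
    made for this sketch.
    """
    landmark = os.path.join(prefix, "lib", "python" + version, "os.py")
    return os.path.isfile(landmark)

def choose_prefix(explicit_home, exe_prefix, lib_prefix, compiled_prefix,
                  version="3.4"):
    """Pick a prefix in the proposed order.

    explicit_home   -- set by the application or PYTHONHOME, or None
    exe_prefix      -- prefix derived from argv[0], or None
    lib_prefix      -- prefix derived from the shared library path, or
                       None where that path cannot be determined
    compiled_prefix -- configure-time prefix, the very last resort
    """
    if explicit_home:
        return explicit_home
    # The library-based candidate is tried just before the compiled-in prefix.
    for candidate in (exe_prefix, lib_prefix):
        if candidate and has_stdlib(candidate, version):
            return candidate
    return compiled_prefix
```

If `lib_prefix` is `None`, the function behaves exactly like the lookup without the additional step, which matches the "nothing changes" case.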

I have attached a patch that tries to implement this.
It should serve as a base for discussions.
This change is tested on Linux and behaves as expected. The Windows code for this is copied over from another project where I have it actively running, but this Python variant of the code has not even been compile-tested on Windows.

thanks in advance

Mathias
msg191929 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-27 04:34
Hi Mathias.  There is a current proposal (http://www.python.org/dev/peps/pep-0432/) for improving interpreter startup.  So changes in this area are subject to extra caution.  The changes you are talking about are at least indirectly impacted by the proposal, though I expect they are more directly tied to what happens in site.py.

As to your proposal, aren't the embedding needs already addressed?  See http://docs.python.org/2/c-api/intro.html#embedding-python.  Is there some convention for keeping the site files adjacent to the SO/DLL that would warrant your proposed code?

p.s. this would be a new feature so it only applies to Python 3.4.
msg191935 - (view) Author: Mathias Fröhlich (mathias) Date: 2013-06-27 09:17
Hi Eric,

Thanks for looking at that ticket so fast!

Reassigning this to 3.4 is great.

In general, yes, I can already do more or less what I need. This is why I am fine with just about every Python version.

The reason I bring up this change is that I believe I am doing this in an inappropriate place, since I need to know some Python internals to do so, and I think others could probably also benefit from this idea/change.
What I currently do is write an application that just uses python*.so as an embedded interpreter, and this precompiled application may be relocated almost anywhere, wherever it is unpacked.
We currently use the same sort of code to find out where the python*.so file is, and we use Py_SetPythonHome to set the home to the directory where the .so file resides.

Why are we doing this?
It takes the idea that is currently in the standard Python interpreter. The interpreter tries to be relocatable (meaning: you can pack the installation directory, unpack it somewhere else and still be able to run) by looking at argv[0] and dereferencing symbolic links until it arrives at a real file.
Now suppose you want to embed Python; then you no longer use the standard Python interpreter program. You may also use a different installation layout for basic things like bin and lib. So you end up with an application that is no longer able to find the Python modules it ships by looking at the application's path.
But instead of starting from the path of the interpreter (which is not used in this case) or of the application itself, you could start from the path of the Python library and look for your Python installation relative to that. So as long as you keep the relative file layout of everything that is Python related (and only what is Python related: the python.so and the modules) when you pack and unpack your precompiled application, this would just work.

So, to put that in short:
Instead of dynamically finding the Python module path relative to .../bin/python, try to find it relative to .../lib/libpython34.so.
The benefit would be that every application that embeds Python and needs to be relocatable will just work, in the way that only the standard Python interpreter works today.
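
A minimal Python sketch of that search, starting from a file such as .../lib/libpython34.so instead of .../bin/python. It is an illustration under assumptions, not the code from the patch: the function name `find_prefix_from`, the `lib/pythonX.Y/os.py` landmark and the default version are chosen just for this example.

```python
import os

def find_prefix_from(start_file, version="3.4"):
    """Search upwards from start_file for lib/python<version>/os.py.

    Symbolic links are dereferenced first, as the standard interpreter
    does for argv[0]. Returns the directory that contains the landmark,
    or None if the filesystem root is reached without finding it.
    """
    directory = os.path.dirname(os.path.realpath(start_file))
    while True:
        landmark = os.path.join(directory, "lib", "python" + version, "os.py")
        if os.path.isfile(landmark):
            return directory
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            return None
        directory = parent
```

The same function can be fed either the executable path or the shared library path, so only the starting point differs between the two cases.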

I am trying to digest all of the PEP you pointed me to.
As I am seeing this longer document for the first time, I am not sure whether I missed something there, but within that framework my proposal would probably influence the initial setting of 

sys.prefix (?)

if this is not already provided from the embedding application.

And yes I am perfectly fine with a different or more general approach.
The initially attached patch is something that tried to integrate into the current checked in code as I understood it.

Greetings

Mathias
msg191944 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-06-27 11:42
The way we figure out where to find the standard library is crazy, and creating the infrastructure to start making it less crazy is actually one of the prime motivations for PEP 432 :)
msg191994 - (view) Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2013-06-28 14:52
Note that the OSX port already does this for framework builds. I don't know why we don't use the same code for shared library builds.

Issue #15498 contains a patch that switches this code from a deprecated nextstep-era API to dladdr.

Two comments on the patch attached to this issue:

1) The name "_PyImport_GetModulePath" is confusing, I'd use _PyImport_GetSharedLibPath to make clear that this is locating the shared library.

2) The code calls dladdr on a static variable that's introduced just for that, it is also possible to call dladdr on an already existing symbol (for example the address of a function in the public API).
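
For illustration only, the second point can be tried from Python with ctypes. This sketch assumes a platform whose C library provides dladdr (for example Linux or macOS) and returns None elsewhere; `DlInfo` and `shared_library_path` are names made up for this example, and Py_IsInitialized is just one convenient public API symbol.

```python
import ctypes
import os

class DlInfo(ctypes.Structure):
    """Mirror of the C Dl_info struct filled in by dladdr()."""
    _fields_ = [
        ("dli_fname", ctypes.c_char_p),  # path of the object containing addr
        ("dli_fbase", ctypes.c_void_p),  # load address of that object
        ("dli_sname", ctypes.c_char_p),  # name of the nearest symbol
        ("dli_saddr", ctypes.c_void_p),  # address of the nearest symbol
    ]

def shared_library_path():
    """Return the file that contains Py_IsInitialized, or None.

    For a shared build this is the Python library; for a static build
    it is the executable itself.
    """
    try:
        dladdr = ctypes.CDLL(None).dladdr
    except (OSError, AttributeError, TypeError):
        return None  # dladdr is not available on this platform
    dladdr.argtypes = [ctypes.c_void_p, ctypes.POINTER(DlInfo)]
    dladdr.restype = ctypes.c_int
    # Use an existing public API function instead of a dedicated static variable.
    addr = ctypes.cast(ctypes.pythonapi.Py_IsInitialized, ctypes.c_void_p).value
    info = DlInfo()
    if dladdr(addr, ctypes.byref(info)) == 0 or not info.dli_fname:
        return None
    return os.fsdecode(info.dli_fname)
```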
msg192277 - (view) Author: Mathias Fröhlich (mathias) Date: 2013-07-04 07:56
Hi Ronald, Eric, Nick,

Looking up the symbol name of the current function should also work.
And I am happy to rename these functions to whatever you like.

Attached is version 2 of the patch with the suggested changes.
The windows implementation is still untested.

It would be interesting to know whether this kind of lookup scheme can be included in PEP 432.
Given the spirit of this PEP, I can imagine providing several functions that build up the Python path from a given starting point.
For example, there could be a 'get python path from argv[0]', a 'get python path from shared python library' and a 'get python path from prefix' function (I may be missing a variant). A 'build python path from python home root entry point' function would also be useful and could be used by the functions above.
An application embedding Python can then call whichever of those functions make sense for its own use and installation scheme to set the module path in the PyConfig struct. The Py_ReadConfig function would internally use the functions suggested above to build up the default configuration if it is not already provided.
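
As a rough sketch of the last of these helpers, a 'build python path from python home' function could look like this in Python. The function name `build_module_path` is hypothetical, and the simplified POSIX-style layout (zip archive, standard library directory, lib-dynload) is an assumption for this example; it is not part of PEP 432.

```python
import os

def build_module_path(home, version=(3, 4)):
    """Build a default module search path below a given Python home.

    The layout is a simplified POSIX-style one and an assumption of
    this sketch:
      <home>/lib/pythonXY.zip
      <home>/lib/pythonX.Y
      <home>/lib/pythonX.Y/lib-dynload
    """
    major, minor = version
    lib = os.path.join(home, "lib")
    stdlib = os.path.join(lib, "python%d.%d" % (major, minor))
    return [
        os.path.join(lib, "python%d%d.zip" % (major, minor)),
        stdlib,
        os.path.join(stdlib, "lib-dynload"),
    ]
```

The 'from argv[0]', 'from shared library' and 'from prefix' variants would then differ only in how they determine `home` before calling this function.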

Greetings

Mathias
msg353243 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-26 00:47
PEP 587 "Python Initialization Configuration" has been implemented in Python 3.8. It provides fine-grained control over the "Path Configuration":

* https://docs.python.org/dev/c-api/init_config.html
* https://docs.python.org/dev/c-api/init_config.html#init-path-config
msg353267 - (view) Author: Mathias Fröhlich (mathias) Date: 2019-09-26 06:31
Hi,

Nice to see some progress.
Still, I checked today's https://github.com/python/cpython.git master and 3.8 branches (is that the current CPython development code?). Neither of them contains a call to dladdr besides the macOS code path mentioned in msg191994 by Ronald Oussoren, which has done this for a long time.
From the lack of dladdr, I conclude that the core idea of my request here is not solved.

Maybe I should rephrase that. The basic idea behind the request was to make
Python's default way of setting up the paths required to find the Python modules depend on the place where the Python library resides instead of the Python executable program. I do not mean the compile-time prefix but the actual location of the shared object in the file system.
That would help to build applications that embed CPython and ship and unpack the whole application tree, including the Python modules, to a custom location not known at compile time, while still preserving the subtree structure containing the Python shared library and the Python modules. Note that this patch contained code to make that work from within Python without custom code in the embedding application. Doing that on the side of the embedding and calling application was always possible and still is possible - but that was not the point.

best

Mathias
msg353276 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-26 07:29
Hum, I am confused. I understood that this issue is about customizing sys.path when Python is embedded. But it seems like the feature request is more about the *default* implementation, not how to reimplement it outside Python (with custom code).
msg353288 - (view) Author: Mathias Fröhlich (mathias) Date: 2019-09-26 10:11
Yes.

msg191944 from Nick Coghlan made me think that, with all the initialization rework that appeared to be underway, you might want to incorporate the idea presented here of basing the default on the location of the libpython.so or the pythonX.X.dll instead of the location of python/python.exe.
And as mentioned by Ronald Oussoren, that would even align the methods used across the architectures to something common, with a fallback to the current way that takes the path of the interpreter executable.
At least that is what the provided patch implemented in the old code structure.

And this does not even change the default for the common case where the default is plainly useful. It just changes how the default is determined, so that the default for the case of an embedded interpreter is more meaningful.

As stated somewhere above, you can do that with application code when setting up the embedded interpreter, but it would be nice if it just worked out of the box and thereby helped applications whose authors have not thought of that solution.

best

Mathias
msg353311 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-09-26 14:47
My plan is not to change the default implementation that calculates the path configuration, but to make it easier to customize the path configuration.

One idea is to rewrite Modules/getpath.c and PC/getpathp.c in Python and convert them to a frozen module. It is easier to modify Python code than C code. In the past, we already made such a change for importlib (which also has a frozen part, importlib._bootstrap and importlib._bootstrap_external).

The PEP 587 implementation moves towards that with the "Multi-Phase Initialization Private Provisional API":
https://docs.python.org/dev/c-api/init_config.html#multi-phase-initialization-private-provisional-api
msg353720 - (view) Author: Mathias Fröhlich (mathias) Date: 2019-10-02 05:47
OK so far.
But what shall I do now?
It would be nice if Python were a bit smarter in finding its increasingly important module files when embedded in an application.
Is there anybody out there who wants to look at that contribution?
best
Mathias
History
Date User Action Args
2019-10-02 05:47:38 mathias set messages: + msg353720
2019-09-26 14:47:00 vstinner set messages: + msg353311
2019-09-26 10:11:35 mathias set messages: + msg353288
2019-09-26 07:29:21 vstinner set messages: + msg353276
versions: + Python 3.9, - Python 3.8
2019-09-26 06:31:15 mathias set status: closed -> open
resolution: fixed ->
messages: + msg353267
2019-09-26 00:47:29 vstinner set status: open -> closed
versions: + Python 3.8, - Python 3.4
nosy: + vstinner
messages: + msg353243
resolution: fixed
stage: resolved
2013-07-04 07:56:08 mathias set files: + python-relative-path-lookup-v2.diff
messages: + msg192277
2013-06-28 14:52:47 ronaldoussoren set nosy: + ronaldoussoren
messages: + msg191994
2013-06-27 11:42:00 ncoghlan set messages: + msg191944
2013-06-27 09:17:32 mathias set messages: + msg191935
2013-06-27 04:34:44 eric.snow set versions: - Python 2.6, Python 3.1, Python 2.7, Python 3.2, Python 3.3, Python 3.5
nosy: + ncoghlan, eric.snow
messages: + msg191929
components: + Interpreter Core, - Installation
2013-06-26 16:00:49 mathias create