classification
Title: [C API] Add PyInterpreterState_SetConfig(): reconfigure an interpreter
Type:
Stage: patch review
Components: C API
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: cmeyer, ncoghlan, serhiy.storchaka, shihai1991, vstinner
Priority: normal
Keywords: patch

Created on 2020-11-04 14:45 by vstinner, last changed 2020-11-24 23:41 by vstinner.

Pull Requests
URL Status Linked Edit
PR 23149 merged vstinner, 2020-11-04 14:52
PR 23150 merged vstinner, 2020-11-04 15:16
PR 23158 merged vstinner, 2020-11-04 23:13
PR 23167 merged vstinner, 2020-11-05 15:56
PR 23168 merged vstinner, 2020-11-05 17:33
PR 23169 open vstinner, 2020-11-05 18:49
PR 23211 merged vstinner, 2020-11-09 23:55
PR 23220 merged vstinner, 2020-11-10 13:28
PR 23249 merged vstinner, 2020-11-12 13:54
PR 23488 merged serhiy.storchaka, 2020-11-24 08:36
Messages (19)
msg380327 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-04 14:45
This issue is a follow-up of PEP 587, which introduced the PyConfig C API, and is related to PEP 432, which wants to rewrite Modules/getpath.c in Python.

I would like to add a new PyInterpreterState_SetConfig() function to be able to reconfigure a Python interpreter in C. Examples include writing a custom sys.path or implementing a virtual environment (a common request for embedded Python). Currently, it's really complex to tune the Python configuration.

The use case is to tune Python for embedded Python. First, I would like to add new functions to the C API for that:

* PyInterpreterState_GetConfigCopy()
* PyInterpreterState_SetConfig()

The second step will be to expose these two functions in Python (I'm not sure where for now), which gives the ability to tune the Python configuration in pure Python.

The site module already does that for sys.path, but it runs "too late" in the Python initialization. Here the idea is to configure Python before it accesses any file on disk, after the "core" initialization and before the "main" initialization.

One concrete example would be to reimplement Modules/getpath.c in Python, convert it to a frozen module, and run it at Python startup to populate sys.path. It would allow moving some of the site code into this module to run it earlier.
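The idea of computing sys.path from Python code can be pictured with a small sketch. It builds the three default search path entries of a POSIX-style installation from a prefix and an exec_prefix. The function name compute_search_paths() and its parameters are hypothetical, and the real logic in Modules/getpath.c handles many more cases (environment variables, virtual environments, build trees, Windows), so treat this as an illustration only.

```python
import os.path


def compute_search_paths(prefix, exec_prefix, version=(3, 10)):
    """Return default-looking module search paths (hypothetical helper).

    Only mirrors the usual POSIX layout; it is not the getpath.c algorithm.
    """
    major, minor = version
    libdir = os.path.join(prefix, "lib")
    return [
        # Zipped standard library, e.g. <prefix>/lib/python310.zip
        os.path.join(libdir, f"python{major}{minor}.zip"),
        # Pure Python standard library, e.g. <prefix>/lib/python3.10
        os.path.join(libdir, f"python{major}.{minor}"),
        # Extension modules live under exec_prefix
        os.path.join(exec_prefix, "lib", f"python{major}.{minor}", "lib-dynload"),
    ]
```

An embedder could call such a function with the directory of its bundled Python distribution as both prefix and exec_prefix, then append its own plug-in directories to the result.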

Pseudo-code in C:
---------------------
void init_core(void)
{
  // "Core" initialization
  PyConfig config;
  PyConfig_InitPythonConfig(&config);
  config._init_main = 0;
  Py_InitializeFromConfig(&config);
  PyConfig_Clear(&config);
}

void tune_config(void)
{
  PyConfig config;
  PyConfig_InitPythonConfig(&config);

  // Get a copy of the current configuration
  PyInterpreterState_GetConfigCopy(&config);  // <== NEW API!

  // ... put your code to tune config ...

  // dummy example, currently not possible in Python
  config.bytes_warning = 1;

  // Reconfigure Python with the updated configuration
  PyInterpreterState_SetConfig(&config);  // <=== NEW API!
  PyConfig_Clear(&config);
}
  
int main()
{
  init_core();
  tune_config(); // <=== THE USE CASE!
  _Py_InitializeMain();
  return Py_RunMain();
}
---------------------

In this example, tune_config() is implemented in C. But later, it will be possible to convert the configuration to a Python dict and run Python code to tune the configuration.
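As a rough illustration of the Python side of this idea, the sketch below treats the configuration as a plain dict and edits it with ordinary Python code. The keys mirror PyConfig field names (bytes_warning, module_search_paths, module_search_paths_set). The tune_config() helper and the convention of passing a dict in and getting a tuned copy back are assumptions made for the example, not an API defined in this issue.

```python
def tune_config(config, extra_paths):
    """Return a tuned copy of a PyConfig-like dict (hypothetical helper)."""
    tuned = dict(config)

    # Same dummy change as in the C pseudo-code.
    tuned["bytes_warning"] = 1

    # Append custom search paths without duplicating existing entries.
    paths = list(tuned.get("module_search_paths", []))
    for path in extra_paths:
        if path not in paths:
            paths.append(path)
    tuned["module_search_paths"] = paths

    # Mark the search paths as explicitly provided.
    tuned["module_search_paths_set"] = 1
    return tuned
```

Returning a copy instead of mutating the input keeps the original configuration available if the tuned one has to be rejected.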

PEP 587 added a "Multi-Phase Initialization Private Provisional API":

* PyConfig._init_main = 0
* _Py_InitializeMain()

https://docs.python.org/dev/c-api/init_config.html#multi-phase-initialization-private-provisional-api
msg380328 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-04 15:15
New changeset cfb41e80c1ac5940ec6f2246c9ab4a3d16ef757e by Victor Stinner in branch 'master':
bpo-42260: Reorganize PyConfig (GH-23149)
https://github.com/python/cpython/commit/cfb41e80c1ac5940ec6f2246c9ab4a3d16ef757e
msg380341 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-04 16:34
New changeset af1d64d9f7a7cf673279725fdbaf4adcca51d41f by Victor Stinner in branch 'master':
bpo-42260: Main init modify sys.flags in-place (GH-23150)
https://github.com/python/cpython/commit/af1d64d9f7a7cf673279725fdbaf4adcca51d41f
msg380382 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-04 23:46
New changeset 048a35659aa8074afe7d7d054e7cea1f8ee6d366 by Victor Stinner in branch 'master':
bpo-42260: Add _PyInterpreterState_SetConfig() (GH-23158)
https://github.com/python/cpython/commit/048a35659aa8074afe7d7d054e7cea1f8ee6d366
msg380421 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-05 17:12
New changeset f3cb81431574453aac3b6dcadb3120331e6a8f1c by Victor Stinner in branch 'master':
bpo-42260: Add _PyConfig_FromDict() (GH-23167)
https://github.com/python/cpython/commit/f3cb81431574453aac3b6dcadb3120331e6a8f1c
msg380424 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-05 17:58
New changeset dc42af8fd16b10127ce1fc93c13bc1bfd2674aa2 by Victor Stinner in branch 'master':
bpo-42260: PyConfig_Read() only parses argv once (GH-23168)
https://github.com/python/cpython/commit/dc42af8fd16b10127ce1fc93c13bc1bfd2674aa2
msg380657 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-10 12:22
New changeset 9e1b828265e6bfb58f1e0299bd78d8ff6347a2ba by Victor Stinner in branch 'master':
bpo-42260: Compute the path config in the main init (GH-23211)
https://github.com/python/cpython/commit/9e1b828265e6bfb58f1e0299bd78d8ff6347a2ba
msg380658 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-10 12:43
If we remove Modules/getpath.c, it will no longer be possible to automatically compute the path configuration when one of the following getter functions is called:

* Py_GetPath()
* Py_GetPrefix()
* Py_GetExecPrefix()
* Py_GetProgramFullPath()
* Py_GetPythonHome()
* Py_GetProgramName()

It means that these functions would return NULL if called before Python is initialized, but return the expected string once Python is initialized.

Moreover, Py_SetPath() would no longer automatically compute the "program full path" (sys.executable).
msg380659 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-10 12:48
> If we remove Modules/getpath.c, it will no longer be possible to automatically compute the path configuration when one of the following getter functions is called: (...)

It is not really an incompatible change according to the documentation:

"Note: The following functions should not be called before Py_Initialize(): Py_EncodeLocale(), Py_GetPath(), Py_GetPrefix(), Py_GetExecPrefix(), Py_GetProgramFullPath(), Py_GetPythonHome(), Py_GetProgramName() and PyEval_InitThreads().".

https://docs.python.org/dev/c-api/init.html
msg380707 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-10 20:10
New changeset ace3f9a0ce7b9fe8ae757fdd614f1e7a171f92b0 by Victor Stinner in branch 'master':
bpo-42260: Fix _PyConfig_Read() if compute_path_config=0 (GH-23220)
https://github.com/python/cpython/commit/ace3f9a0ce7b9fe8ae757fdd614f1e7a171f92b0
msg380728 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-10 23:15
The main drawback of rewriting Modules/getpath.c as Lib/_getpath.py (and removing getpath.c) is that PyConfig_Read() could no longer compute the Python Path Configuration. It would return an "empty" path configuration.
msg380773 - (view) Author: Chris Meyer (cmeyer) * Date: 2020-11-11 16:52
Responding to your request for feedback on Python-Dev:

We embed Python dynamically by finding the libPython DLL, loading it, and looking up the required symbols. We make appropriate defines so that the Python headers (and NumPy headers) point to our functions, which in turn point to the looked-up symbols.

Our launcher works on Linux, macOS, and Windows and works with many environments including standard Python and conda and brew. It also supports virtual environments in most cases. Also, a single executable [per platform] is able to work with Python versions 3.7 - 3.9 (3.6 was recently dropped, but only for external reasons).

So my comment is not directly addressing the usefulness of configuring Python initialization - but I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism.

As another note, the main issues we run into are configuring the Python path to properly find packages and DLLs. A goal of ours is to be able to provide the base application as a drag-and-drop style installer with its own full embedded Python distribution (but still loaded dynamically) and then be able to supply additional plug-in packages (Python packages) by drag and drop. This is somewhat similar to conda packaging but without support for command line tools.
msg380775 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-11 17:07
> I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism.

I don't plan to remove any feature :-)

> As another note, the main issues we run into are configuring the Python path to properly find packages and DLLs.

Do you mean sys.path? If yes, that's one of the goals of this issue: to allow you to write your own Python code to configure sys.path, rather than having to write C code, before the first (external) import.

How do you configure sys.path currently? Do you parse a configuration file? Do you use a registry key on Windows?
msg380780 - (view) Author: Chris Meyer (cmeyer) * Date: 2020-11-11 17:46
> How do you configure sys.path currently? Do you parse a configuration file? Do you use a registry key on Windows?

We have several launch scenarios - but for the currently most common one, which is to launch using a separate, existing Python environment, we call Py_SetPythonHome and Py_SetPath with the home directory of the environment. Then, presumably, the more complete path gets set in either Py_Initialize or when we call PyImport_ImportModule("sys"). I might have tracked the details down once, but I don't recall them. By the time our Python code starts running, sys.path is reasonably populated.

However, in another scenario, we launch with an embedded Python environment, essentially a virtual environment. In that case, we have a config file to explicitly add lib, DLLs, and site packages. But something goes wrong [cannot find/load the unicode DLL IIRC] unless we call site.addsitedir for each directory already in sys.path near the start of our Python portion of code. My notes point to two issues to explain this: https://bugs.python.org/issue22213 and https://bugs.python.org/issue35706.
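The workaround described above (calling site.addsitedir for each directory already in sys.path) could look roughly like the sketch below. The helper name readd_site_dirs() and its optional paths argument are hypothetical. site.addsitedir() is the standard library function; it adds the directory to sys.path if needed and processes its .pth files. Whether this addresses the DLL loading problem mentioned here is not something the sketch can confirm.

```python
import os
import site
import sys


def readd_site_dirs(paths=None):
    """Call site.addsitedir() for each existing directory (hypothetical helper)."""
    if paths is None:
        paths = sys.path
    processed = []
    # Iterate over a snapshot: addsitedir() may append to sys.path.
    for entry in list(paths):
        if entry and os.path.isdir(entry):
            site.addsitedir(entry)
            processed.append(entry)
    return processed
```

It would be called once, near the start of the Python portion of the application, before importing packages that depend on the extra directories.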
msg380781 - (view) Author: Chris Meyer (cmeyer) * Date: 2020-11-11 17:54
>> I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism.

> I don't plan to remove any feature :-)

I am glad to hear that. I'm somewhat nervous about it nevertheless. In particular, the implementation of Py_DECREF changed from 3.7 to 3.8 to 3.9. 3.7 worked entirely in a header; but 3.8 had a quirky definition of _Py_Dealloc which used _Py_Dealloc_inline but was defined out of order (used before defined). This was somewhat addressed in https://github.com/python/cpython/pull/18361/files; however 3.9 now has another mechanism that defines _Py_Dealloc in Objects/object.c. This isn't a major problem because it has the same implementation as before, but changes like this have the potential to make the launcher binary be version specific. Again, not a deal breaker, but it still makes me nervous.
msg380823 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-12 14:14
New changeset ef75a625cdf8377d687a04948b4db9bc1917bf19 by Victor Stinner in branch 'master':
bpo-42260: Initialize time and warnings earlier at startup (GH-23249)
https://github.com/python/cpython/commit/ef75a625cdf8377d687a04948b4db9bc1917bf19
msg381711 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-11-24 07:48
Please don't use PyDict_GetItemString(), it will be deprecated. You can use _PyDict_GetItemStringWithError().

Also always check the raised exception type before overwriting the exception, so you will not swallow MemoryError or other unexpected errors.
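The second point has a rough Python-level analogue, sketched below under the assumption that options are read from a dict: only the expected "key is missing" error is replaced by a more specific one, and anything else propagates untouched. The helper get_int_item() is hypothetical and is not the C code under review.

```python
def get_int_item(config, name):
    """Fetch an integer option from a config dict (hypothetical helper)."""
    try:
        value = config[name]
    except KeyError:
        # Only the expected "missing key" case is turned into a new error.
        raise ValueError(f"missing config key: {name!r}") from None
    # Any other exception (for example MemoryError) propagates unchanged.
    if not isinstance(value, int):
        raise TypeError(f"config key {name!r} must be an int")
    return value
```

A bare `except:` around the lookup would hide unrelated failures behind the "missing key" message, which is the situation the advice warns against.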
msg381718 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-11-24 12:07
New changeset 14d81dcaf827f6b66bda45e8f5689d07d7d5735c by Serhiy Storchaka in branch 'master':
bpo-42260: Improve error handling in _PyConfig_FromDict (GH-23488)
https://github.com/python/cpython/commit/14d81dcaf827f6b66bda45e8f5689d07d7d5735c
msg381782 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-11-24 23:41
I opened a thread on python-dev about this issue:
"Configure Python initialization (PyConfig) in Python"
https://mail.python.org/archives/list/python-dev@python.org/thread/HQNFTXOCDD5ROIQTDXPVMA74LMCDZUKH/#X45X2K4PICTDJQYK3YPRPR22IGT2CDXB
History
Date User Action Args
2020-11-24 23:41:05 vstinner set messages: + msg381782
2020-11-24 12:07:35 serhiy.storchaka set messages: + msg381718
2020-11-24 08:36:10 serhiy.storchaka set pull_requests: + pull_request22376
2020-11-24 07:48:35 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg381711
2020-11-14 13:18:05 ncoghlan set nosy: + ncoghlan
2020-11-12 14:14:16 vstinner set messages: + msg380823
2020-11-12 13:54:47 vstinner set pull_requests: + pull_request22146
2020-11-11 17:54:03 cmeyer set messages: + msg380781
2020-11-11 17:46:43 cmeyer set messages: + msg380780
2020-11-11 17:07:22 vstinner set messages: + msg380775
2020-11-11 16:52:25 cmeyer set nosy: + cmeyer; messages: + msg380773
2020-11-10 23:15:27 vstinner set messages: + msg380728
2020-11-10 20:10:31 vstinner set messages: + msg380707
2020-11-10 13:28:31 vstinner set pull_requests: + pull_request22118
2020-11-10 12:48:45 vstinner set messages: + msg380659
2020-11-10 12:43:52 vstinner set messages: + msg380658
2020-11-10 12:22:06 vstinner set messages: + msg380657
2020-11-09 23:55:54 vstinner set pull_requests: + pull_request22109
2020-11-05 18:49:11 vstinner set pull_requests: + pull_request22081
2020-11-05 17:58:39 shihai1991 set nosy: + shihai1991
2020-11-05 17:58:14 vstinner set messages: + msg380424
2020-11-05 17:33:36 vstinner set pull_requests: + pull_request22080
2020-11-05 17:12:41 vstinner set messages: + msg380421
2020-11-05 15:56:28 vstinner set pull_requests: + pull_request22079
2020-11-04 23:46:04 vstinner set messages: + msg380382
2020-11-04 23:13:58 vstinner set pull_requests: + pull_request22069
2020-11-04 16:34:40 vstinner set messages: + msg380341
2020-11-04 15:16:02 vstinner set pull_requests: + pull_request22062
2020-11-04 15:15:57 vstinner set messages: + msg380328
2020-11-04 14:52:03 vstinner set keywords: + patch; stage: patch review; pull_requests: + pull_request22061
2020-11-04 14:45:21 vstinner create