[C API] Add _PyInterpreterState_SetConfig(): reconfigure an interpreter #86426

Closed
vstinner opened this issue Nov 4, 2020 · 20 comments
Labels: 3.10 (only security fixes), topic-C-API

Comments

@vstinner
Member

vstinner commented Nov 4, 2020

BPO 42260
Nosy @ncoghlan, @vstinner, @serhiy-storchaka, @zooba, @cmeyer, @shihai1991
PRs
  • bpo-42260: Reorganize PyConfig #23149
  • bpo-42260: Main init modify sys.flags in-place #23150
  • bpo-42260: Add _PyInterpreterState_SetConfig() #23158
  • bpo-42260: Add _PyConfig_FromDict() #23167
  • bpo-42260: PyConfig_Read() only parses argv once #23168
  • [WIP] bpo-42260: Rewrite getpath.c in Python #23169
  • bpo-42260: Compute the path config in the main init #23211
  • bpo-42260: Fix _PyConfig_Read() if compute_path_config=0 #23220
  • bpo-42260: Initialize warnings and time early at startup #23249
  • bpo-42260: Improve error handling in _PyConfig_FromDict #23488
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-21.21:56:32.186>
    created_at = <Date 2020-11-04.14:45:21.168>
    labels = ['expert-C-API', '3.10']
    title = '[C API] Add _PyInterpreterState_SetConfig(): reconfigure an interpreter'
    updated_at = <Date 2021-10-26.21:11:34.282>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2021-10-26.21:11:34.282>
    actor = 'steve.dower'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-21.21:56:32.186>
    closer = 'vstinner'
    components = ['C API']
    creation = <Date 2020-11-04.14:45:21.168>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 42260
    keywords = ['patch']
    message_count = 20.0
    messages = ['380327', '380328', '380341', '380382', '380421', '380424', '380657', '380658', '380659', '380707', '380728', '380773', '380775', '380780', '380781', '380823', '381711', '381718', '381782', '402376']
    nosy_count = 6.0
    nosy_names = ['ncoghlan', 'vstinner', 'serhiy.storchaka', 'steve.dower', 'cmeyer', 'shihai1991']
    pr_nums = ['23149', '23150', '23158', '23167', '23168', '23169', '23211', '23220', '23249', '23488']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue42260'
    versions = ['Python 3.10']

    @vstinner
    Member Author

    vstinner commented Nov 4, 2020

    This issue is a follow-up to PEP 587, which introduced the PyConfig C API, and is related to PEP 432, which wants to rewrite Modules/getpath.c in Python.

    I would like to add a new PyInterpreterState_SetConfig() function to be able to reconfigure a Python interpreter in C. Examples are writing a custom sys.path or implementing a virtual environment (a common request for embedded Python). Currently, it's really complex to tune the Python configuration.

    The use case is to tune Python for embedded Python. First, I would like to add new functions to the C API for that:

    • PyInterpreterState_GetConfigCopy()
    • PyInterpreterState_SetConfig()

    The second step will be to expose these two functions in Python (I'm not sure where for now), giving the ability to tune the Python configuration in pure Python.

    The site module already does that for sys.path, but it runs "too late" in the Python initialization. Here the idea is to configure Python before it accesses any file on disk, after the "core" initialization and before the "main" initialization.

    One concrete example would be to reimplement Modules/getpath.c in Python, convert it to a frozen module, and run it at Python startup to populate sys.path. That would allow moving some of the site code into this module so that it runs earlier.

    Pseudo-code in C:
    ---------------------

    void init_core(void)
    {
      // "Core" initialization only: the "main" phase is deferred
      PyConfig config;
      PyConfig_InitPythonConfig(&config);
      config._init_main = 0;
      Py_InitializeFromConfig(&config);
      PyConfig_Clear(&config);
    }

    void tune_config(void)
    {
      PyConfig config;
      PyConfig_InitPythonConfig(&config);

      // Get a copy of the current configuration
      PyInterpreterState_GetConfigCopy(&config); // <== NEW API!

      // ... put your code to tune config ...

      // dummy example, currently not possible in Python
      config.bytes_warning = 1;

      // Reconfigure Python with the updated configuration
      PyInterpreterState_SetConfig(&config);  // <=== NEW API!
      PyConfig_Clear(&config);
    }

    int main(void)
    {
      init_core();
      tune_config(); // <=== THE USE CASE!
      _Py_InitializeMain();
      return Py_RunMain();
    }

    In this example, tune_config() is implemented in C. But later, it will be possible to convert the configuration to a Python dict and run Python code to tune the configuration.

    PEP 587 added a "Multi-Phase Initialization Private Provisional API":

    • PyConfig._init_main = 0
    • _Py_InitializeMain()

    https://docs.python.org/dev/c-api/init_config.html#multi-phase-initialization-private-provisional-api

    @vstinner vstinner added 3.10 only security fixes topic-C-API labels Nov 4, 2020
    @vstinner
    Member Author

    vstinner commented Nov 4, 2020

    New changeset cfb41e8 by Victor Stinner in branch 'master':
    bpo-42260: Reorganize PyConfig (GH-23149)
    cfb41e8

    @vstinner
    Member Author

    vstinner commented Nov 4, 2020

    New changeset af1d64d by Victor Stinner in branch 'master':
    bpo-42260: Main init modify sys.flags in-place (GH-23150)
    af1d64d

    @vstinner
    Member Author

    vstinner commented Nov 4, 2020

    New changeset 048a356 by Victor Stinner in branch 'master':
    bpo-42260: Add _PyInterpreterState_SetConfig() (GH-23158)
    048a356

    @vstinner
    Member Author

    vstinner commented Nov 5, 2020

    New changeset f3cb814 by Victor Stinner in branch 'master':
    bpo-42260: Add _PyConfig_FromDict() (GH-23167)
    f3cb814

    @vstinner
    Member Author

    vstinner commented Nov 5, 2020

    New changeset dc42af8 by Victor Stinner in branch 'master':
    bpo-42260: PyConfig_Read() only parses argv once (GH-23168)
    dc42af8

    @vstinner
    Member Author

    New changeset 9e1b828 by Victor Stinner in branch 'master':
    bpo-42260: Compute the path config in the main init (GH-23211)
    9e1b828

    @vstinner
    Member Author

    If we remove Modules/getpath.c, it will no longer be possible to automatically compute the path configuration when one of the following getter functions is called:

    • Py_GetPath()
    • Py_GetPrefix()
    • Py_GetExecPrefix()
    • Py_GetProgramFullPath()
    • Py_GetPythonHome()
    • Py_GetProgramName()

    It means that these functions would return NULL if called before Python is initialized, but return the expected string once Python is initialized.

    Moreover, Py_SetPath() would no longer automatically compute the "program full path" (sys.executable).

    @vstinner
    Member Author

    > If we remove Modules/getpath.c, it will no longer be possible to automatically compute the path configuration when one of the following getter functions is called: (...)

    It is not really an incompatible change according to the documentation:

    "Note: The following functions should not be called before Py_Initialize(): Py_EncodeLocale(), Py_GetPath(), Py_GetPrefix(), Py_GetExecPrefix(), Py_GetProgramFullPath(), Py_GetPythonHome(), Py_GetProgramName() and PyEval_InitThreads().".

    https://docs.python.org/dev/c-api/init.html

    @vstinner
    Member Author

    New changeset ace3f9a by Victor Stinner in branch 'master':
    bpo-42260: Fix _PyConfig_Read() if compute_path_config=0 (GH-23220)
    ace3f9a

    @vstinner
    Member Author

    The main drawback of rewriting Modules/getpath.c as Lib/_getpath.py (and removing getpath.c) is that PyConfig_Read() could no longer compute the Python Path Configuration. It would return an "empty" path configuration.

    @cmeyer
    Mannequin

    cmeyer mannequin commented Nov 11, 2020

    Responding to your request for feedback on Python-Dev:

    We embed Python dynamically by finding the libPython DLL, loading it, and looking up the required symbols. We add appropriate #defines so that the Python headers (and NumPy headers) point to our functions, which in turn point to the looked-up symbols.

    Our launcher works on Linux, macOS, and Windows and works with many environments including standard Python and conda and brew. It also supports virtual environments in most cases. Also, a single executable [per platform] is able to work with Python versions 3.7 - 3.9 (3.6 was recently dropped, but only for external reasons).

    So my comment is not directly addressing the usefulness of configuring Python initialization - but I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism.

    As another note, the main issues we run into are configuring the Python path to properly find packages and DLLs. A goal of ours is to be able to provide the base application as a drag-and-drop style installer with its own full embedded Python distribution (but still loaded dynamically) and then be able to supply additional plug-in packages (Python packages) by drag and drop. This is somewhat similar to conda packaging but without support for command line tools.
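As a side illustration of the "load the library, then look up symbols by name" technique described in this comment, here is a minimal sketch. It is written in Python with ctypes rather than in a C launcher, and it uses ctypes.pythonapi (the libpython already loaded in the current process) instead of locating a DLL on disk. The helper name lookup_c_api_version is hypothetical and is not part of the launcher discussed here.

```python
import ctypes


def lookup_c_api_version():
    """Resolve Py_GetVersion by name in the running libpython and call it."""
    # ctypes.pythonapi is a handle to the Python C library of this process.
    # Attribute access performs the symbol lookup by name, which is the same
    # idea as dlsym() or GetProcAddress() in a C launcher.
    py_get_version = ctypes.pythonapi.Py_GetVersion
    # Py_GetVersion() takes no arguments and returns a const char *.
    py_get_version.argtypes = []
    py_get_version.restype = ctypes.c_char_p
    return py_get_version().decode("utf-8")
```

A launcher that starts from a file path would first open the library (for example with ctypes.CDLL(path) in Python, or dlopen()/LoadLibrary() in C) and then resolve each required symbol in the same way.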

    @vstinner
    Member Author

    > I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism.

    I don't plan to remove any feature :-)

    > As another note, the main issues we run into are configuring the Python path to properly find packages and DLLs.

    Do you mean sys.path? If yes, that's one of the goals of this issue: allowing you to write your own Python code to configure sys.path, rather than having to write C code, before the first (external) import.

    How do you configure sys.path currently? Do you parse a configuration file? Do you use a registry key on Windows?

    @cmeyer
    Mannequin

    cmeyer mannequin commented Nov 11, 2020

    > How do you configure sys.path currently? Do you parse a configuration file? Do you use a registry key on Windows?

    We have several launch scenarios - but for the currently most common one, which is to launch using a separate, existing Python environment, we call Py_SetPythonHome and Py_SetPath with the home directory of the environment. Then, presumably, the more complete path gets set in either Py_Initialize or when we call PyImport_ImportModule("sys"). I might have tracked the details down once, but I don't recall them. By the time our Python code starts running, sys.path is reasonably populated.

    However, in another scenario, we launch with an embedded Python environment, essentially a virtual environment. In that case, we have a config file to explicitly add lib, DLLs, and site packages. But something goes wrong [cannot find/load the unicode DLL IIRC] unless we call site.addsitedir for each directory already in sys.path near the start of our Python portion of code. My notes point to two issues to explain this: https://bugs.python.org/issue22213 and https://bugs.python.org/issue35706.
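The workaround mentioned above (calling site.addsitedir for each directory already in sys.path) could look roughly like the sketch below. The function name process_site_dirs is hypothetical, and this is only an approximation of what the launcher's Python code might do, not its actual implementation.

```python
import os
import site


def process_site_dirs(directories):
    """Run site.addsitedir() on each existing directory; return those handled."""
    handled = []
    # Iterate over a snapshot, because addsitedir() may append to sys.path
    # and the caller may pass sys.path itself.
    for directory in list(directories):
        if directory and os.path.isdir(directory):
            # Adds the directory to sys.path if it is missing and processes
            # the .pth files found in it.
            site.addsitedir(directory)
            handled.append(directory)
    return handled
```

Calling process_site_dirs(sys.path) near the start of the application's Python code corresponds to the workaround described in the comment. Note that this re-processes any .pth files in those directories.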

    @cmeyer
    Mannequin

    cmeyer mannequin commented Nov 11, 2020

    >> I would like to request that this ability to dynamically load Python DLLs remains even with any new initialization mechanism.

    > I don't plan to remove any feature :-)

    I am glad to hear that. I'm somewhat nervous about it nevertheless. In particular, the implementation of Py_DECREF changed from 3.7 to 3.8 to 3.9. 3.7 worked entirely in a header; but 3.8 had a quirky definition of _Py_Dealloc which used _Py_Dealloc_inline but was defined out of order (used before defined). This was somewhat addressed in https://github.com/python/cpython/pull/18361/files; however 3.9 now has another mechanism that defines _Py_Dealloc in Objects/object.c. This isn't a major problem because it has the same implementation as before, but changes like this have the potential to make the launcher binary be version specific. Again, not a deal breaker, but it still makes me nervous.

    @vstinner
    Member Author

    New changeset ef75a62 by Victor Stinner in branch 'master':
    bpo-42260: Initialize time and warnings earlier at startup (GH-23249)
    ef75a62

    @serhiy-storchaka
    Member

    Please don't use PyDict_GetItemString(), it will be deprecated. You can use _PyDict_GetItemStringWithError().

    Also, always check the raised exception type before overwriting the exception, so that you do not swallow MemoryError or other unexpected errors.
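The second point can be illustrated with a rough Python analogy. The actual fix is C code in _PyConfig_FromDict and is not reproduced here. The helper name get_int_option and its error messages are hypothetical. It translates only the exceptions it expects and lets anything else, such as MemoryError, propagate unchanged.

```python
import operator


def get_int_option(config, name):
    """Return config[name] as an int, translating only the expected errors."""
    try:
        value = config[name]
    except KeyError:
        # Expected failure: the key is absent. Report it with a clearer message.
        raise ValueError(f"missing config key: {name}") from None
    try:
        # operator.index() accepts int-like objects and raises TypeError otherwise.
        return operator.index(value)
    except TypeError:
        # Expected failure: wrong type. Any other exception is not caught here
        # and therefore propagates untouched.
        raise TypeError(f"invalid config type: {name}") from None
```

The design choice is the narrow except clauses: a bare except (or, in C, overwriting whatever exception is currently set) would hide errors unrelated to the lookup.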

    @serhiy-storchaka
    Member

    New changeset 14d81dc by Serhiy Storchaka in branch 'master':
    bpo-42260: Improve error handling in _PyConfig_FromDict (GH-23488)
    14d81dc

    @vstinner
    Member Author

    I opened a thread on python-dev about this issue:
    "Configure Python initialization (PyConfig) in Python"
    https://mail.python.org/archives/list/python-dev@python.org/thread/HQNFTXOCDD5ROIQTDXPVMA74LMCDZUKH/#X45X2K4PICTDJQYK3YPRPR22IGT2CDXB

    @vstinner
    Member Author

    The initial issue, adding an API to reconfigure an interpreter, is implemented: I added _PyInterpreterState_SetConfig().

    But I failed to find time to finish the larger project "rewrite getpath.c in Python" (PR 23169). It requires changing the C API of PEP 587, which is not easy. Also, I'm not fully convinced that there is a strong need to change getpath.c.

    I would be interested in moving code from site.py to _getpath.py, but it's also not obvious whether there is a strong benefit.

    @vstinner vstinner changed the title [C API] Add PyInterpreterState_SetConfig(): reconfigure an interpreter [C API] Add _PyInterpreterState_SetConfig(): reconfigure an interpreter Sep 21, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    facebook-github-bot pushed a commit to facebook/buck2-prelude that referenced this issue Feb 2, 2023
    Summary:
    With changes to the PyConfig API in 3.10, `PyConfig_SetBytesArgv` needs to be called prior to `PyConfig_Read` to properly pass arguments to the runtime.
    Python 3.10 [changelog](https://docs.python.org/3.10/whatsnew/changelog.html#id173)
    > [bpo-42260](python/cpython#86426): The PyConfig_Read() function now only parses PyConfig.argv arguments once: PyConfig.parse_argv is set to 2 after arguments are parsed. Since Python arguments are stripped from PyConfig.argv, parsing arguments twice would parse the application options as Python options.
    
    Reviewed By: andrewjcg
    
    Differential Revision: D42762310
    
    fbshipit-source-id: d0e529ca00d48d5bbaa056f5a3f531631b28a178
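The changelog entry quoted above relies on Python consuming its own command-line options and leaving only the application's arguments behind. A small sketch of that stripping, observed from Python by starting a child interpreter, is shown here. The helper name argv_seen_by_child is hypothetical, and the options used in the usage note are arbitrary.

```python
import json
import subprocess
import sys


def argv_seen_by_child(python_options, app_args):
    """Start a child interpreter and return the sys.argv it ends up with."""
    code = "import json, sys; print(json.dumps(sys.argv))"
    # Python options go before -c; everything after the code string is an
    # application argument and is not interpreted by Python.
    command = [sys.executable, *python_options, "-c", code, *app_args]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, argv_seen_by_child(["-I"], ["-v", "--name", "x"]) returns a list in which -I no longer appears while -v is still present as an application argument. If such a list were parsed a second time as Python options, -v would be taken for Python's own verbose flag, which is the situation the changelog entry describes.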
    facebook-github-bot pushed a commit to facebook/buck2 that referenced this issue Feb 2, 2023