Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to disallow > 1 instance of an extension module #84780

Closed
vstinner opened this issue May 11, 2020 · 17 comments
Closed

Add option to disallow > 1 instance of an extension module #84780

vstinner opened this issue May 11, 2020 · 17 comments
Labels
3.10 only security fixes extension-modules C modules in the Modules dir topic-subinterpreters

Comments

@vstinner
Copy link
Member

BPO 40600
Nosy @terryjreedy, @vstinner, @encukou, @ericsnowcurrently, @corona10, @shihai1991, @koubaa
PRs
  • bpo-40600: Disallow loading extension modules load more than once in per interpreter. #23683
  • bpo-40600: atexit only loads single time per interpreter #23699
  • bpo-42639: Move atexit state to PyInterpreterState #23763
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-12-15.14:07:33.023>
    created_at = <Date 2020-05-11.22:12:29.442>
    labels = ['extension-modules', 'expert-subinterpreters', 'invalid', '3.10']
    title = 'Add option to disallow > 1 instance of an extension module'
    updated_at = <Date 2020-12-15.14:43:35.080>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-12-15.14:43:35.080>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-12-15.14:07:33.023>
    closer = 'petr.viktorin'
    components = ['Extension Modules', 'Subinterpreters']
    creation = <Date 2020-05-11.22:12:29.442>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40600
    keywords = ['patch']
    message_count = 17.0
    messages = ['368665', '368994', '376524', '376674', '376676', '376685', '382665', '382668', '382737', '382742', '382745', '382747', '382748', '382757', '382983', '383059', '383062']
    nosy_count = 7.0
    nosy_names = ['terry.reedy', 'vstinner', 'petr.viktorin', 'eric.snow', 'corona10', 'shihai1991', 'koubaa']
    pr_nums = ['23683', '23699', '23763']
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue40600'
    versions = ['Python 3.10']

    @vstinner
    Copy link
    Member Author

    When a C extension module is created with PyModuleDef_Init(), it becomes possible to create more than one instance of the module.

    It would take significant effort to modify some extensions to make their code fully ready to have two isolated module.

    For example, the atexit module calls _Py_PyAtExit() to register itself into the PyInterpreterState. If the module is created more than once, the most recently created module wins, and calls registered on other atexit instances are ignore: see bpo-40288.

    One simple option would be to simply disallow loading the module more than once per interpreter.

    Also, some extensions are not fully compatible with subinterpreters. It may be interesting to allow to load them in a subinterpreter if it's not already loaded in another interpreter, like another subinterpreter or the main interpreter. It would be only load it once per Python *process*. For example, numpy would be a good candidate for such option.

    I'm not sure fow a module should announced in its definition that it should not be loaded more than once.

    @vstinner vstinner added 3.9 only security fixes extension-modules C modules in the Modules dir labels May 11, 2020
    @terryjreedy
    Copy link
    Member

    Title clarified. Leaving subinterpreters aside, only one instance, AFAIK, is true for stdlib and python modules unless imported with different names, as can happen with main module (which is a nuisance if not a bug). So only once per interpreter seems like a bugfix. Only once per process, if once per interpreter otherwise becomes normal, seems like an enhancement, even if a necessary option.

    @terryjreedy terryjreedy changed the title Add an option to disallow creating more than one instance of a module Add option to disallow > 1 instance of an extension module May 16, 2020
    @koubaa
    Copy link
    Mannequin

    koubaa mannequin commented Sep 7, 2020

    What about a new PyModuleDef_Slot function?

    static int my_can_create(/*need any arg??, InterpreterState, etc?*/) {
        if (test_global_condition()) {
            return -1; //Don't allow creation
        }
        return 0; //Allow creation
    };
    
    static PyModuleDef_Slot signal_slots[] = {
        {Py_mod_exec, my_exec},
        {Py_mod_can_create, my_can_create},
        {0,0}
    };
    
    

    @corona10
    Copy link
    Member

    One of my opinions is that

    Since module object supports detecting error during Py_mod_exec,
    The only thing we have to do is return -1 when the module has already existed.

    we should define new exception type and don't have to define new slots.
    The pros of this method is that we don't have to modify module object and no needs to write the new PEP

    but the cons of this method is that we should promise the standard exception when try to create multiple instances which is not allowed.

    ref:

    ret = ((int (*)(PyObject *))cur_slot->value)(module);

    On the other side, defining a Py_mod_exec_once that supports execution for just once can be a way.
    Although the usage is little, it will be fine because the use case will exist.

    Please point out what I missed :)

    @vstinner
    Copy link
    Member Author

    One option is to get the behavior before multi-phase initialization. We store extensions in a list. Once it's load, it cannot be unloaded before we exit Python. See _PyState_AddModule() and _PyState_AddModule().

    Calling PyInit_xxx() the second time would simply return the existing module object.

    When we exit Python, the clear and/or free function of the module is called.

    @koubaa
    Copy link
    Mannequin

    koubaa mannequin commented Sep 10, 2020

    Something like this?

    static PyObject *def;
    
    PyMODINIT_FUNC
    PyInit_mymod(void)
    {
        if (def == NULL) {
           def = PyModuleDef_Init(&mymod);
        }
        return def;
    }
    

    Then add a flag to PyModuleDef to indicate it is already exec?

    @shihai1991
    Copy link
    Member

    On the other side, defining a Py_mod_exec_once that supports execution > >for just once can be a way.
    Although the usage is little, it will be fine because the use case will >exist.

    IMHO, Py_mod_exec_once is more like a slot control flag. MAYBE we need add a module flag in PyModuleDef.

    @shihai1991 shihai1991 added 3.10 only security fixes and removed 3.9 only security fixes labels Dec 7, 2020
    @shihai1991
    Copy link
    Member

    MAYBE we need add a module flag in PyModuleDef.
    I created a demo in PR 23683.

    @encukou
    Copy link
    Member

    encukou commented Dec 8, 2020

    Is it really necessary to add a slot/flag for this?
    It can be done in about 10 lines as below, without complications like a global (or per-interpreter) registry of singleton modules.
    Are there many modules that need to be per-interpreter singletons, but may be loaded in multiple interpreters? In my experience, it is really hard to ensure modules behave that way; if (if!) we need to add a dedicated API for this, I'd go for once-per-process.

    static int loaded = 0;
    
    static int
    exec_module(PyObject* module)
    {
        if (loaded) {
            PyErr_SetString(PyExc_ImportError,
                            "cannot load module more than once per process");
            return -1;
        }
        loaded = 1;
        // ... rest of initialization
    }

    @vstinner
    Copy link
    Member Author

    vstinner commented Dec 8, 2020

    static int loaded = 0;

    I would like to limit an extension to once instance *per interpreter*.

    See the atexit module for an example.

    @encukou
    Copy link
    Member

    encukou commented Dec 8, 2020

    Are there any other examples?

    In my view, atexit is very special, and very closely tied to interpreter. I don't think it's good to design general API for one module.

    @vstinner
    Copy link
    Member Author

    vstinner commented Dec 8, 2020

    I'm not 100% sure that preventing to create multiple module instances is needed. For the atexit module, another solution is to move the "module state" into the interpreter, as it has been done for other modules like _warnings.

    @corona10
    Copy link
    Member

    corona10 commented Dec 8, 2020

    another solution is to move the "module state" into the interpreter,

    I am +1 on this solution if this module is a very special case.

    @shihai1991
    Copy link
    Member

    another solution is to move the "module state" into the interpreter,

    OK, I am agree victor's solution too. It's a more simpler way.

    @vstinner
    Copy link
    Member Author

    I created bpo-42639 and PR 23763 to move the atexit module to PyInterpreterState and so allow to have more than one atexit instance.

    @encukou
    Copy link
    Member

    encukou commented Dec 15, 2020

    Thanks! Indeed, that's an even better solution than I had in mind.
    It follows PEP-630 quite nicely: https://www.python.org/dev/peps/pep-0630/#managing-global-state

    I will close this issue and PRs.
    I don't agree with adding a general API for disallowing multiple modules, but do let me know if you see a need for it again.

    @vstinner
    Copy link
    Member Author

    Thanks! Indeed, that's an even better solution than I had in mind.
    It follows PEP-630 quite nicely: https://www.python.org/dev/peps/pep-0630/#managing-global-state

    The atexit module was my main motivation for this issue. So I'm fine with closing it.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes extension-modules C modules in the Modules dir topic-subinterpreters
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants