PyImport_GetModule() can return partially-initialized module #80124

Closed
pitrou opened this issue Feb 8, 2019 · 19 comments
Assignees
Labels
3.9 (only security fixes), 3.10 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-bug (An unexpected behavior, bug, or error)

Comments

@pitrou
Member

pitrou commented Feb 8, 2019

BPO 35943
Nosy @brettcannon, @atsuoishimoto, @ncoghlan, @pitrou, @vstinner, @ericsnowcurrently, @pganssle, @pablogsal, @nanjekyejoannah, @cebtenzzre
PRs
  • bpo-35943: PyImport_GetModule() can return partially-initialized module #15057
Files
  • importerror-sample.tgz
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/nanjekyejoannah'
    closed_at = <Date 2021-03-16.16:17:49.860>
    created_at = <Date 2019-02-08.17:13:37.200>
    labels = ['interpreter-core', 'type-bug', '3.9', '3.10']
    title = 'PyImport_GetModule() can return partially-initialized module'
    updated_at = <Date 2021-03-16.16:17:49.859>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2021-03-16.16:17:49.859>
    actor = 'pitrou'
    assignee = 'nanjekyejoannah'
    closed = True
    closed_date = <Date 2021-03-16.16:17:49.860>
    closer = 'pitrou'
    components = ['Interpreter Core']
    creation = <Date 2019-02-08.17:13:37.200>
    creator = 'pitrou'
    dependencies = []
    files = ['49537']
    hgrepos = []
    issue_num = 35943
    keywords = ['patch']
    message_count = 19.0
    messages = ['335097', '335100', '351847', '356920', '356980', '357201', '360056', '360075', '360399', '362300', '362303', '362329', '362347', '379441', '383433', '388846', '388847', '388849', '388857']
    nosy_count = 13.0
    nosy_names = ['brett.cannon', 'ishimoto', 'gjb1002', 'ncoghlan', 'pitrou', 'vstinner', 'eric.snow', 'Big Stone', 'p-ganssle', 'pablogsal', 'nanjekyejoannah', 'Valentyn Tymofieiev', 'cebtenzzre']
    pr_nums = ['15057']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue35943'
    versions = ['Python 3.9', 'Python 3.10']

    @pitrou
    Member Author

    pitrou commented Feb 8, 2019

    PyImport_GetModule() returns whatever is in sys.modules, even if the module is still importing and therefore only partially initialized.

    One possibility is to reuse the optimization already done in PyImport_ImportModuleLevelObject():

            /* Optimization: only call _bootstrap._lock_unlock_module() if
               __spec__._initializing is true.
               NOTE: because of this, initializing must be set *before*
               stuffing the new module in sys.modules.
             */
            spec = _PyObject_GetAttrId(mod, &PyId___spec__);
            if (_PyModuleSpec_IsInitializing(spec)) {
                PyObject *value = _PyObject_CallMethodIdObjArgs(interp->importlib,
                                                &PyId__lock_unlock_module, abs_name,
                                                NULL);
                if (value == NULL) {
                    Py_DECREF(spec);
                    goto error;
                }
                Py_DECREF(value);
            }
            Py_XDECREF(spec);

    Issue originally mentioned in bpo-34572.
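
To illustrate the report, here is a minimal Python sketch (not taken from the issue) showing that a module is visible in sys.modules while its body is still executing. The module names `_demo_slow` and `_demo_sync` are hypothetical, and `__spec__._initializing` is a private CPython attribute (the same flag the C snippet above checks), so it may change between versions.

```python
import importlib
import sys
import tempfile
import textwrap
import threading
import types
from pathlib import Path

# Hypothetical helper module, used only to share events with the slow module.
sync = types.ModuleType("_demo_sync")
sync.started = threading.Event()
sync.release = threading.Event()
sys.modules["_demo_sync"] = sync

# Hypothetical module whose body pauses halfway through its execution.
SLOW_SOURCE = textwrap.dedent("""
    import _demo_sync
    early_symbol = 1
    _demo_sync.started.set()      # signal: the import is now in progress
    _demo_sync.release.wait(5)    # stay partially initialized for a while
    late_symbol = 2
""")


def observe_partial_import():
    """Look at sys.modules while another thread is still importing."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "_demo_slow.py").write_text(SLOW_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importer = threading.Thread(
                target=importlib.import_module, args=("_demo_slow",)
            )
            importer.start()
            sync.started.wait(5)

            # The module is already in sys.modules, but only half executed.
            partial = sys.modules["_demo_slow"]
            during = {
                "early": hasattr(partial, "early_symbol"),
                "late": hasattr(partial, "late_symbol"),
                # Private CPython detail; treated as an assumption here.
                "initializing": partial.__spec__._initializing,
            }

            sync.release.set()
            importer.join()
            after = {
                "late": hasattr(partial, "late_symbol"),
                "initializing": partial.__spec__._initializing,
            }
            return during, after
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("_demo_slow", None)
```

Calling `observe_partial_import()` returns two dictionaries describing the module during and after the import. A caller that reads `sys.modules` directly in the "during" window sees a module without `late_symbol`.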

    @pitrou added the 3.7 (EOL), 3.8, interpreter-core, and type-bug labels on Feb 8, 2019
    @ericsnowcurrently
    Member

    Yeah, that makes sense.

    @brettcannon
    Member

    New changeset 37c2220 by Brett Cannon (Joannah Nanjekye) in branch 'master':
    bpo-35943: Prevent PyImport_GetModule() from returning a partially-initialized module (GH-15057)
    37c2220
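
As a rough Python-level sketch of the approach proposed in the first comment (not the actual C patch), a lookup can wait for an in-progress import before returning the module. The function name `get_initialized_module` is hypothetical, and both `__spec__._initializing` and `importlib._bootstrap._lock_unlock_module` are private CPython details, so this is an assumption-laden illustration only.

```python
import sys
from importlib import _bootstrap


def get_initialized_module(name):
    """Return sys.modules[name], waiting if it is still being imported.

    Returns None when the module is not in sys.modules at all.
    """
    module = sys.modules.get(name)
    if module is None:
        return None

    # Same optimization as in the C snippet: only take the per-module
    # import lock when the spec says initialization is still in progress.
    spec = getattr(module, "__spec__", None)
    if getattr(spec, "_initializing", False):
        # Acquires and releases the module's import lock, which blocks
        # until the importing thread has finished executing the module.
        _bootstrap._lock_unlock_module(name)
    return module
```

For an already imported module such as `sys`, `get_initialized_module("sys")` returns the module immediately without taking any lock.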

    @ValentynTymofieiev
    Mannequin

    ValentynTymofieiev mannequin commented Nov 18, 2019

    Do we plan to backport the change by nanjekyejoannah to the 3.7 branch?

    @brettcannon
    Member

    I've assigned this to Joannah to decide if she wants to backport this.

    @ValentynTymofieiev
    Mannequin

    ValentynTymofieiev mannequin commented Nov 21, 2019

    Thanks. Is it possible that this issue and https://bugs.python.org/issue38884 are duplicates?

    @nanjekyejoannah
    Member

    The changes required to successfully do this backport are many and affect critical areas. I am not in a hurry to do this. If anyone else wants to take this up quickly, please do.

    @vstinner
    Member

    > The changes required to successfully do this backport are many and affect critical areas. I am not in a hurry to do this. If anyone else wants to take this up quickly, please do.

    Do you mean that there is a risk that the backport introduces a regression in another part of the code? If yes, I would suggest to not backport the change to *stable* branches.

    People have survived with this bug. Do you really *have to* backport the fix?

    Note: this issue is closed. If you are considering a backport, I suggest reopening the issue.

    @nanjekyejoannah
    Member

    > Do you mean that there is a risk that the backport introduces a regression in another part of the code? If yes, I would suggest to not backport the change to *stable* branches.

    My worry is the many changes required in ceval to make this backport work. It is not that I think we cannot backport things successfully; we can.

    @gjb1002
    Mannequin

    gjb1002 mannequin commented Feb 20, 2020

    I have been experiencing what I thought was this issue in my embedded Python code. We have been using Python 3.7, so I thought upgrading to 3.8.1 would fix it, but it doesn't seem to have made any difference.

    My C++ code essentially can call PyImport_GetModule() from two threads simultaneously on the same module A. The symptoms I see are that one of them then gets a stacktrace in module B (imported by A), saying that some symbol defined near the end of B does not exist.

    I've also noticed that this happens far more often on deployed code (where Python modules end up in a zip file) than when run directly in development (where the modules are just normal files). I can't see any difference in the frequency between 3.7.5 and 3.8.1.

    Any ideas? Should I reopen this?

    @gjb1002
    Mannequin

    gjb1002 mannequin commented Feb 20, 2020

    Oops, I mean we call PyImport_ImportModule and get these issues when the files are zipped. Unless that calls PyImport_GetModule internally I guess it's not related to this then.
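
For reference, the scenario described above (two threads importing the same module at once) can be sketched in pure Python. This uses `importlib.import_module` rather than the C API, uses a plain file rather than a zip archive, and does not reproduce the reported failure. The module name `_demo_shared` and its contents are hypothetical.

```python
import importlib
import sys
import tempfile
import threading
from pathlib import Path

# Hypothetical module: the symbol is defined only after a short delay,
# mimicking "some symbol defined near the end" of a module.
SOURCE = "import time\ntime.sleep(0.2)\nlate_symbol = 'ready'\n"


def import_from_two_threads(name="_demo_shared"):
    """Import the same module from two threads and record what each sees."""
    results = []

    def worker():
        module = importlib.import_module(name)
        # None here would mean the thread got a partially initialized module.
        results.append(getattr(module, "late_symbol", None))

    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, name + ".py").write_text(SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            threads = [threading.Thread(target=worker) for _ in range(2)]
            for thread in threads:
                thread.start()
            for thread in threads:
                thread.join()
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(name, None)
    return results
```

In this simple case both threads go through the regular import machinery, which takes the per-module import lock, so each worker is expected to see `late_symbol`. A harness like this could be adapted (for example, by loading from a zip file) when trying to narrow down a failure such as the one reported.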

    @ValentynTymofieiev
    Mannequin

    ValentynTymofieiev mannequin commented Feb 20, 2020

    @gjb1002: see also https://bugs.python.org/issue38884, which demonstrates that concurrent imports are not thread-safe on Python 3.

    @gjb1002
    Mannequin

    gjb1002 mannequin commented Feb 20, 2020

    @Valentyn Tymofieiev - true, and thanks for the tip, though the symptoms described there are somewhat different from what I'm observing. Also, my problem seems to be dependent on zipping the Python code, which that one isn't.

    @atsuoishimoto
    Mannequin

    atsuoishimoto mannequin commented Oct 23, 2020

    After this fix, some functions like multiprocessing.Pool cannot be used in threaded code (https://bugs.python.org/issue41567).

    importerror-sample.tgz contains simplified code to reproduce the same error without the multiprocessing module. Is this an expected behaviour of this change?

    Tested with Python 3.9.0/macOS 10.15.5.

    @BigStone
    Mannequin

    BigStone mannequin commented Dec 20, 2020

    Is this bug causing the Dask-Jupyterlab failure?
    dask/distributed#4168

    @pitrou
    Member Author

    pitrou commented Mar 16, 2021

    Note the conjunction of this change + bpo-32596 produces import fragility:
    https://bugs.python.org/issue43515

    @pitrou
    Member Author

    pitrou commented Mar 16, 2021

    Ok, going through other open issues including on third-party projects, I think these changes should unfortunately be reverted. The regressions produced are far from trivial and most developers seem at a loss how to fix them.

    @pitrou added the 3.9 and 3.10 labels and removed the 3.7 (EOL) and 3.8 labels on Mar 16, 2021
    @pitrou reopened this on Mar 16, 2021
    @pitrou
    Member Author

    pitrou commented Mar 16, 2021

    After analysis, it may not need reversal. There is a simple logic error it seems. Will check.

    @pitrou
    Member Author

    pitrou commented Mar 16, 2021

    Created a new issue + fix in bpo-43517.

    @pitrou closed this as completed on Mar 16, 2021
    @ezio-melotti transferred this issue from another repository on Apr 10, 2022