Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calling Py_Initialize() twice now triggers a fatal error (Python 3.7) #78113

Closed
vstinner opened this issue Jun 21, 2018 · 26 comments
Closed

Calling Py_Initialize() twice now triggers a fatal error (Python 3.7) #78113

vstinner opened this issue Jun 21, 2018 · 26 comments
Labels
3.7 (EOL) end of life easy interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Copy link
Member

BPO 33932
Nosy @ncoghlan, @vstinner, @ned-deily, @ericsnowcurrently, @zhangyangyu, @emilyemorehouse, @hroncok, @miss-islington
PRs
  • bpo-33932: Calling Py_Initialize() twice does nothing #7845
  • [3.7] bpo-33932: Calling Py_Initialize() twice does nothing (GH-7845) #7859
  • bpo-33932: Deprecate repeated calls to Py_Initialize[Ex] #7967
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-06-30.07:01:22.655>
    created_at = <Date 2018-06-21.13:08:48.107>
    labels = ['interpreter-core', 'easy', 'type-bug', '3.7']
    title = 'Calling Py_Initialize() twice now triggers a fatal error (Python 3.7)'
    updated_at = <Date 2018-06-30.07:17:14.818>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-06-30.07:17:14.818>
    actor = 'ncoghlan'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2018-06-30.07:01:22.655>
    closer = 'ncoghlan'
    components = ['Interpreter Core']
    creation = <Date 2018-06-21.13:08:48.107>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33932
    keywords = ['patch', 'easy']
    message_count = 26.0
    messages = ['320175', '320180', '320182', '320183', '320184', '320220', '320223', '320225', '320226', '320227', '320244', '320248', '320255', '320562', '320578', '320579', '320582', '320634', '320641', '320665', '320666', '320667', '320725', '320755', '320759', '320761']
    nosy_count = 9.0
    nosy_names = ['ncoghlan', 'vstinner', 'ned.deily', 'docs@python', 'eric.snow', 'xiang.zhang', 'emilyemorehouse', 'hroncok', 'miss-islington']
    pr_nums = ['7845', '7859', '7967']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue33932'
    versions = ['Python 3.7']

    @vstinner
    Copy link
    Member Author

    It would help to document that calling Py_Initialize() twice in Python 3.7 now triggers a fatal error, whereas previous the second call did nothing. Document it at:
    https://docs.python.org/dev/whatsnew/3.7.html#changes-in-the-c-api

    ... Or should it be considered as a regression?

    --

    My colleague Miro Hrončok reported a Python crash (SIGABRT) when running https://github.com/konlpy/konlpy test suite on Python 3.7:

    "Fatal Python error: _Py_InitializeCore: main interpreter already initialized"

    konlpy uses the JPype project, the bug is in JPype initialization function (it's a C extension):

    @vstinner vstinner added docs Documentation in the Doc dir 3.7 (EOL) end of life labels Jun 21, 2018
    @vstinner vstinner added the easy label Jun 21, 2018
    @vstinner
    Copy link
    Member Author

    I set the priority to release blocker to make sure that someone looks at this issue, even if I'm not sure that it's a huge regression.

    @vstinner
    Copy link
    Member Author

    The behaviour changed has been introduced by bpo-22257:

    commit 1abcf67
    Author: Eric Snow <ericsnowcurrently@gmail.com>
    Date: Tue May 23 21:46:51 2017 -0700

    bpo-22257: Private C-API for core runtime initialization (PEP-432). (bpo-1772)
    
    (patch by Nick Coghlan)
    

    Simplified change:

    • if (initialized)
    •    return;
      
    • initialized = 1;

    + if (_Py_Initialized) {
    + Py_FatalError("Py_InitializeCore: main interpreter already initialized");
    + }

    @vstinner vstinner added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed docs Documentation in the Doc dir labels Jun 21, 2018
    @vstinner
    Copy link
    Member Author

    Attached PR 7845 restores the Python 3.6 behaviour: calling Py_Initialize() twice does nothing.

    @vstinner
    Copy link
    Member Author

    Note: the doc doesn't say that the function must only be called once, nor document the (current) new Python 3.7 restriction (must only be called once):
    https://docs.python.org/dev/c-api/init.html#c.Py_Initialize

    @zhangyangyu
    Copy link
    Member

    Note: the doc doesn't say that the function must only be called once, nor document the (current) new Python 3.7 restriction (must only be called once):

    But the doc says:

    This is a no-op when called for a second time (without calling Py_FinalizeEx() first).

    no-op means it could be called twice?

    @vstinner
    Copy link
    Member Author

    "no-op means it could be called twice" oops right :-) So it's a regression. Would you mind to review my PR 7845, Xiang?

    @ncoghlan
    Copy link
    Contributor

    While I agree the documentation issue means that this should be made to work again for Python 3.7.1, I'd like for Python 3.7 to at least deprecate calling Py_Initialize twice without an intervening Py_Finalize.

    Otherwise the public multi-phase interpreter initialization API is going to have to cope with somebody starting the initialization process, and then calling a full Py_Initialize to finish it (instead of the new "complete initialization" API, whatever we end up calling that), and I'd prefer to keep the permitted state transitions more linear than that.

    @vstinner
    Copy link
    Member Author

    While I agree the documentation issue means that this should be made to work again for Python 3.7.1, I'd like for Python 3.7 to at least deprecate calling Py_Initialize twice without an intervening Py_Finalize.

    In term of timeline, I would prefer to fix the regression right now, and then see how to deprecate calling the function later :-)

    @ncoghlan
    Copy link
    Contributor

    Aye, I'm fine with deferring the programmatic deprecation for now - it's mainly the docs I'd like to see changed at the same time as the regression is fixed.

    @vstinner
    Copy link
    Member Author

    New changeset 209abf7 by Victor Stinner in branch 'master':
    bpo-33932: Calling Py_Initialize() twice does nothing (GH-7845)
    209abf7

    @miss-islington
    Copy link
    Contributor

    New changeset 3747dd1 by Miss Islington (bot) in branch '3.7':
    bpo-33932: Calling Py_Initialize() twice does nothing (GH-7845)
    3747dd1

    @ned-deily
    Copy link
    Member

    Since the undocumented change in behavior from 3.6 to 3.7 appears to be causing some downstream regressions at this late stage, I think Victor's change should be cherry-picked for 3.7.0 final.

    Nick or Eric, if you merge a 3.7 doc change, I'll cherry-pick that as well.

    @vstinner
    Copy link
    Member Author

    Another example of project affected by this regression: fontforge
    https://bugzilla.redhat.com/show_bug.cgi?id=1595421

    @ncoghlan
    Copy link
    Contributor

    #7967 amends the docs and adds the deprecation notices, but after taking another look at the way Victor's patch works, I'm thinking we may not need them: the new initialization API remains strict about the required state machine transitions, which means that calling Py_Initialize when only the core has been initialized will fail at the _Py_InitializeCore step (as that complains if the core has already been initialized).

    Similarly, calling _Py_InitializeCore after Py_Initialize still fails, as the *only* case that becomes a no-op is specifically Py_InitializeEx, *after* the interpreter is fully initialized.

    Given that, I think we can just keep Victor's compatibility fix indefinitely, and only have the new incremental initialization APIs be strict about the expected transitions through the state machine.

    @ncoghlan
    Copy link
    Contributor

    I went ahead and closed my PR, as I'm now happy that keeping this behaviour isn't going to block anything we want to do for PEP-432.

    If we change our minds and decide we want to deprecate the current behaviour after all, then we can start that deprecation process in Python 3.8.

    @ncoghlan ncoghlan added the type-bug An unexpected behavior, bug, or error label Jun 27, 2018
    @vstinner
    Copy link
    Member Author

    Please keep it open until 3.7.0 is released.

    @vstinner vstinner reopened this Jun 27, 2018
    @ned-deily
    Copy link
    Member

    New changeset b940921 by Ned Deily (Miss Islington (bot)) in branch '3.7':
    bpo-33932: Calling Py_Initialize() twice does nothing (GH-7845)
    b940921

    @ned-deily
    Copy link
    Member

    Fixed in 3.7.0. Thanks, everyone!

    @vstinner
    Copy link
    Member Author

    I reopen the issue because fontforge still fails on Python 3.7.0 final. fontforge calls Py_Initialize() and then calls Py_Main() which fails with an assertion: https://bugzilla.redhat.com/show_bug.cgi?id=1595421

    The same code works well on Python 3.6, so should we define this issue as a regression?

    Call stack:

    • Py_Main()
    • pymain_main()
    • _Py_InitializeCore() which fails

    The problem is that the first call of Py_Initialize() sets a first configuration, but then Py_Main() sets a different configuration.

    Is it correct to call Py_Initialize() before Py_Main()?

    @vstinner vstinner reopened this Jun 28, 2018
    @vstinner
    Copy link
    Member Author

    I looked quickly at _Py_InitializeCore() and I'm not sure that it's designed to replace an existing configuration.

    Example:

    int Py_HashRandomizationFlag = 0; /* for -R and PYTHONHASHSEED */

    (...)

        if (!core_config->use_hash_seed || core_config->hash_seed) {
            /* Random or non-zero hash seed */
            Py_HashRandomizationFlag = 1;
        }

    If Py_Initialize() sets Py_HashRandomizationFlag to 1, but Py_Main() doesn't use an hash seed, Py_HashRandomizationFlag value remains 1 which is wrong.

    pymain_read_conf() fills _PyCoreConfig from global configuration flags, but then global configuration flags are supposed to be set from the global configuration flags.

    So maybe for this specific case, _Py_InitializeCore() should always set Py_HashRandomizationFlag. But it was just one example to show that right now, _Py_InitializeCore() doesn't seem to support to be called twice with two different configurations.

    @vstinner
    Copy link
    Member Author

    Ok, now with my regression hat.

    In Python 3.6, Py_Main() calls Py_Initialize(). If Py_Initialize() is called twice, the second call does nothing. So it's fine to call Py_Main() after Py_Initialize() in Python 3.6.

    @ericsnowcurrently
    Copy link
    Member

    I looked quickly at _Py_InitializeCore() and I'm not sure
    that it's designed to replace an existing configuration.
    ...
    So maybe for this specific case, _Py_InitializeCore() should
    always set Py_HashRandomizationFlag.

    +1

    At the same time, the use of Py_*Flag variables is a little unclear in the PEP-432 world. Under the PEP I'd expect code in CPython to use the core config from the current PyInterpreterState (or even _PyRuntimeState), rather than the global flag variables. So presumably (the PEP doesn't talk about it) the intent of keeping them is to help folks that currently make use of the flags. However, they shouldn't be modifying them after initialization, and under PEP-432 they don't need to set them beforehand (they set values on the config), right?

    Part of the problem here is that Py_Main() results in all the flag variables getting set (before runtime initialization). However, with the current PEP-432 [internal] implementation those variables are (mostly) never getting set, right? Shouldn't we set all of them, but only at the end of _Py_InitializeCore() (for the sake of extensions/embedders that use them)? In that case we'd need to replace our internal use of Py_HashRandomizationFlag with field on the config struct on _PyRuntimeState.

    But it was just one example to show that right now,
    _Py_InitializeCore() doesn't seem to support to be called
    twice with two different configurations.

    Are there other things (than how we set that flag) that break in that case? Regardless, what would ensure that it works properly?

    @ncoghlan
    Copy link
    Contributor

    Victor, could you file a separate issue for Py_Main(). It's a different problem, since Py_Main() is an entry point for an *application that embeds Python*. It's not a CPython API in the tradition sense.

    @ncoghlan
    Copy link
    Contributor

    I filed https://bugs.python.org/issue34008 to cover the fact that "Py_Initialize(); Py_Main();" doesn't work in 3.7.0 whereas it used to sort of work (by only partially applying the settings read from the CLI and environment by the Py_Main() call).

    Since "Py_Initialize(); Py_Initialize();" *does* work in 3.7.0, closing this one again.

    @ncoghlan
    Copy link
    Contributor

    Answering Eric's question about the flag variables though:

    • pymain_get_global_config copies the global variable flags to their respective struct entries
    • pymain_set_global_config copies the various struct entries to their respective global variable flags

    One of the first things that Py_Main() does in 3.7 is to:

    • read the global variable flags into the relevant structs
    • update the structs based on the CLI options
    • write the struct values back to the global flags

    This mostly fakes the 3.6-and-earlier behaviour of using the global variables directly.

    For anything listed in https://docs.python.org/3/c-api/init.html#global-configuration-variables, we also mostly left any code reading the global variable directly alone, since they're considered a public API for embedding applications to dynamically adjust behaviour and we don't have a good way of deprecating using them for that purpose :(

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life easy interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants