Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do we support calling Py_Main() after Py_Initialize()? #78189

Closed
ncoghlan opened this issue Jun 30, 2018 · 18 comments
Closed

Do we support calling Py_Main() after Py_Initialize()? #78189

ncoghlan opened this issue Jun 30, 2018 · 18 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes type-bug An unexpected behavior, bug, or error

Comments

@ncoghlan
Copy link
Contributor

BPO 34008
Nosy @ncoghlan, @abalkin, @vstinner, @mcepl, @ericsnowcurrently, @emilyemorehouse, @hroncok
PRs
  • bpo-34206: Improve docs and test coverage for pre-init functions #8023
  • bpo-34008: Allow to call Py_Main() after Py_Initialize() #8043
  • [3.7] bpo-34008: Allow to call Py_Main() after Py_Initialize() (GH-8043) #8352
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-07-24.13:39:09.035>
    created_at = <Date 2018-06-30.06:58:41.226>
    labels = ['3.8', 'type-bug', '3.7']
    title = 'Do we support calling Py_Main() after Py_Initialize()?'
    updated_at = <Date 2018-07-24.13:39:09.034>
    user = 'https://github.com/ncoghlan'

    bugs.python.org fields:

    activity = <Date 2018-07-24.13:39:09.034>
    actor = 'ncoghlan'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-07-24.13:39:09.035>
    closer = 'ncoghlan'
    components = []
    creation = <Date 2018-06-30.06:58:41.226>
    creator = 'ncoghlan'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 34008
    keywords = ['patch']
    message_count = 18.0
    messages = ['320758', '320760', '320768', '320771', '320772', '320773', '320857', '321036', '321384', '321565', '322006', '322007', '322011', '322017', '322019', '322041', '322044', '322302']
    nosy_count = 7.0
    nosy_names = ['ncoghlan', 'belopolsky', 'vstinner', 'mcepl', 'eric.snow', 'emilyemorehouse', 'hroncok']
    pr_nums = ['8023', '8043', '8352']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue34008'
    versions = ['Python 3.7', 'Python 3.8']

    @ncoghlan
    Copy link
    Contributor Author

    In the current documentation, Py_Main is listed as a regular C API function: https://docs.python.org/3/c-api/veryhigh.html#c.Py_Main

    We also didn't add Py_Main() to https://docs.python.org/3/c-api/init.html#before-python-initialization when we did the review for "Functions which are safe to call before Py_Initialize()".

    If it truly is a regular C API function, then this suggests that Py_Main() should be called *after* Py_Initialize().

    However, Py_Main() has historically called Py_Initialize() *itself*, and generally assumes it has total control over management of the current process - it doesn't expect to be sharing the current process with an embedding application.

    In Python 3.7, Py_Main() was changed to call _Py_InitializeCore() and _Py_InitializeMainInterpreter() as separate steps, allowing it to detect cases where an embedding application had already called Py_Initialize(), meaning that Py_Main()'s config setting changes wouldn't be applied.

    In effect, we've converted a silent failure (settings not read and applied as expected) into a noisy one (attempting to call Py_Main() in an already initialised interpreter).

    (The test suite is silent on the matter, since the idea of an embedding application calling Py_Initialize() before calling Py_Main() never even occurred to us)

    So the question is, what should we do about this for Python 3.7:

    1. Treat it as a regression, try to replicate the old behaviour of partially (but not completely) ignoring the config settings read by Py_Main()
    2. Treat it as a regression, but don't try to replicate the old config-settings-are-partially-applied behaviour - just ignore the settings read by Py_Main() completely
    3. Treat it as a long standing documentation error, update the docs to make it clear that Py_Main() expects to handle calling Py_Initialize() itself, and update the Python 3.7 porting guide accordingly

    @ncoghlan ncoghlan added 3.7 (EOL) end of life 3.8 only security fixes type-bug An unexpected behavior, bug, or error labels Jun 30, 2018
    @hroncok
    Copy link
    Mannequin

    hroncok mannequin commented Jun 30, 2018

    This hits fontforge. See https://bugzilla.redhat.com/show_bug.cgi?id=1595421

    An example of how it should be handled there if 3) is selected would be greatly appreciated. Thanks

    @vstinner
    Copy link
    Member

    Even if it's ugly, calling Py_Main() after Py_Initialize() works in Python
    3.6 by ignoring the new config of Py_Main(). I suggest to do nothing (just
    return) when Py_Initialize() is called again by Py_Main() in Python 3.7. It
    would fix the fontforge *regression*.

    I suggest to only enhance the code (really apply the new Py_Main() config)
    in the master branch. From what I saw, apply the new config is not trivial.
    It will take time to decide what to do for *each* option.

    @ncoghlan
    Copy link
    Contributor Author

    Yeah, I've asked Miro to try essentially that in https://bugzilla.redhat.com/show_bug.cgi?id=1595421#c12

    My concern is that Py_Main used to keep a *lot* of state on the local C stack that we've now moved into the main interpreter config.

    So my suspicion is that even that change I've asked Miro to try may not do the right thing, but it really depends on *why* fontforge is calling Py_Main, and what they're passing in argc and argv.

    Given that the embedded Py_Initialize is a no-op, they may be better off just copying out the parts they actually need from the code execution aspects of it, and skipping the Py_Main call entirely.

    @ncoghlan
    Copy link
    Contributor Author

    At the very least, I think this calls for a documentation change, such that we make it clear that:

    1. The expected calling pattern is to call Py_Main() *instead of* Py_Initialize(), since Py_Main() calls Py_Initialize().

    2. If you do call Py_Initialize() first, then exactly which settings that Py_Main() will read from the environment and apply is version dependent.

    So I'll put together a docs patch that's valid for 3.6+, and then we can decide where to take 3.7 and 3.8 from an implementation perspective after that.

    Currently, I'm thinking that for 3.7, our goal should specifically be "fake the 3.6 behaviour well enough to get fontforge working again" (so it will mostly be Fedora that drives what that 3.7.1 change looks like), while for 3.8 we should aim to deprecate that code path, and instead offer a better building block API for folks that want to implement their own Py_Main() variant.

    @ncoghlan
    Copy link
    Contributor Author

    Removing 3.6 from the affected versions, since we re-arranged these docs a bit for 3.7 to make it clearer which functions could be called prior to initialization, as well as adding more tests for the actual pre-init functionality.

    Since 3.6 isn't going to change, and we'll be aiming to get back to something a bit closer to the 3.6 behaviour for 3.7.1, it seems simpler to just focus on 3.7 and 3.8.

    @vstinner
    Copy link
    Member

    vstinner commented Jul 2, 2018

    My PR 8043 fix the Python 3.7 regression: allow again to call Py_Main() after Py_Initialize().

    Nick: if you want to raise an error on that case, I would prefer to see a deprecation warning emitted in version N before raising an error in version N+1.

    Technically, it seems possible to allow calling Py_Main() after Py_Initialize(). It's "just" a matter of fixing Py_Main() to apply properly the new configuration.

    @ncoghlan
    Copy link
    Contributor Author

    ncoghlan commented Jul 4, 2018

    Aye, and I think from Miro's comments on your PR, making it apply some of the same configuration settings that 3.6 and earlier applied is going to be required to get fontforge working again for 3.7.

    At the very least, the call to set sys.argv is needed.

    @ncoghlan
    Copy link
    Contributor Author

    Thinking about this some more, I'm inclined to go the same way we did with bpo-33932: classify it as an outright regression, work out the desired requirements for a missing embedding test case, and then fix the implementation to pass the new test case.

    My suggestion for test commands would be:

    ../Lib/site.py
    -m site
    -c 'import sys; print(sys.argv)' some test data here
    

    Once those work properly, we'd consider the regression relative to Python 3.6 fixed.

    Those could either be 3 different test cases, or else we could run them all within a single test case, with a Py_Initialize() call before each one (since Py_Main() calls Py_Finalize() internally).

    @vstinner
    Copy link
    Member

    I tested fontforge. It became clear to me that the fontforge must be able to call Py_Initialize() and only later call Py_Main(). fontforge calls Py_Initialize() and then uses the Python API:

    /* This is called to start up the embedded python interpreter */
    void FontForge_InitializeEmbeddedPython(void) {
        // static int python_initialized is declared above.
        if ( python_initialized )
    	return;
    
        SetPythonProgramName("fontforge");
        RegisterAllPyModules();
        Py_Initialize();
        python_initialized = 1;
    /* The embedded python interpreter is now functionally
     * "running". We can modify it to our needs.
     \*/
    CreateAllPyModules();
    InitializePythonMainNamespace();
    

    }

    Py_Main() is called after FontForge_InitializeEmbeddedPython().

    To be clear, Py_Main() must be fixed in Python 3.7 to apply properly the new configuration, or *at least* define sys.argv.

    @vstinner
    Copy link
    Member

    This hits fontforge. See https://bugzilla.redhat.com/show_bug.cgi?id=1595421

    Copy of my comment 20:

    I looked one more time to the issue:

    • fontforge code is fine: it calls Py_Initialize() before using the Python C API to initialize its Python namespace (modules, functions, types, etc.), and later it calls Py_Main().

    • Python 3.7.0 is wrong: it fails with a fatal erro when Py_Main() is called if Py_Initialize() has been called previously. It is a regression since it worked fine in Python 3.6.

    • My PR bpo-34008: Allow to call Py_Main() after Py_Initialize() #8043 works again the simplest test: "fontforge -script hello.py" where hello.py is just "print('Hello World!')"

    • This PR is not enough according to Miro, since fontforge requires sys.argv to be set, whereas sys.argv is no longer set when Py_Main() is after Py_Initialize(): bpo-34008: Allow to call Py_Main() after Py_Initialize() #8043 (comment)

    @vstinner
    Copy link
    Member

    I looked one more time at Python 3.6, code before my huge Py_Main()/Py_Initialize() refactoring, before _PyCoreConfig/_PyMainInterpreterConfig have been added. In short, calling Py_Main() after Py_Initialize() "works" in Python 3.6 but only sys.argv is set: almost all other options are lost/ignored. For example, if you call Py_Initialize() you get a sys.path from the current environment, but if Py_Main() gets a new module search path: sys.path is not updated.

    I modified PR 8043 to write the minimum patch just to fix fontforge. When Py_Main() is called Py_Initialize(): just set sys.argv, that's all.

    If someone wants to fix this use case, apply properly the new Py_Main() configuration carefully, I would suggest to only work in the master branch: that's out of the scope of this specific regression.

    IMHO it would be nice to enhance this use case. For example, it would be "nice" to update at least "sys.path" (and other related variables) in such case.

    @ncoghlan
    Copy link
    Contributor Author

    +1 from me for the idea of fixing Python 3.7 to correctly set sys.argv in this case (and leaving everything else unmodified).

    As far as further enhancement in Python 3.8 goes, perhaps that can be seen as part of PEP-432, such that embedding applications have to explicitly opt-in to any new behaviour by calling the new APIs?

    @vstinner
    Copy link
    Member

    I created bpo-34170: "Py_Initialize(): computing path configuration must not have side effect (PEP-432)" as a follow-up of this issue.

    @vstinner
    Copy link
    Member

    New changeset fb47bca by Victor Stinner in branch 'master':
    bpo-34008: Allow to call Py_Main() after Py_Initialize() (GH-8043)
    fb47bca

    @vstinner
    Copy link
    Member

    New changeset 03ec4df by Victor Stinner (Miss Islington (bot)) in branch '3.7':
    bpo-34008: Allow to call Py_Main() after Py_Initialize() (GH-8043) (GH-8352)
    03ec4df

    @vstinner
    Copy link
    Member

    Ok, I fixed the fontforge bug in 3.7 and master branches. Sorry for the regression, but I really didn't expect that anyone would call Py_Main() after Py_Initialize() :-) I guess we should be prepared for such hiccup when reworking the Python initialization :-)

    Nick: Can I close the issue, or do you want to keep it open to change the documentation?

    @ncoghlan
    Copy link
    Contributor Author

    Since we decided the correct resolution here was to restore the Python 3.6 behaviour, I've filed https://bugs.python.org/issue34206 as a separate docs clarification issue (I'll amend my PR accordingly)

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants