Title: New 3.7 startup sequence crashes PyInstaller
Type: behavior Stage:
Components: Interpreter Core Versions: Python 3.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: eric.snow, htgoebel, ncoghlan, ned.deily, pmpp, springermac, vstinner
Priority: critical Keywords: patch

Created on 2018-03-10 17:00 by htgoebel, last changed 2018-03-17 14:41 by ncoghlan.

File name Uploaded Description Edit
bpo-33042-test-case.diff ncoghlan, 2018-03-12 12:28 New test_embed test case illustrating the reported crash
bpo-33042-earlier-tree-test-case.diff ncoghlan, 2018-03-14 11:45 Modified test case patch that applies to older 3.7.x dev branch commits
Messages (11)
msg313546 - (view) Author: Hartmut Goebel (htgoebel) Date: 2018-03-10 17:00
PyInstaller is a tool for freezing Python applications into stand-alone packages, much like py2exe. py2app, and bbfreeze. PyInstaller is providing *one* bootloader for all versions of Python supported (2.7, 3.4-3.6).

In PyInstaller the startup sequence is implemented in
pyi_pylib_start_python() in bootloader/src/pyi_pythonlib.c. The workflow roughly is:

- SetProgramName
- SetPythonHome
- Py_SetPath
- Setting runtime options
  - some flags using the global variables
  - PySys_AddWarnOption -> crash
- Py_Initialize
- PySys_SetPath

The crash occurs due to tstate (thread state) not being initialized when
calling PySys_AddWarnOption.
msg313643 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-12 12:28
I don't think we made any start-up changes that were specific to PySys_AddWarnOption, so my suspicion is that the crash is going to be related to a change in the constraints on either the unicode object creation or the list append operation.

The attached patch adds a new test case (I also cleaned up several other details related to test_embed execution in order to more easily see where the segfault happens on failure).

Unfortunately, it isn't yet suitable for use with `git bisect`, as I checked 3.7.0a3 (the earliest tag where the patch applied cleanly), and that commit also segfaults. So setting up for git bisect testing (as per the hot-fix example in ) will require:

* checking the commit where "" was extracted from "" and seeing whether or not that segfaults
* if it *doesn't* segfault, git bisect between there and v3.7.0a3
* if it *does* segfault, make a revised test patch that applies cleanly to early versions, and then go back through the 3.7.0 pre-releases to find one that works
msg313649 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-03-12 13:33
Python documentation was enhanced in Python 3.7 to explicitly list all functions safe to call *before* Py_Initialize():

PySys_AddWarnOption() is not part of the list. While it's not in the list, I'm kind of unhappy that we broke your use case: it wasn't my intent. Because I broken your use case with this change part of the big bpo-32030 refactoring:

commit f7e5b56c37eb859e225e886c79c5d742c567ee95
Author: Victor Stinner <>
Date:   Wed Nov 15 15:48:08 2017 -0800

    bpo-32030: Split Py_Main() into subfunctions (#4399)

IHMO the regression is that PySys_AddWarnOption() now calls _PySys_GetObjectId(): in Python 3.6, it wasn't the case.

Python 3.6 code:
PySys_AddWarnOptionUnicode(PyObject *unicode)
    if (warnoptions == NULL || !PyList_Check(warnoptions)) {
        warnoptions = PyList_New(0);
        if (warnoptions == NULL)
    PyList_Append(warnoptions, unicode);

Again, it's a bad idea to use the Python API before Py_Initialize(): you likely have to build a Unicode string and you use a list, whereas these two object types are not properly initialized...

The PEP 432 and bpo-32030 prepared Python to have a much better API in Python 3.8 for embedding Python. You will be able to use a wchar_t* string to pass warning options.
msg313651 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-12 14:02
Adding a bit of context from my prior email discussion with Hartmut: CPython actually reads sys.warnoptions at the end of Py_Initialize (its the last thing we do before the function returns).

It also reads sys._xoptions during startup, since that's one way of enabling settings like dev mode and UTF-8 mode.

That means that even though PySys_AddWarnOption and PySys_AddXOption weren't documented as being safe to call before Py_Initialize, they likely *won't* have the desired effect if an embedding application defers calling them until later, and their absence from the list of "safe to call before Py_Initialize" functions was itself a documentation bug.

For 3.8, I'd hoped that the resolution might be as simple as a hard requirement on embedding applications to call PyRuntime_Initialize() before doing *anything* with the C API, but including the "internal/pystate.h" header and adding a call to _PyRuntime_Initialize() wasn't enough to keep the draft test case in my patch from segfaulting :(
msg313815 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-14 10:35 (the first attempted fix for bpo-20891) is the last commit where the draft test case patch applies cleanly.

That still segfaults, so I'll jump all the way back to 3.7.0a1, rework the patch to apply there, and see if pre-init use of these APIs were still working back then.
msg313816 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-14 11:45
After reworking the patch to backport the pre-initialization embedding tests to 3.7.0a1, they *both* failed, as that was before Victor fixed Py_DecodeLocale to work prior to initialization again.

As a result, I tried in November, after the initial merge of the PEP 432 working branch, but just before the start of Victor's follow on refactoring to really take advantage of the new internal structures: double failure there as well. is the next point I checked (just before Eric's September changes to global state management), and it looks like we have a new chief suspect, as applying the attached "earlier-tree" version of the patch at this point in the history gives two passing test cases :)

Next I'll work out a git bisect run based on those two start and end commits and the "earlier-tree" patch, and attempt to narrow the failure down to a specific commit (probably tomorrow).

(Even without that bisect run though, the warnoptions changes in are now looking *very* suspicious)
msg313817 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-03-14 11:47
I suggest to revert my changes on PySys_AddWarnOption*() and "just" re-use the code from Python 3.6.

Previously, I made the code more complex since I had to expose a private function to add manually warning options. But later, the code changed again, so the private function is gone.
msg313823 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-14 14:22
While I haven't definitely narrowed it down yet, I suspect it isn't your changes that are the problem: since the reported crash relates to attempting to access a not-yet-created thread state, `warnoptions` and `_xoptions` likely need to become C level globals again for 3.7 (along with anything needed to create list and Unicode objects).

We can then transfer them from there into Eric's new runtime state objects in _PyRuntime_Initialize (or potentially _Py_InitializeCore).
msg313871 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-15 11:01
Patched test_capi results for 2ebc5ce42a8a9e047e790aefbf9a94811569b2b6 (the global state consolidation commit): both pre-initialization tests fail

Patched test_capi results for bab21faded31c70b142776b9a6075a4cda055d7f (the immediately preceding commit): both pre-initialization tests pass

Ding, ding, ding, we have a winner :)

Rather than reverting the options-related aspects of that change though, I think a nicer way to handle it will be to add a "pre-init fallback" path to those functions that stores the raw wchar_t data, and postpones converting those values to Py_Unicode objects until we need them in _PyInitialize_Core.

I'm not sure what to do about PySys_AddWarnOptionUnicode though: I think that's just outright incoherent as an API, since you need to call it before Py_Initialize for it to do anything useful, but we don't support calling any of the Unicode creation APIs until *after* Py_Initialize.
msg313903 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-03-15 18:38
Nick, what's your assessment of this issue's priority?  Is it a release blocker for 3.7.0b3 (proposed ABI freeze)?  For 3.7.0rc1 (code freeze)?  None of the above?
msg314006 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-03-17 14:41
This shouldn't affect the public ABI, so I don't think it's a blocker for the ABI freeze, but it's a regression that effectively makes PySys_AddWarnOption (and related APIs) unusable, so we should definitely fix it before shipping 3.7.

For now, I've marked the bug as critical, and I'll be aiming to have it fixed before 3.7.0b3 at the end of the month.

For PySys_AddWarnOption and PySys_ResetWarnOptions, my current plan is to adjust them to maintain a singly-linked list of "wchar *" pointers when called prior to _Py_InitializeCore (i.e. when "PyThreadState_GET() == NULL"), while using the existing 3.7.0b2 path when the thread state is available. _Py_InitializeCore will then take care of cleaning up the memory allocations while converting them to Python unicode entries in a Python list object.

_PySys_AddXOption would work similarly (just with a separate list).

For PySys_AddWarnOptionUnicode, I think we need to make it call Py_FatalError for 3.7 when called prior to Py_Initialize, with a recommendation to use PySys_AddWarnOption instead. In 3.8, that API will become useful again, once the public Py_InitializeCore API provides a supported multi-stage startup process that allows unicode objects to be created safely before Py_InitializeMainInterpreter (or whatever we end up calling that API) reads sys.warnoptions and configures the warning subsystem accordingly.
Date User Action Args
2018-03-17 14:41:01ncoghlansetpriority: normal -> critical

messages: + msg314006
2018-03-16 02:51:10pmppsetnosy: + pmpp
2018-03-15 18:38:56ned.deilysetnosy: + ned.deily
messages: + msg313903
2018-03-15 16:11:57springermacsetnosy: + springermac
2018-03-15 11:01:09ncoghlansetmessages: + msg313871
2018-03-14 14:22:32ncoghlansetmessages: + msg313823
2018-03-14 11:47:16vstinnersetmessages: + msg313817
2018-03-14 11:45:18ncoghlansetfiles: + bpo-33042-earlier-tree-test-case.diff

messages: + msg313816
2018-03-14 10:35:40ncoghlansetmessages: + msg313815
2018-03-12 14:02:34ncoghlansetmessages: + msg313651
2018-03-12 13:33:23vstinnersetmessages: + msg313649
2018-03-12 12:28:53ncoghlansetassignee: ncoghlan
2018-03-12 12:28:27ncoghlansetfiles: + bpo-33042-test-case.diff

nosy: + ncoghlan
messages: + msg313643

keywords: + patch
2018-03-10 18:47:25ned.deilysetnosy: + vstinner, eric.snow
2018-03-10 17:00:47htgoebelcreate