Title: Add _PyPreConfig and rework _PyCoreConfig and _PyMainInterpreterConfig
Type:
Stage: resolved
Components: Interpreter Core
Versions: Python 3.8
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.snow, lukasz.langa, ncoghlan, vstinner
Priority: normal
Keywords: patch

Created on 2018-11-16 14:44 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL        Status   Linked
PR 10575   closed   vstinner, 2018-11-16 14:44
Messages (6)
msg330005 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-16 14:44
The C code of Python initialization uses _PyCoreConfig and _PyMainInterpreterConfig, which co-exist and are more or less redundant. For example, both structures have "argv": wchar_t** for _PyCoreConfig, PyObject* (list of str) for _PyMainInterpreterConfig.

I propose to put _PyCoreConfig inside _PyMainInterpreterConfig and move wchar_t* strings into a new _PyPreConfig structure.

The main idea is that _PyPreConfig and the wchar_t* type are only used for early Python initialization, but once Python is initialized, the configuration should be provided as Python objects.
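
As a rough illustration of the layout proposed above (this is not the code from the PR), the nesting could look like the following C sketch. All Sketch* names and the chosen fields are assumptions made for this example, and SketchObject is only a stand-in for PyObject so that the snippet compiles without Python.h.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Stand-in for PyObject (assumption): keeps the sketch free of Python.h. */
typedef struct SketchObject SketchObject;

/* Hypothetical pre-config: the only place holding raw wchar_t* strings,
   used during early initialization only. */
typedef struct {
    int argc;
    wchar_t **argv;
    wchar_t *program_name;
} SketchPreConfig;

/* Hypothetical core config: no wchar_t* strings left in it. */
typedef struct {
    int use_hash_seed;
    unsigned long hash_seed;
} SketchCoreConfig;

/* Hypothetical main config: embeds the core config and exposes the
   remaining settings as Python objects. */
typedef struct {
    SketchCoreConfig core;
    SketchObject *argv;      /* would be a list of str */
} SketchMainConfig;

/* Returns 1 when the pre-config carries a usable argv, else 0. */
static int sketch_preconfig_has_argv(const SketchPreConfig *pre)
{
    assert(pre != NULL);
    return pre->argc > 0 && pre->argv != NULL;
}
```

The point of the nesting is that code running after initialization only needs the main config, while the wchar_t* data can be discarded once it has been converted.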

I didn't check all configuration variables to see if it's doable or not, but I wrote a proof-of-concept pull request to show my idea with concrete (and working!) code.

See attached PR.
msg330039 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-11-18 10:04
I like where you're going with this, but would you be willing to write an update to PEP 432 to sketch out in advance what you now think the end state is going to look like?

Merging the general structure of the draft PEP 432 implementation, to make it possible to start migrating settings and see what's viable in practice, has pretty much worked out as hoped. However, we've diverged far enough from that structure now that we can't credibly claim to be working towards the current PEP draft as the new multi-stage initialisation API.

Changing the proposal (and adding yourself as a co-author) is fine - that was the whole point of enabling initial development as a private API in the first place.
msg330069 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-18 22:48
> I like where you're going with this, but would you be willing to write an update to PEP 432 to sketch out in advance what you now think the end state is going to look like?

Sadly, I'm unable to design in advance what the final state will be.

Python initialization is a giant beast, full of traps, with many practical issues.

I'm moving slowly, step by step.

For example, this issue "only" moves wchar_t* out of _PyCoreConfig, but Eric Snow told me that he (or you, Nick, I don't recall) would prefer not to use "Unicode" during the very first initialization stage. wchar_t* is already Unicode. I'm unable to see yet how to have 3 stages:

1) no unicode
2) C structures, wchar_t*
3) Python objects

Currently, (1)+(2) is _PyCoreConfig and (3) is _PyMainInterpreterConfig.

I prefer to work directly on the code, to be sure of having a working implementation, rather than working on paper without knowing whether the design can be implemented :-)

One drawback is that it requires more steps, but from my point of view we control the risk better, since it's possible to roll back if a small change turns out to be a mistake.
msg330252 - Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-11-22 12:53
I didn't know what was possible when I wrote PEP 432 either - instead, I wrote down an initial concept for what I *wanted*, and then started exploring the code to find out the barriers to achieving that.

We know enough now to know that the original design concept isn't technically feasible, but that's OK - the general idea was just to get to a point where the startup code is better tested, easier to maintain, and easier to control in an embedding application, and everything outside that is negotiable.

The problem with the purely bottom-up approach is that we may end up with something that's better tested and easier to maintain, but find out that it hasn't actually helped us get to a point where we can make the interpreter easier for embedding applications to manage.

As far as Unicode goes, it isn't Unicode as a concept that's problematic, it's specifically the CPython Unicode type: that needs hash randomisation configured, and that means we need to have already processed the input settings that can affect the hash seed. And unlike UTF-8 mode, where there's a comparatively limited set of strings to recreate with a different decoding step, there's no escape hatch to let you cleanly recreate all previously created string objects with a different basis for their hash.
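
To make the hash seed problem concrete, here is a minimal C sketch. It assumes a string object that caches its hash when it is created; the seeded FNV-1a function is only a stand-in and is not the algorithm CPython uses, and all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seeded FNV-1a: a stand-in hash for illustration, NOT CPython's algorithm. */
static uint64_t sketch_hash(const char *text, uint64_t seed)
{
    assert(text != NULL);
    uint64_t h = UINT64_C(0xcbf29ce484222325) ^ seed;   /* offset basis mixed with seed */
    for (const unsigned char *p = (const unsigned char *)text; *p != 0; p++) {
        h ^= *p;
        h *= UINT64_C(0x100000001b3);                   /* FNV prime */
    }
    return h;
}

/* Hypothetical string object that caches its hash at creation time. */
typedef struct {
    const char *text;
    uint64_t cached_hash;
} SketchStr;

/* Create a string using whatever seed is active at that moment. */
static SketchStr sketch_str_new(const char *text, uint64_t seed_at_creation)
{
    SketchStr s = { text, sketch_hash(text, seed_at_creation) };
    return s;
}

/* Returns 1 if the cached hash no longer matches the active seed. */
static int sketch_str_is_stale(const SketchStr *s, uint64_t active_seed)
{
    return s->cached_hash != sketch_hash(s->text, active_seed);
}
```

A string created with a provisional seed becomes stale as soon as the real seed is read from the configuration, which is why the settings that affect the seed have to be processed before any such object exists.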
msg330263 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-22 14:35
Hum, my split is incomplete. From a high-level point of view, the initialization should be done in these steps:

1) select memory allocator, config made of C char* (bytes) and int types
2) select encodings, add wchar_t* (Unicode) strings to the config
3) compute the "path configuration" (used to initialize importlib and sys.path)
4) apply the config to Python

Step (3) should be optional. Currently, the path configuration can be set in _PyCoreConfig to avoid the need to "calculate" it (an operation which accesses the filesystem).

Technically, we may have to use wchar_t* in (1), since Windows uses wmain() which gets argv as wchar_t** (and environment variables as wchar_t* as well).
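
The four steps listed above can be sketched as a small state machine in C. This is only an illustration under stated assumptions: the Staged* types, the staged_* functions and the placeholder path value are hypothetical and do not correspond to any CPython API.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

typedef enum {
    STAGE_NONE, STAGE_ALLOCATOR, STAGE_ENCODING, STAGE_PATHS, STAGE_APPLIED
} StagedStage;

typedef struct {
    StagedStage stage;
    const char *allocator;              /* step 1: char* and int only */
    int utf8_mode;
    const wchar_t *program_name;        /* step 2: wchar_t* strings appear */
    const wchar_t *module_search_path;  /* step 3: may be preset by the embedder */
    int paths_computed;                 /* 1 if step 3 had to compute the path */
} StagedConfig;

/* Each step returns 0 on success and -1 when called out of order. */
static int staged_select_allocator(StagedConfig *cfg, const char *name)
{
    assert(cfg != NULL);
    if (cfg->stage != STAGE_NONE) return -1;
    cfg->allocator = name;
    cfg->stage = STAGE_ALLOCATOR;
    return 0;
}

static int staged_select_encoding(StagedConfig *cfg, int utf8_mode,
                                  const wchar_t *program_name)
{
    if (cfg->stage != STAGE_ALLOCATOR) return -1;
    cfg->utf8_mode = utf8_mode;
    cfg->program_name = program_name;
    cfg->stage = STAGE_ENCODING;
    return 0;
}

/* Step 3 is optional work: a preset path skips the computation. */
static int staged_compute_paths(StagedConfig *cfg)
{
    if (cfg->stage != STAGE_ENCODING) return -1;
    if (cfg->module_search_path == NULL) {
        cfg->module_search_path = L"<computed>";  /* placeholder, no filesystem access here */
        cfg->paths_computed = 1;
    }
    cfg->stage = STAGE_PATHS;
    return 0;
}

static int staged_apply(StagedConfig *cfg)
{
    if (cfg->stage != STAGE_PATHS) return -1;
    cfg->stage = STAGE_APPLIED;
    return 0;
}
```

An embedder that already knows its module search path would set module_search_path before calling staged_compute_paths(), so the step completes without computing anything.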
msg331987 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-12-17 11:28
Looking at this issue again, I'm not sure what should be done, what the proper design is, what should stay after Python initialization, etc. I prefer to abandon this change and maybe retry it later.

I have a more advanced version in this branch of my fork:
Date                 User      Action  Args
2022-04-11 14:59:08  admin     set     github: 79447
2018-12-17 11:28:01  vstinner  set     status: open -> closed
                                       resolution: out of date
                                       messages: + msg331987
                                       stage: patch review -> resolved
2018-11-22 14:35:20  vstinner  set     messages: + msg330263
2018-11-22 12:53:44  ncoghlan  set     messages: + msg330252
2018-11-18 22:48:20  vstinner  set     messages: + msg330069
2018-11-18 10:04:16  ncoghlan  set     nosy: + lukasz.langa
                                       messages: + msg330039
2018-11-16 14:44:40  vstinner  set     keywords: + patch
                                       stage: patch review
                                       pull_requests: + pull_request9823
2018-11-16 14:44:06  vstinner  create