classification
Title: Internal C API: pass the memory allocator in a context
Type: enhancement Stage: resolved
Components: Interpreter Core Versions: Python 3.8
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, ncoghlan, vstinner
Priority: normal Keywords: patch

Created on 2018-11-16 14:25 by vstinner, last changed 2018-11-22 16:31 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10574 closed vstinner, 2018-11-16 14:26
Messages (4)
msg330001 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-16 14:25
Currently, the Python initialization code base is full of:

static void
pymain_clear_cmdline(_PyMain *pymain, _PyCmdline *cmdline)
{
    PyMemAllocatorEx old_alloc;
    _PyMem_SetDefaultAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
    ...
    PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
}

It's a hack that I wrote when I worked on the Python initialization code, to make sure that we use the same memory allocator to allocate and release memory. The problem is that Python allocates memory at startup, then the memory allocator can be changed in Py_Initialize() (ex: by PYTHONMALLOC environment variable), but at exit, we need to reuse the same memory allocator that we used to allocate memory early.

Instead of "forcing" the memory allocator to "default allocator", I propose to add a new "context" structure which contains the memory allocator (PyMemAllocatorEx structure). Example:

typedef struct {
    /* Enable UTF-8 mode?
       Set by -X utf8 command line option and PYTHONUTF8 environment variable.
       If set to -1 (default), inherit Py_UTF8Mode value. */
    int utf8_mode;

    PyMemAllocatorEx raw_alloc;
} _PyConfigCtx;

The context should be passed to any function allocating directly or indirectly memory.

Py_Main(), "core config" and "path config" are good target for these changes. These functions are used during Python initialization.

I know that Eric Snow would like to pass a "context" to functions of the C API. I don't want to make a giant change of the C API. I only want to modify the internal C API, and only modifiy Python internal. The context that I propose is designed for early Python initialization, and fix the practical problem of memory allocator.

Later we can discussed how _PyRuntime and the pointer to the current interpreter can be added into this context, or if a new context should be created.

Attached PR is a proof-of-concept of my idea, but the giant PR might be splitted into smaller changes to more incremental changes.
msg330015 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-17 00:24
Passing the memory allocator is my first goal.

My second goal is to cleanup the code reading the configuration, pymain_read_conf() in Modules/main.c:

        int utf8_mode = config->ctx.utf8_mode;
        int encoding_changed = 0;

        (...)

        /* bpo-34207: Py_DecodeLocale() and Py_EncodeLocale() depend
           on Py_UTF8Mode and Py_LegacyWindowsFSEncodingFlag. */
        Py_UTF8Mode = config->ctx.utf8_mode;
#ifdef MS_WINDOWS
        Py_LegacyWindowsFSEncodingFlag = config->legacy_windows_fs_encoding;
#endif

        (...)

        /* Reset the configuration before reading again the configuration,
           just keep UTF-8 Mode value. */
        int new_utf8_mode = config->ctx.utf8_mode;
        int new_coerce_c_locale = config->coerce_c_locale;
        if (_PyCoreConfig_Copy(config, &save_config) < 0) {
            pymain->err = _Py_INIT_NO_MEMORY();
            goto done;
        }
        pymain_clear_cmdline(pymain, cmdline);
        config->ctx.utf8_mode = new_utf8_mode;
        config->coerce_c_locale = new_coerce_c_locale;

        /* The encoding changed: read again the configuration
           with the new encoding */


My main concern is: "bpo-34207: Py_DecodeLocale() and Py_EncodeLocale() depend on Py_UTF8Mode and Py_LegacyWindowsFSEncodingFlag". Python initialization code should depend on global variables.
msg330041 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-11-18 10:16
I think the idea makes sense, but find the proposed name potentially confusing for two reasons:

1. It isn't only about configuration, it's about interpreter initialisation state in general
2. We use "context" for several other purposes already (most notably with statement context managers and async contexts)

Given the proposed "_PyPreConfig" construct, how would you feel about calling this "_PyPreInitState"? The idea there being that like PreConfig, the PreInitState would be discarded once the interpreter was fully initialised - the PreInitState would only be needed to handle cleanup *while* initialising (either of temporary values, or in the event of initialisation failing and needing to undo previous allocations).

My comment from #32566 also applies here: PEP 432 is now lagging so far behind the state of the private API implementation that it really needs an update to better reflect where you're wanting to take the public initialisation API following these changes.
msg330269 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-22 16:31
Honestly, I'm not convinced that the change is worth it. It looks like a giant complication for a very specific use case, whereas the current _PyMem_SetDefaultAllocator() solution "works". I close the issue.

Moreover, bpo-35266 conflicts with issue, and bpo-35266 looks more important than this issue, since it impacts the future public API and remove redundancy.
History
Date User Action Args
2018-11-22 16:31:12vstinnersetstatus: open -> closed
resolution: rejected
stage: patch review -> resolved
2018-11-22 16:31:05vstinnersetmessages: + msg330269
2018-11-18 10:16:30ncoghlansetmessages: + msg330041
2018-11-17 00:24:03vstinnersetmessages: + msg330015
2018-11-16 14:26:33vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request9822
2018-11-16 14:25:45vstinnercreate