classification
Title: API for setting the memory allocator used by Python
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.4
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Rhamphoryncus, amaury.forgeotdarc, barry, gregory.p.smith, haypo, jlaurila, jszakmeister, kristjan.jonsson, ncoghlan, neilo, pitrou, pjmcnerney, rhettinger, tlesher
Priority: normal Keywords: patch

Created on 2008-07-09 19:48 by jlaurila, last changed 2013-03-11 11:30 by kristjan.jonsson.

Files
File name Uploaded Description Edit
py_setallocators.patch haypo, 2013-03-06 12:32 review
Messages (17)
msg69482 - (view) Author: Jukka Laurila (jlaurila) Date: 2008-07-09 19:48
Currently Python always uses the C library malloc/realloc/free as the
underlying mechanism for requesting memory from the OS, but especially
on memory-limited platforms it is often desirable to be able to override
the allocator and to redirect all Python's allocations to use a special
heap. This will make it possible to free memory back to the operating
system without restarting the process, and to reduce fragmentation by
separating Python's allocations from the rest of the program.

The proposal is to make it possible to set the allocator used by the
Python interpreter by calling the following function before Py_Initialize():

void Py_SetAllocator(void* (*alloc)(size_t), void* (*realloc)(void*,
size_t), void (*free)(void*))

Direct function calls to malloc/realloc/free in obmalloc.c must be
replaced with calls through the function pointers set through this
function. By default these would of course point to the C stdlib
malloc/realloc/free.
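
As a rough illustration of the indirection described above, here is a minimal sketch in C. It is not CPython code: the `sketch_*` names and the table layout are assumptions made for this example, and `sketch_set_allocator()` only stands in for the proposed Py_SetAllocator().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table of allocator entry points; not CPython code. */
typedef struct {
    void *(*alloc_fn)(size_t);
    void *(*realloc_fn)(void *, size_t);
    void (*free_fn)(void *);
} sketch_allocator;

/* By default the pointers refer to the C stdlib functions. */
static sketch_allocator sketch_current = { malloc, realloc, free };

/* Stand-in for the proposed Py_SetAllocator(); meant to be called once,
 * before anything has been allocated through the table. */
void sketch_set_allocator(void *(*alloc_fn)(size_t),
                          void *(*realloc_fn)(void *, size_t),
                          void (*free_fn)(void *))
{
    assert(alloc_fn != NULL && realloc_fn != NULL && free_fn != NULL);
    sketch_current.alloc_fn = alloc_fn;
    sketch_current.realloc_fn = realloc_fn;
    sketch_current.free_fn = free_fn;
}

/* What call sites would use instead of calling malloc/realloc/free directly. */
void *sketch_malloc(size_t n) { return sketch_current.alloc_fn(n); }
void *sketch_realloc(void *p, size_t n) { return sketch_current.realloc_fn(p, n); }
void sketch_free(void *p) { sketch_current.free_fn(p); }

/* Example replacement: counts requests, then forwards to malloc(). */
size_t sketch_alloc_calls = 0;
void *sketch_counting_alloc(size_t n)
{
    sketch_alloc_calls++;
    return malloc(n);
}
```

An embedding application would call `sketch_set_allocator(sketch_counting_alloc, realloc, free)` before the first allocation. Every later `sketch_malloc()` call then costs one extra pointer dereference.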
msg69483 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2008-07-09 19:55
Is registering pointers to functions really necessary, or would defining
macros work as well? From a performance perspective I would like to
avoid having a pointer indirection step every time malloc/realloc/free
is called.

I guess my question becomes, Jukka, is this more for alternative
implementations of Python where changes to source are already expected,
or for apps that embed Python where a change of malloc/realloc/free
varies from app to app that dynamically loads Python?
msg69484 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-09 20:06
How would this allow you to free all memory?  The interpreter will still
reference it, so you'd have to have called Py_Finalize already, and
promise not to call Py_Initialize afterwards.  This further supposes the
process will live a long time after killing off the interpreter, but in
that case I recommend putting python in a child process instead.
msg69494 - (view) Author: Jukka Laurila (jlaurila) Date: 2008-07-10 08:05
Brett, the ability to define the allocator dynamically at runtime could
be a compile time option, turned on by default only on small memory
platforms. On most platforms you can live with plain old malloc and may
want to avoid the indirection. If no other platform is interested in
this, we can just make it a Symbian-specific extension but I wanted to
see if there's general interest in this.

The application would control the lifecycle of the Python heap, and this
seemed like the most natural way for the application to tell the
interpreter which heap instance to use.

Adam, the cleanup would work by freeing the entire heap used by Python
after calling Py_Finalize. In the old PyS60 code we made Python 2.2.2
clean itself completely by freeing the Python-specific heap and making
sure all pointers to heap-allocated items are properly reinitialized.
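
The "free the entire heap" idea can be sketched with a simple bump arena. This is only an illustration under stated assumptions: the `sketch_heap` type and its functions are hypothetical, the arena borrows its region from malloc(), and a real port would more likely use a platform heap API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed alignment for every block handed out by the arena. */
#define SKETCH_ALIGN ((size_t)16)

/* Hypothetical private heap: one contiguous region, bump-allocated. */
typedef struct {
    unsigned char *base;  /* start of the region, NULL when destroyed */
    size_t size;          /* total bytes in the region */
    size_t used;          /* bytes handed out so far */
} sketch_heap;

/* Reserve the whole region up front; returns 0 on success, -1 on failure. */
int sketch_heap_init(sketch_heap *h, size_t size)
{
    assert(h != NULL);
    h->base = malloc(size);
    h->size = (h->base != NULL) ? size : 0;
    h->used = 0;
    return (h->base != NULL) ? 0 : -1;
}

/* Hand out the next aligned block, or NULL when the region is exhausted. */
void *sketch_heap_alloc(sketch_heap *h, size_t n)
{
    size_t start = (h->used + (SKETCH_ALIGN - 1)) & ~(SKETCH_ALIGN - 1);
    if (start > h->size || n > h->size - start)
        return NULL;
    h->used = start + n;
    return h->base + start;
}

/* Release every block at once by returning the whole region. */
void sketch_heap_destroy(sketch_heap *h)
{
    free(h->base);
    h->base = NULL;
    h->size = 0;
    h->used = 0;
}
```

Nothing is freed block by block. The application owns the region's lifetime and drops it in one call, which is only safe once no pointer into the region is still in use.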

Yes, there are various static pointers that are initially set to NULL,
initialized to point at things on the heap and not reset to NULL at
Py_Finalize, and these are currently an obstacle to calling
Py_Initialize again. I'm considering submitting a separate ticket about
that since it seems like the ability to free the heap combined with the
ability to reinitialize the static pointers could together make full
cleanup possible.
msg69497 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2008-07-10 09:59
Given where we are in the release cycle, I've bumped the target releases
to 2.7/3.1. So Symbian are probably going to have to do something
port-specific anyway in order to get 2.6/3.0 up and running.

And in terms of hooking into this kind of thing, some simple macros that
can be overridden in pyport.h (as Brett suggested) may be a better idea
than baking any specific approach into the core interpreter.
msg69499 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2008-07-10 10:12
I think it is reasonable to get a macro definition change into 2.6.
The OP's request is essential for his application (running Python
on Nokia phones) and it would be a loss to wait two years for this.
Also, his request for a macro will enable another important piece
of functionality -- allowing a build to intercept and instrument all
calls to the memory allocator.

Barry, can you rule on whether to keep this open for consideration in
2.6? It seems daft to postpone this discussion indefinitely. If we
can agree on a simple, non-invasive solution while there is still
another beta to come, then it makes sense to proceed.
msg69511 - (view) Author: Adam Olsen (Rhamphoryncus) Date: 2008-07-10 16:57
Basically you just want to kick the malloc implementation into doing
some housekeeping, freeing its caches?  I'm kinda surprised you don't
add the hook directly to your libc's malloc.

IMO, there's no use-case for this until Py_Finalize can completely tear
down the interpreter, which requires a lot of special work (killing(!)
daemon threads, unloading C modules, etc), and nobody intends to do that
at this point.

The practical alternative, as I said, is to run Python in a subprocess
and let the OS clean up after us.
msg78995 - (view) Author: Neil Richardson (neilo) Date: 2009-01-03 19:55
I'll be in agreement here. I integrated Python into a game engine not
too long ago, and had to do a fair chunk of work to isolate Python
into its own heap, given that fragmentation on low-memory systems can
be a bit of a killer. This would also make future upgrades a heck of a lot
easier, as there'd be no need to search for all references and
carefully replace them all.
msg79309 - (view) Author: Jukka Laurila (jlaurila) Date: 2009-01-07 09:18
Brett is right. Macroing the memory allocator is a better choice than
forcing indirection on all platforms. We did this in Python for S60,
using the macros PyCore_{MALLOC,REALLOC,FREE}_FUNC for the interpreter's
allocations, and then redirected those to a mechanism that allows setting
the allocator at runtime.

Sorry we don't have a clean patch at present for this change only, but
in case anyone's interested the full source is at
https://garage.maemo.org/frs/?group_id=854
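
For readers who want a picture of the macro approach without digging through the full source, here is a minimal sketch. The `SKETCH_*_FUNC` names are hypothetical and only modeled on the PyCore_*_FUNC idea; this is not the PyS60 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical macro names. A port would define them before this point
 * (for example in its port header); otherwise they default to the C
 * stdlib, so the default build pays no indirection cost. */
#ifndef SKETCH_MALLOC_FUNC
#define SKETCH_MALLOC_FUNC  malloc
#endif
#ifndef SKETCH_REALLOC_FUNC
#define SKETCH_REALLOC_FUNC realloc
#endif
#ifndef SKETCH_FREE_FUNC
#define SKETCH_FREE_FUNC    free
#endif

/* Interpreter-side wrappers; the macro is resolved at compile time. */
void *sketch_core_malloc(size_t n) { return SKETCH_MALLOC_FUNC(n); }
void *sketch_core_realloc(void *p, size_t n) { return SKETCH_REALLOC_FUNC(p, n); }
void sketch_core_free(void *p) { SKETCH_FREE_FUNC(p); }

/* Round trip through all three wrappers; returns 0 on success. */
int sketch_core_selfcheck(void)
{
    char *p = sketch_core_malloc(8);
    char *q;
    if (p == NULL)
        return -1;
    p[0] = 'x';
    q = sketch_core_realloc(p, 64);
    if (q == NULL) {
        sketch_core_free(p);
        return -1;
    }
    assert(q[0] == 'x');  /* realloc preserves the existing contents */
    sketch_core_free(q);
    return 0;
}
```

A small-memory port could define the three macros to functions that dispatch through runtime-settable pointers, which keeps the cost of indirection confined to that port.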
msg91957 - (view) Author: PJ McNerney (pjmcnerney) Date: 2009-08-25 19:33
Has the ability to set the memory allocator been added to Python 2.7/3.1?

Thanks,
PJ
msg142981 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-08-25 17:16
All this needs is a patch.
Note that there are some places where we call malloc()/free() without going through our abstraction API. This is not in allocation-heavy paths, though.
msg183587 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-03-06 12:32
I attached a patch that I wrote for Wyplay: py_setallocators.patch. The patch adds two functions:

PyAPI_FUNC(int) Py_GetAllocators(
    char api,
    void* (**malloc_p) (size_t),
    void* (**realloc_p) (void*, size_t),
    void (**free_p) (void*)
    );

PyAPI_FUNC(int) Py_SetAllocators(
    char api,
    void* (*malloc) (size_t),
    void* (*realloc) (void*, size_t),
    void (*free) (void*)
    );

Where api is one of these values:

 - PY_ALLOC_SYSTEM_API: the system API (malloc, realloc, free)
 - PY_ALLOC_MEM_API: the PyMem_Malloc() API
 - PY_ALLOC_OBJECT_API: the PyObject_Malloc() API

These functions are used by the pytracemalloc project to hook the PyMem_Malloc() and PyObject_Malloc() APIs. pytracemalloc traces all Python memory allocations to compute statistics per Python file.
https://pypi.python.org/pypi/pytracemalloc

Wyplay is also using Py_SetAllocators() internally to completely replace the system allocators *before* Python is started. We have another private patch on Python that adds a function which installs its own memory allocators; it is called before Python starts thanks to an "__attribute__((constructor))" attribute.

--

If you use Py_SetAllocators() to completely replace a memory allocator (any memory allocation API), you have to do it before the first Python memory allocation (before Py_Main()) *or* your memory allocator must be able to recognize a pointer that it did not allocate and pass the operation (realloc or free) on to the previous memory allocator.

For example, PyObject_Free() is able to recognize that a pointer is part of its memory pool, and otherwise falls back to the system allocator (extract of the original code):

    if (Py_ADDRESS_IN_RANGE(p, pool)) {
        ...
        return;
    }
    free(p);
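
The same ownership test can be sketched outside of obmalloc.c. The code below is a simplified illustration, not the Py_ADDRESS_IN_RANGE implementation: the pool, its size, the counters and all `sketch_*` names are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_POOL_SIZE 4096  /* multiple of 16, so blocks stay aligned */

/* Hypothetical fixed pool; the union only forces a generous alignment. */
static union {
    unsigned char bytes[SKETCH_POOL_SIZE];
    long double align_ld;
    long long align_ll;
    void *align_ptr;
} sketch_pool;

static size_t sketch_pool_used;
size_t sketch_pool_frees;      /* frees of pointers owned by the pool */
size_t sketch_fallback_frees;  /* frees forwarded to the system allocator */

/* Ownership test: does p point into the pool's address range? */
int sketch_in_pool(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)sketch_pool.bytes;
    return addr >= lo && addr - lo < SKETCH_POOL_SIZE;
}

/* Serve small requests from the pool, everything else from malloc(). */
void *sketch_alloc(size_t n)
{
    size_t rounded;
    void *p;
    if (n == 0)
        n = 1;
    if (n > SKETCH_POOL_SIZE - sketch_pool_used)
        return malloc(n);  /* pool exhausted or request too large */
    rounded = (n + 15) & ~(size_t)15;
    p = sketch_pool.bytes + sketch_pool_used;
    sketch_pool_used += rounded;
    return p;
}

/* Free only what we own; pass foreign pointers on to free(). */
void sketch_free(void *p)
{
    if (p == NULL)
        return;
    if (sketch_in_pool(p)) {
        sketch_pool_frees++;  /* bump pool: nothing to release per block */
        return;
    }
    sketch_fallback_frees++;
    free(p);
}
```

Because `sketch_free()` forwards pointers it does not own, blocks obtained from malloc() before the allocator was swapped in can still be released safely through it.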

--

If you use Py_SetAllocators() to hook memory allocators (do something before and/or after calling the previous function, *without* touching the pointer or the size), you can do it at any time.
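
A pass-through hook of that kind might look like the sketch below. It does not use the patch itself: `sketch_get_allocators()` and `sketch_set_allocators()` are hypothetical stand-ins that only mirror the shape of the get/set pair, and the counters are there purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table standing in for one allocation API. */
typedef struct {
    void *(*malloc_fn)(size_t);
    void *(*realloc_fn)(void *, size_t);
    void (*free_fn)(void *);
} sketch_allocators;

static sketch_allocators sketch_active = { malloc, realloc, free };

void sketch_get_allocators(sketch_allocators *out) { *out = sketch_active; }
void sketch_set_allocators(const sketch_allocators *in) { sketch_active = *in; }

/* The hook remembers whatever was installed before it. */
static sketch_allocators sketch_previous;
size_t sketch_hook_mallocs;
size_t sketch_hook_frees;

/* Each hook records the call, then delegates with unchanged arguments. */
static void *sketch_hook_malloc(size_t n)
{
    sketch_hook_mallocs++;
    return sketch_previous.malloc_fn(n);
}

static void *sketch_hook_realloc(void *p, size_t n)
{
    return sketch_previous.realloc_fn(p, n);
}

static void sketch_hook_free(void *p)
{
    sketch_hook_frees++;
    sketch_previous.free_fn(p);
}

/* Install the hook on top of the current allocators. */
void sketch_install_hook(void)
{
    sketch_allocators hooks = {
        sketch_hook_malloc, sketch_hook_realloc, sketch_hook_free
    };
    /* Installing twice would make the hook delegate to itself. */
    assert(sketch_active.malloc_fn != sketch_hook_malloc);
    sketch_get_allocators(&sketch_previous);
    sketch_set_allocators(&hooks);
}
```

Since the hook never changes the pointer or the size, a block allocated before `sketch_install_hook()` can be freed after it without any ownership check.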

--

I haven't run a benchmark yet to measure the overhead of the patch on Python performance.

The new functions are neither documented nor tested yet. To test them, we can write a simple hook that traces calls to the memory allocators and then calls the previous memory allocator.
msg183590 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2013-03-06 12:38
To be exhaustive, another patch should be developed to replace all calls to malloc/realloc/free with PyMem_Malloc/PyMem_Realloc/PyMem_Free. PyObject_Malloc() still uses mmap() or malloc() internally, for example.

Other examples of functions calling malloc/realloc/free directly: _PySequence_BytesToCharpArray(), block_new() (of pyarena.c), find_key() (of thread.c), PyInterpreterState_New(), win32_wchdir(), posix_getcwd(), Py_Main(), etc.
msg183591 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2013-03-06 13:41
Some customizable memory allocators I know have an extra parameter "void *opaque" that is passed to all functions:

- in zlib: zalloc and zfree: http://www.zlib.net/manual.html#Usage
- same thing for bz2.
- lzma's ISzAlloc: http://www.asawicki.info/news_1368_lzma_sdk_-_how_to_use.html
- Oracle's OCI: http://docs.oracle.com/cd/B10501_01/appdev.920/a96584/oci15re4.htm

OTOH, expat, libxml, libmpdec don't have this extra parameter.
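
To show what the extra parameter buys, here is a minimal sketch of an allocator interface that carries an opaque pointer. It does not reproduce the exact signatures of any of the libraries listed above; the `sketch_*` types and functions are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator interface with a caller-supplied context. */
typedef struct {
    void *(*alloc_fn)(void *opaque, size_t n);
    void (*free_fn)(void *opaque, void *p);
    void *opaque;  /* passed back unchanged on every call */
} sketch_ctx_allocator;

/* Example context: per-heap bookkeeping without any global state. */
typedef struct {
    size_t live_blocks;
} sketch_heap_stats;

static void *sketch_stats_alloc(void *opaque, size_t n)
{
    sketch_heap_stats *stats = opaque;
    void *p = malloc(n);
    if (p != NULL)
        stats->live_blocks++;
    return p;
}

static void sketch_stats_free(void *opaque, void *p)
{
    sketch_heap_stats *stats = opaque;
    if (p == NULL)
        return;
    assert(stats->live_blocks > 0);
    stats->live_blocks--;
    free(p);
}

/* Bind the two functions to one particular stats record. */
sketch_ctx_allocator sketch_make_stats_allocator(sketch_heap_stats *stats)
{
    sketch_ctx_allocator a = { sketch_stats_alloc, sketch_stats_free, stats };
    return a;
}
```

Two allocators built from different `sketch_heap_stats` records keep separate counts even though they share the same functions, which is hard to achieve with the three-pointer signature unless the state lives in globals.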
msg183947 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-03-11 10:19
At CCP we have something similar. We are embedding Python in the UnrealEngine on the PS3 and need to route everything through their allocators. To that end, we added an API similar to the OP's, but more flexible:

/* Support for custom allocators */
typedef void *(*PyCCP_Malloc_t)(size_t size, void *arg, const char *file, int line, const char *msg);
typedef void *(*PyCCP_Realloc_t)(void *ptr, size_t size, void *arg, const char *file, int line, const char *msg);
typedef void (*PyCCP_Free_t)(void *ptr, void *arg, const char *file, int line, const char *msg);
typedef size_t (*PyCCP_Msize_t)(void *ptr, void *arg);
typedef struct PyCCP_CustomAllocator_t
{
    PyCCP_Malloc_t  pMalloc;
    PyCCP_Realloc_t pRealloc;
    PyCCP_Free_t    pFree;
    PyCCP_Msize_t   pMsize;    /* can be NULL, or return -1 if no size info is avail. */
    void            *arg;      /* opaque argument for the functions */
} PyCCP_CustomAllocator_t;

/* To set an allocator!  use 0 for the regular allocator, 1 for the block allocator.
 * pass a null pointer to reset to internal default
 */
PyAPI_FUNC(void) PyCCP_SetAllocator(int which, const PyCCP_CustomAllocator_t *);

For a module to install itself as a "hook" at runtime, this approach can be extended by querying the current allocator, so that such a hook can then delegate calls to the previous allocator.

The "block" allocator here is intended as the underlying allocator to be used by obmalloc.c. Depending on the platform, it can then allocate aligned virtual memory directly, which is more efficient than layering that on top of a malloc-like allocator.

There are areas in CPython that use malloc() directly. Not all of them actually need to, but to cope with them we change them all to new RAW API calls (using preprocessor macros).
Essentially, malloc() maps to PyCCP_RawMalloc() or PyMem_MALLOC_INNER() (both local additions) based on whether the particular site using malloc() requires a truly GIL-free malloc or not.

For this reason, the custom allocators mentioned cannot be assumed to be called with the GIL held. However, the system above can easily be extended so that there are GIL and non-GIL versions of the 'regular' allocator.

I'll put details of the stuff we have done for EVE Online / Dust 514 on my blog.  It is this, but much much more too.

Hopefully we can arrive at a way to abstract memory allocation away from Python in a flexible and extendible manner :)
msg183950 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2013-03-11 11:15
Note that I'm definitely open to including extra settings to set up custom allocators as part of Py_CoreConfig in PEP 432 (http://www.python.org/dev/peps/pep-0432/#pre-initialization-phase).

I don't really want to continue the tradition of additional PySet_* APIs with weird conditions on when they have to be called, though (trying to prevent more of that kind of organic growth in complexity is why I wrote PEP 432 in the first place)
msg183951 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2013-03-11 11:30
Absolutely. Although there is a very useful scenario where this could be considered a run-time setting:

  # turboprofiler.py
  # Load up the memory hooker which will supply us with all the info
  import _turboprofiler
  _turboprofiler.hookup()

Perhaps people interested in memory optimizations and profiling could hook up at pycon?  It is the most common regular query I get from people in my organization:  How can I find out how python is using/leaking/wasting memory?
History
Date | User | Action | Args
2013-03-11 11:30:20 | kristjan.jonsson | set | messages: + msg183951
2013-03-11 11:15:26 | ncoghlan | set | messages: + msg183950
2013-03-11 10:20:00 | kristjan.jonsson | set | messages: + msg183947
2013-03-11 10:01:26 | kristjan.jonsson | set | nosy: + kristjan.jonsson
2013-03-10 16:29:36 | gregory.p.smith | set | nosy: + gregory.p.smith
2013-03-06 13:41:32 | amaury.forgeotdarc | set | nosy: + amaury.forgeotdarc; messages: + msg183591
2013-03-06 12:38:40 | haypo | set | messages: + msg183590
2013-03-06 12:32:44 | haypo | set | versions: + Python 3.4, - Python 3.3
2013-03-06 12:32:32 | haypo | set | files: + py_setallocators.patch; nosy: + haypo; messages: + msg183587; keywords: + patch
2013-01-25 19:27:26 | brett.cannon | set | nosy: - brett.cannon
2011-08-25 17:16:25 | pitrou | set | nosy: + pitrou; messages: + msg142981; versions: + Python 3.3, - Python 3.2
2010-08-09 18:35:18 | terry.reedy | set | versions: - Python 3.1, Python 2.7
2010-02-18 20:30:12 | barry | set | assignee: barry ->
2009-10-01 02:49:52 | tlesher | set | nosy: + tlesher
2009-08-25 19:33:39 | pjmcnerney | set | nosy: + pjmcnerney; messages: + msg91957; versions: + Python 3.2, - Python 2.6, Python 2.5, Python 3.0
2009-05-29 10:14:50 | jszakmeister | set | nosy: + jszakmeister
2009-01-07 09:18:24 | jlaurila | set | messages: + msg79309
2009-01-03 19:55:31 | neilo | set | nosy: + neilo; messages: + msg78995; versions: + Python 2.6, Python 2.5, Python 3.0
2008-07-10 16:57:10 | Rhamphoryncus | set | messages: + msg69511
2008-07-10 10:12:33 | rhettinger | set | assignee: barry; messages: + msg69499; nosy: + barry, rhettinger
2008-07-10 09:59:16 | ncoghlan | set | nosy: + ncoghlan; messages: + msg69497; versions: + Python 3.1, Python 2.7, - Python 2.6, Python 3.0
2008-07-10 08:05:35 | jlaurila | set | messages: + msg69494
2008-07-09 20:06:33 | Rhamphoryncus | set | nosy: + Rhamphoryncus; messages: + msg69484
2008-07-09 19:55:55 | brett.cannon | set | nosy: + brett.cannon; messages: + msg69483
2008-07-09 19:48:52 | jlaurila | create |