Author vstinner
Recipients A. Skrobov, christian.heimes, eryksun, paul.moore, rhettinger, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware
Date 2016-03-08.11:09:20
Message-id <1457435362.73.0.216877293338.issue26415@psf.upfronthosting.co.za>
In-reply-to
Content
> So, apparently, it's not the nodes themselves taking up a disproportionate amount of memory -- it's the heap getting so badly fragmented that 89% of its memory allocation is wasted.

Yeah, the Python parser+compiler makes poor use of the memory allocator. See my "benchmark" for the memory allocator: python_memleak.py.

The classical pattern of memory fragmentation is:

* allocate a lot of small objects
* allocate a few objects (these stay allocated)
* allocate more small objects
* free *all* small objects

All objects must be allocated on the heap, not with mmap(). So each object must be smaller than 128 KB (the usual threshold at which malloc() switches from heap memory to mmap()).
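
To make the pattern concrete, here is a minimal sketch using a toy model of a brk-style heap. It is not python_memleak.py and it does not measure the real allocator. ToyHeap, run_pattern and the sizes are made up for illustration, and the model assumes that memory can only be returned to the OS by lowering the top of the heap and that freed holes are never reused.

```python
class ToyHeap:
    """Toy brk-style heap: blocks are kept in address order and only
    free blocks at the very top can be given back to the OS.
    Simplification: freed holes in the middle are never reused."""

    def __init__(self):
        self.blocks = []  # each block is [size, is_live]

    def alloc(self, size):
        block = [size, True]
        self.blocks.append(block)  # new blocks always go on top
        return block

    def free(self, block):
        block[1] = False
        # Trim the heap: drop free blocks only while they sit on top.
        while self.blocks and not self.blocks[-1][1]:
            self.blocks.pop()

    def heap_size(self):
        return sum(size for size, _ in self.blocks)

    def live_bytes(self):
        return sum(size for size, live in self.blocks if live)


def run_pattern(n_small=1000, small_size=64, n_kept=3, kept_size=1024):
    heap = ToyHeap()
    # 1. allocate a lot of small objects
    small = [heap.alloc(small_size) for _ in range(n_small)]
    # 2. allocate a few objects that are never freed
    kept = [heap.alloc(kept_size) for _ in range(n_kept)]
    # 3. allocate more small objects
    small += [heap.alloc(small_size) for _ in range(n_small)]
    # 4. free *all* small objects
    for block in small:
        heap.free(block)
    return heap.heap_size(), heap.live_bytes()


if __name__ == "__main__":
    size, live = run_pattern()
    print("heap size:", size, "live bytes:", live)
```

In this model the few surviving objects sit above the first batch of small objects, so that whole region stays mapped even though nothing in it is in use.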

We can try various hacks to reduce the fragmentation, but IMHO the only real fix is to use a different memory allocator for the compiler and then free everything allocated by the parser+compiler at once.

We already have an "arena" memory allocator: Include/pyarena.h, Python/pyarena.c. It is already used by the parser+compiler, but in practice it's only used for AST objects. The parser uses the PyMem allocator API (e.g. PyMem_Malloc).
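
The "free everything at once" idea behind an arena can be sketched as follows. This is only an illustration of the technique, not the API or implementation of Python/pyarena.c. ToyArena, its chunk size and the (chunk index, offset) handles are hypothetical.

```python
class ToyArena:
    """Toy arena: hands out space from large chunks and releases
    every chunk in one step. Individual allocations are never freed."""

    def __init__(self, chunk_size=8192):
        self.chunk_size = chunk_size
        self.chunks = []   # large buffers owned by the arena
        self.offset = 0    # next free position in the last chunk

    def alloc(self, size):
        # Start a new chunk when the current one cannot hold the request.
        if not self.chunks or self.offset + size > len(self.chunks[-1]):
            self.chunks.append(bytearray(max(size, self.chunk_size)))
            self.offset = 0
        start = self.offset
        self.offset += size
        # Return a handle: (index of the chunk, offset inside it).
        return (len(self.chunks) - 1, start)

    def free_all(self):
        # Drop every chunk at once; no per-object bookkeeping is needed.
        self.chunks.clear()
        self.offset = 0
```

Because all allocations made during one parse+compile would live in a handful of large chunks, releasing them leaves no small holes scattered between long-lived objects.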