Message 261345 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	vstinner
Recipients	A. Skrobov, christian.heimes, eryksun, paul.moore, rhettinger, serhiy.storchaka, steve.dower, tim.golden, vstinner, zach.ware
Date	2016-03-08.11:09:20
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1457435362.73.0.216877293338.issue26415@psf.upfronthosting.co.za>
In-reply-to

Content
> So, apparently, it's not the nodes themselves taking up a disproportionate amount of memory -- it's the heap getting so badly fragmented that 89% of its memory allocation is wasted. Yeah, the Python parser+compiler badly uses the memory allocator. See my "benchmark" for the memory allocator: python_memleak.py. The classical pattern of memory fragmentation is: * allocate a lot of small objects * allocate a few objects * allocate more small objects * free all small objects All objects must allocated on the heap, not mmap(). So the maximum size of a single object must be 128 KB (usual threshold used in malloc() to switch beetween the heap memory and mmap). We can try various hacks to reduce the fragmentation, but IMHO the only real fix is to use a different memory allocator for the compiler and then free everything allocated by the parser+compiler at once. We already have an "arena" memory allocator: Include/pyarena.h, Python/pyarena.c. It is already used by the parser+compiler, but it's only used for AST objects in practice. The parser uses the PyMem allocator API (ex: PyMem_Malloc).

> So, apparently, it's not the nodes themselves taking up a disproportionate amount of memory -- it's the heap getting so badly fragmented that 89% of its memory allocation is wasted.

Yeah, the Python parser+compiler badly uses the memory allocator. See my "benchmark" for the memory allocator: python_memleak.py.

The classical pattern of memory fragmentation is:

* allocate a lot of small objects
* allocate a few objects
* allocate more small objects
* free *all* small objects

All objects must allocated on the heap, not mmap(). So the maximum size of a single object must be 128 KB (usual threshold used in malloc() to switch beetween the heap memory and mmap).

We can try various hacks to reduce the fragmentation, but IMHO the only real fix is to use a different memory allocator for the compiler and then free everything allocated by the parser+compiler at once.

We already have an "arena" memory allocator: Include/pyarena.h, Python/pyarena.c. It is already used by the parser+compiler, but it's only used for AST objects in practice. The parser uses the PyMem allocator API (ex: PyMem_Malloc).

History
Date	User	Action	Args
2016-03-08 11:09:22	vstinner	set	recipients: + vstinner, rhettinger, paul.moore, christian.heimes, tim.golden, zach.ware, serhiy.storchaka, eryksun, steve.dower, A. Skrobov
2016-03-08 11:09:22	vstinner	set	messageid: <1457435362.73.0.216877293338.issue26415@psf.upfronthosting.co.za>
2016-03-08 11:09:22	vstinner	link	issue26415 messages
2016-03-08 11:09:22	vstinner	create