This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

classification
Title: Enhance obmalloc allocation strategy
Type: resource usage Stage:
Components: Interpreter Core Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: kristjan.jonsson, larry, neologix, pitrou, tim.peters, vstinner
Priority: normal Keywords: patch

Created on 2014-04-14 18:23 by kristjan.jonsson, last changed 2022-04-11 14:58 by admin.

Files
File name Uploaded Description Edit
obmalloc.patch kristjan.jonsson, 2014-04-14 18:23 review
obmalloc.patch kristjan.jonsson, 2014-04-15 15:45 review
Messages (12)
msg216156 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-14 18:23
A new allocation policy, the "lowest address strategy", reduces memory fragmentation in obmalloc.  Pools with available memory are chosen by lowest-address preference.  This increases the likelihood that unused pools are released back to their corresponding arenas.  Arenas with available pools are similarly chosen by lowest address.
This significantly reduces fragmentation in programs with dynamic memory usage, e.g. long-running programs.  Initial tests also indicate some minor performance benefits in pybench, probably due to better cache behaviour.
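The pool-selection mechanism described above can be sketched in miniature.  This is a hypothetical toy model, not the actual obmalloc C code (which uses linked lists of pools, not a heap): the set of pools that still have free blocks is kept ordered by address, so every allocation is served from the lowest-address pool with room, and a pool that fills up drops out of the usable set until a block in it is freed.

```python
import heapq

# Toy sketch of the "lowest address strategy" (NOT the real obmalloc code):
# keep pools that still have free blocks in a min-heap keyed by address, so
# each allocation comes from the lowest-address pool with room.

class Pool:
    def __init__(self, addr, capacity):
        self.addr = addr
        self.capacity = capacity
        self.used = 0

    def __lt__(self, other):            # heap order: lowest address first
        return self.addr < other.addr

def alloc(usable):
    """Take one block from the lowest-address pool that has room."""
    pool = usable[0]
    pool.used += 1
    if pool.used == pool.capacity:      # pool became full: leave usable set
        heapq.heappop(usable)
    return pool.addr

def free(usable, pool):
    """Return one block; a full pool re-enters the usable set by address."""
    was_full = pool.used == pool.capacity
    pool.used -= 1
    if was_full:
        heapq.heappush(usable, pool)

pools = [Pool(0x3000, 2), Pool(0x1000, 2), Pool(0x2000, 2)]
usable = list(pools)
heapq.heapify(usable)
addrs = [alloc(usable) for _ in range(4)]
# The first four blocks all come from the two lowest-address pools:
# [0x1000, 0x1000, 0x2000, 0x2000] -- the 0x3000 pool stays untouched
# and could be released back to its arena.
```

Freeing a block from a low pool and allocating again returns to that low address, which is the property that lets high-address pools drain over time.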
msg216316 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-15 15:45
Updated the patch with suggestions from Larry.
msg216338 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-04-15 16:59
In Python 3, arenas are allocated using mmap(), so wherever the arena ends up in the address space shouldn't matter, should it?
Also, the new strategy adds an O(n) component in the allocator.
msg216440 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-04-16 06:40
Replacing pymalloc with the Windows Low Fragmentation Heap (LFH) allocator was also discussed:
http://bugs.python.org/issue13483#msg148605
msg216441 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-04-16 06:44
It would also be interesting to compare the fragmentation and performance of Python with and without pymalloc, and maybe with other heap allocators like FreeBSD's jemalloc and Google's TCMalloc.
msg216443 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2014-04-16 07:03
> In Python 3, arenas are allocated using mmap(), so wherever the arena ends up in the address space shouldn't matter, should it?

Indeed, although the effect on cache locality isn't clear.
Also, I don't think this solves the problem of having a single object
allocated inside a high address arena preventing the heap from
shrinking (which was the original reason for having the arenas
allocated by mmap).

Anyway, we can only go that far with reference counting (I mean that
you'd need a proper moving garbage collector for this).
msg216529 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-16 18:01
Antoine:  The location of the arenas when they're individually allocated with mmap does not matter, no, but preferring low-address arenas reduces virtual memory fragmentation, since they end up clustered together in memory.  For the usable-arenas list there is no extra O(n) cost, because that list was kept ordered anyway.  The effect of ARENA_STRATEGY is minor, but it helps for it to be consistent with POOL_STRATEGY.

The real win, however, is with POOL_STRATEGY.  Fragmentation is dramatically reduced.  This is demonstrated with Tools/scripts/memcrunch.py, which you can use to experiment with it.  Performance, e.g. of unittests, also goes up.  The new O(n) sorted insertion when a pool becomes 'used' does not seem to matter for that.

Victor: I've tested using the Windows LFH many times before; Python's obmalloc is generally much faster than it.  Annoying :).  It is actually a very good allocator.

The innovation here is the "lowest address strategy", which I have never seen before (it might be known, but I'm not a computer scientist) and which I have experimented with often in the past.  It is surprisingly effective: when there is memory churn, memory usage tends to migrate towards low addresses, freeing up memory.  Go ahead, try the scripts and see what happens.  The proof is in the pudding :)
msg216530 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-16 18:02
Sorry, I meant of course "performance of pybench.py goes up".
msg216533 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-04-16 18:06
> sorry, I meant of course "performance of pybench.py goes up"

pybench is really a very poor (extremely low-level) benchmark.
It would be nice if you could test with the benchmark suite at
http://hg.python.org/benchmarks
msg216544 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-16 18:21
Sure.  I'm flying home from PyCon this afternoon.  I'll produce and tabulate data once I'm back at my workstation.
msg217230 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-04-27 00:22
I spent some nights trying to understand the memory usage of the following Python script:
https://bitbucket.org/haypo/misc/src/31bf03ace91db3998981ee56caf80f09c29991f5/memory/python_memleak.py?at=default

It looks like the weird memory usage (aka "memory fragmentation"?) was fixed in Python 3.3.

> This significantly helps fragmentation in programs with dynamic memory usage, e.g. long running programs.

On which programs? Memory fragmentation depends a lot on how the program allocates memory. For example, if a program has no temporary memory peaks, it should not be a victim of memory fragmentation.

To measure the improvement of such a memory allocator, more benchmarks (speed and fragmentation) should be run than the single test (memcrunch.py, included in the patch) that was written to benchmark the allocator.
msg217353 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2014-04-28 09:42
>> This significantly helps fragmentation in programs with dynamic memory usage, e.g. long running programs.

> On which programs? Memory fragmentation depends a lot on how the program allocates memory. For example, if a program has no temporary memory peaks, it should not be a victim of memory fragmentation.

"Long-running programs," e.g. web servers and so on, where there is churn in memory usage.  As objects are allocated and released with some churn, new objects will be allocated from lower-address pages, while higher-address pages become increasingly likely to be freed as no-longer-used objects are released.  This is the next best thing to a sliding-block (compacting) allocator, and is motivated by the same requirements that make such an allocator (such as PyPy uses) desirable in the first place.


> To measure the improvement of such a memory allocator, more benchmarks (speed and fragmentation) should be run than the single test (memcrunch.py, included in the patch) that was written to benchmark the allocator.

Yes.  memcrunch.py was specifically written to simulate a case where objects are continuously created and released, as might be expected in a server: a peak in memory usage followed by lower memory usage, demonstrating that the pages allocated during the peak are released later, as memory churn causes memory usage to migrate toward lower addresses.
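A peak-then-churn workload of this kind can be modelled in a few lines.  This is a hypothetical toy model, not memcrunch.py itself and not obmalloc's real data structures: 32 pools of 8 blocks each, with the pool index standing in for its address.  The "lowest" policy always allocates from the lowest-address pool with room; a "random" policy, shown for contrast, picks any pool with room.

```python
import random

# Toy model of a peak-then-churn workload (NOT memcrunch.py, and not
# obmalloc's real structures). Addresses are just pool indices.

def churn(policy, n_pools=32, cap=8, rounds=50, seed=1):
    rng = random.Random(seed)
    used = [0] * n_pools        # live blocks per pool
    live = []                   # pool index of every live block

    def alloc():
        avail = [i for i in range(n_pools) if used[i] < cap]
        i = min(avail) if policy == "lowest" else rng.choice(avail)
        used[i] += 1
        live.append(i)

    def free_one():
        used[live.pop(rng.randrange(len(live)))] -= 1

    for _ in range(64):         # memory peak: 64 live blocks
        alloc()
    for _ in range(rounds):     # churn at a lower steady state
        for _ in range(32):
            free_one()
        for _ in range(32):
            alloc()
    for _ in range(32):         # settle at 32 live blocks
        free_one()
    return sum(1 for u in used if u)   # pools still pinned by a live block

low = churn("lowest")
rnd = churn("random")
# Under the lowest-address policy the 32 surviving blocks fit in the lowest
# 8 pools at most (usually fewer), so the high-address pools are empty and
# releasable; random placement leaves many more pools pinned.
```

The point of the model is the invariant, not the numbers: with lowest-address preference, no block is ever placed above the pools needed to hold the peak, so once the peak subsides the high pools drain and stay empty.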

However, following Antoine's advice I ran the benchmarks test suite and found an adverse effect in the n-body benchmark.  That can have two causes:
a) the insertion cost when a pool moves from 'full' to 'used'.  This is a rare event and should be unimportant.  I will instrument this for this test and see if it is really the reason.
b) cache effects, because a newly 'used' pool is not immediately allocated from.  Again, it rarely happens that a pool is linked at the head, so this shouldn't be significant.

Because of this, the change isn't yet ready to be applied.
If, however, I manage to adjust the policy so that memory usage improves while still preserving the performance of the benchmark tests, I will report back :)
History
Date User Action Args
2022-04-11 14:58:01  admin  set  github: 65419
2014-04-28 09:42:20  kristjan.jonsson  set  messages: + msg217353
2014-04-27 00:22:27  vstinner  set  messages: + msg217230
2014-04-16 18:21:49  kristjan.jonsson  set  messages: + msg216544
2014-04-16 18:06:08  pitrou  set  messages: + msg216533
2014-04-16 18:02:41  kristjan.jonsson  set  messages: + msg216530
2014-04-16 18:01:58  kristjan.jonsson  set  messages: + msg216529
2014-04-16 07:03:37  neologix  set  messages: + msg216443
2014-04-16 06:44:08  vstinner  set  messages: + msg216441
2014-04-16 06:40:18  vstinner  set  nosy: + vstinner; messages: + msg216440
2014-04-15 16:59:54  pitrou  set  nosy: + tim.peters, neologix, pitrou; messages: + msg216338
2014-04-15 15:46:02  kristjan.jonsson  set  files: + obmalloc.patch; messages: + msg216316
2014-04-14 21:17:33  kristjan.jonsson  set  nosy: + larry
2014-04-14 18:23:06  kristjan.jonsson  create