classification
Title: Use "Low-fragmentation Heap" memory allocator on Windows
Type: performance
Stage:
Components: Windows
Versions: Python 3.6

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: haypo, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2016-01-31 17:55 by haypo, last changed 2017-07-03 19:27 by steve.dower.

Messages (7)
msg259293 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-31 17:55
Python has a memory allocator optimized for allocations <= 512 bytes: PyObject_Malloc(). There has been discussion of replacing it with the native "Low-fragmentation Heap" memory allocator on Windows.

I'm not aware of anyone who has tried that. It would be nice to try, especially to run benchmarks.

See also the issue #26249: "Change PyMem_Malloc to use PyObject_Malloc allocator?".
msg259294 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-31 17:56
"Low-fragmentation Heap":
https://msdn.microsoft.com/en-us/library/windows/desktop/aa366750%28v=vs.85%29.aspx
msg259296 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2016-01-31 17:57
The issue #19246 "high fragmentation of the memory heap on Windows" was rejected but discussed the Windows Low-fragmentation Heap.
msg297106 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-06-28 01:12
Is anyone interested in experimenting with writing such a change and running benchmarks with it?
msg297209 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-06-28 19:11
We tried it at one point, but it made very little difference because we don't use the Windows heap for most allocations. IIRC, replacing Python's optimised allocator with the LFH was a slight performance regression, but I'm not sure the benchmarks were reliable enough back then to be trusted. I'm also not sure what optimisations have been performed in Windows 8/10.

Since the LFH is the default though, it really should just be a case of replacing Py_Malloc with a simple HeapAlloc shim and testing it. The APIs are nearly the same (the result of GetProcessHeap() will be stable for the lifetime of the process, and there's little value in creating specific heaps unless you intend to destroy them rather than free each allocation individually).
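A minimal sketch of what such a shim could look like (hypothetical; the helper names heap_malloc etc. are illustrative, though PyMem_SetAllocator() and PyMemAllocatorEx are the real CPython hooks for swapping allocators, available since 3.4). Windows-only:

```c
/* Sketch of a HeapAlloc-based shim for Python's small-object allocator.
 * Windows-only. The helper names are illustrative; the commented-out
 * wiring at the bottom shows how it would plug into CPython. */
#ifdef _WIN32
#include <windows.h>
#include <stddef.h>
#include <stdint.h>

/* GetProcessHeap() is stable for the lifetime of the process, so the
 * handle can simply be fetched on each call. */

static void *heap_malloc(void *ctx, size_t size) {
    /* malloc(0) must return a unique pointer, so allocate at least 1 byte. */
    return HeapAlloc(GetProcessHeap(), 0, size ? size : 1);
}

static void *heap_calloc(void *ctx, size_t nelem, size_t elsize) {
    /* Guard against nelem * elsize overflow, then ask for zeroed memory. */
    if (elsize != 0 && nelem > SIZE_MAX / elsize)
        return NULL;
    size_t size = nelem * elsize;
    return HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size ? size : 1);
}

static void *heap_realloc(void *ctx, void *ptr, size_t size) {
    if (ptr == NULL)
        return heap_malloc(ctx, size);
    return HeapReAlloc(GetProcessHeap(), 0, ptr, size ? size : 1);
}

static void heap_free(void *ctx, void *ptr) {
    if (ptr != NULL)
        HeapFree(GetProcessHeap(), 0, ptr);
}

/* Hypothetical wiring into CPython (requires Python.h):
 *
 *   PyMemAllocatorEx alloc = {NULL, heap_malloc, heap_calloc,
 *                             heap_realloc, heap_free};
 *   PyMem_SetAllocator(PYMEM_DOMAIN_OBJ, &alloc);
 */
#endif /* _WIN32 */
```

Since the LFH is the default heap policy on modern Windows, no HeapSetInformation() call should be needed; the process heap already uses it for eligible allocation sizes.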
msg297594 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-07-03 14:24
Steve: "We tried it at one point, but it made very little difference (...)"

Ok. Can I close the issue?
msg297610 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2017-07-03 19:27
I wouldn't be opposed to seeing it tried again, but I have no strong opinion. I don't think this is a major performance bottleneck right now.
History
Date                 User         Action  Args
2017-07-03 19:27:50  steve.dower  set     messages: + msg297610
2017-07-03 14:24:01  haypo        set     messages: + msg297594
2017-06-28 19:11:58  steve.dower  set     messages: + msg297209
2017-06-28 01:12:04  haypo        set     messages: + msg297106
2016-01-31 17:57:54  haypo        set     messages: + msg259296
2016-01-31 17:56:10  haypo        set     messages: + msg259294
2016-01-31 17:55:40  haypo        create