classification
Title: PyXXX_ClearFreeList for dict, set, and list
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: christian.heimes, gregory.p.smith, gvanrossum, haypo, loewis, matthiastroffaes, pitrou, python-dev, r.david.murray, rhettinger
Priority: normal Keywords: patch

Created on 2009-08-13 16:45 by matthiastroffaes, last changed 2011-12-16 10:26 by pitrou. This issue is now closed.

Files
File name Uploaded Description Edit
py3k-clearfreelist-dict_set_list.patch matthiastroffaes, 2009-08-13 16:45 clearfreelist for dict, set, and list (against py3k, revision 74419) review
py3k-clearfreelist-gc_collect.patch matthiastroffaes, 2009-08-13 16:49 also call the new PyXXX_ClearFreeList functions during gc.collect() review
py3k-freelist_test.py matthiastroffaes, 2009-08-13 16:57 script to test memory usage after gc.collect()
py3k-tp_free_list-dict.patch matthiastroffaes, 2009-08-14 16:23 tp_free_list for dict implementation, including call from gc.collect()
py3k-clearfreelist-time_gc_collect.patch matthiastroffaes, 2009-08-17 11:43 patch to time effect of freeing freelists during garbage collection
py3k-rev81387-clearfreelist-dict_set_list.patch matthiastroffaes, 2010-05-21 12:11 clearfreelist for dict, set, and list (against py3k, revision 81387)
py3k-rev81387-clearfreelist-gc_collect.patch matthiastroffaes, 2010-05-21 12:12 also call the new PyXXX_ClearFreeList functions during gc.collect() (against py3k, revision 81387)
py3k-rev81387-clearfreelist-time_gc_collect.patch matthiastroffaes, 2010-05-21 12:12 patch to time effect of freeing freelists during garbage collection (against py3k, revision 81387)
py3k-04082011-clearfreelist-dict_set_list.patch matthiastroffaes, 2011-08-04 13:44
Messages (23)
msg91520 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2009-08-13 16:45
The Python C API provides PyXXX_ClearFreeList functions to allow the
float, int, etc... freelists to be freed, potentially releasing memory
to the OS earlier. Currently, there is no such API for the dict, set,
and list freelists.

The attached patch adds PyXXX_ClearFreeList functions to the C API, so
the dict, set, and list freelists can be freed as well.
msg91521 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2009-08-13 16:49
I attach a second patch which also calls the new PyXXX_ClearFreeList
functions on garbage collection, during gc.collect().
msg91523 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2009-08-13 16:57
I'm also attaching a test script to check the effect of the two patches
on gc.collect(). If many objects are allocated, space savings appear to
be relevant.



Before applying the patch (debug build on linux 64 bit):

Memory used (begin): 121Mb

memtest 2000000 int
====================
Memory used (peak): 297Mb
Memory used (end): 122Mb
Unfreed memory: 1Mb

memtest 2000000 str
====================
Memory used (peak): 455Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 float
====================
Memory used (peak): 236Mb
Memory used (end): 127Mb
Unfreed memory: 6Mb

memtest 2000000 int
====================
Memory used (peak): 313Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test
====================
Memory used (peak): 372Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test2
====================
Memory used (peak): 361Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _tuple
====================
Memory used (peak): 529Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _set
====================
Memory used (peak): 846Mb
Memory used (end): 765Mb
Unfreed memory: 644Mb

memtest 2000000 _dict
====================
Memory used (peak): 1241Mb
Memory used (end): 1241Mb
Unfreed memory: 1120Mb

memtest 2000000 Test3
====================
Memory used (peak): 1241Mb
Memory used (end): 765Mb
Unfreed memory: 644Mb
[40720 refs]



After applying the patch (same build system):

Memory used (begin): 121Mb

memtest 2000000 int
====================
Memory used (peak): 298Mb
Memory used (end): 121Mb
Unfreed memory: 0Mb

memtest 2000000 str
====================
Memory used (peak): 455Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 float
====================
Memory used (peak): 236Mb
Memory used (end): 127Mb
Unfreed memory: 6Mb

memtest 2000000 int
====================
Memory used (peak): 312Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test
====================
Memory used (peak): 374Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test2
====================
Memory used (peak): 361Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _tuple
====================
Memory used (peak): 528Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _set
====================
Memory used (peak): 846Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _dict
====================
Memory used (peak): 1240Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test3
====================
Memory used (peak): 999Mb
Memory used (end): 123Mb
Unfreed memory: 2Mb
[40740 refs]
msg91526 - (view) Author: Skip Montanaro (skip.montanaro) * Date: 2009-08-13 18:00
Instead of expanding the C API for each type which supports a free
list perhaps there should be a single call, say, PyObject_ClearFreeList,
which takes a pointer to the appropriate type object as an argument.
PyTypeObject can then grow a tp_free_list slot through which the
function calls a type-specific free list clearing function.

Skip
msg91558 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2009-08-14 16:23
Thanks for the feedback!

Attaching a new patch which implements tp_free_list slot as suggested -
I hope I did it correctly. I've only implemented the new slot for dict
so far, but I'm happy to tp_free_list-ify the other freelist types as
well, in a future patch, if this gets the green light.

Description of the patch:

* added tp_free_list slot to PyTypeObject (definitely for review: is the
location of tp_free_list right after tp_free sensible?)
* added PyType_ClearFreeList(PyTypeObject *) to C API, which calls the
tp_free_list function if not NULL
* inserted the new slot where necessary (e.g. in PyGen_Type) to sync
type definitions with the updated PyTypeObject
* created dict_free_list function and added it to PyDict_Type
* call PyType_ClearFreeList(&PyDict_Type) from gc.collect()
msg91561 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-08-14 17:04
Guido, do you want a slot assigned for this?
msg91563 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2009-08-14 17:19
A slot in every type object for this purpose seems wasteful; the large
majority of types won't have a free list.  (Remember, each user-defined
class allocates a full type structure on the heap.)
msg91565 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-08-14 18:19
Marking the PyXXX_ClearFreeList version as approved.
msg91593 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-08-15 05:37
Does anyone here know why GC calls the free_xxx functions?  ISTM, they
cannot be involved in cycles.  Free lists are kept by container objects
to speed-up allocation.  Having GC call the free_xxx just slows down the
GC process and all the subsequent set/list/tuple allocations until the
free lists are built-up again.  IMO, the free_xxx functions should only
be called during atexit or by an explicit call from the user perhaps,
sys.clear_freelists() or somesuch.
msg91613 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-15 17:35
The reason is that users expect gc.collect() to make its best to
diminish memory use. Clearing free lists can allow deallocting some
arenas which otherwise would still contain some used memory blocks. As
the comment says:

/* Clear all free lists
 * All free lists are cleared during the collection of the highest
generation.
 * Allocated items in the free list may keep a pymalloc arena occupied.
 * Clearing the free lists may give back memory to the OS earlier.
 */

Full collections (collections of the oldest generation) are rather rare,
so the performance impact is probably minimal, and it helps reduce
memory fragmentation from time to time (which can produce significant
effect as shown in Matthias' example).
msg91618 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2009-08-15 22:06
> The reason is that users expect gc.collect() to make 
> its best to diminish memory use.

I thought GC was expected to eliminate reference cycles.
Perhaps there ought to be a separate API, such as 
sys.clear_freelists(), to eliminate other memory use when 
needed.  Putting this in GC seems like feature creep and
has negative performance implications (long running programs
will likely find an immediate need to reallocate the freed
members).


> Allocated items in the free list may keep a pymalloc arena occupied.
> * Clearing the free lists may give back memory to the OS earlier.

These are both good reasons to expose the functionality somewhere.
msg91623 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-15 22:50
Le samedi 15 août 2009 à 22:06 +0000, Raymond Hettinger a écrit :
> Raymond Hettinger <rhettinger@users.sourceforge.net> added the comment:
> 
> > The reason is that users expect gc.collect() to make 
> > its best to diminish memory use.
> 
> I thought GC was expected to eliminate reference cycles.

Of course, but it's also the de facto API when wanting to reclaim
memory. The face that a single function call is sufficient is a good
thing in itself.

> Perhaps there ought to be a separate API, such as 
> sys.clear_freelists(), to eliminate other memory use when 
> needed.  Putting this in GC seems like feature creep and
> has negative performance implications (long running programs
> will likely find an immediate need to reallocate the freed
> members).

Performance claims should be substanstiated with actual numbers,
otherwise it's too easy to clutter the API with gratuitous
complications. The impact of reallocating may be negligible, or it might
even be positive if it improves cache locality.
msg91630 - (view) Author: Skip Montanaro (skip.montanaro) * Date: 2009-08-16 14:28
>> I thought GC was expected to eliminate reference cycles.

    Antoine> Of course, but it's also the de facto API when wanting to
    Antoine> reclaim memory.

When did that happen?  I agree with Raymond.  The cyclic gc should just
reclaim cycles.

Skip
msg91631 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-16 14:43
> When did that happen?  I agree with Raymond.  The cyclic gc should just
> reclaim cycles.

People don't care about referential cycles, they care about freeing some
memory (if memory was available in infinite quantities nobody would care
about breaking cycles). That's how the API is used most of the time,
IMO. And that's why measurements of the usefulness of calling
gc.collect() are usually done in megabytes, not in number of
references :-).

So, while I agree that it sounds bizarre for the GC to do other
memory-related tasks, it's also quite practical. Besides, the GC already
has a heuristic for *when* to cleanup memory, and it makes sense to
reuse this heuristic for other memory cleanup tasks, rather than to
invent another heuristic or put the burden of the task on the user (who
usually won't even know what those freelists are).
msg91633 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-08-16 15:13
The change was originally made in r60797, « Implemented Martin's
suggestion to clear the free lists during the garbage collection of the
highest generation ».
msg91634 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-08-16 15:18
FWIW, I agree with Antoine here.  I think user expectation is that when
"garbage" is collected, at least some freed memory will be returned to
the operating system.  The normal user's conception of what "garbage" is
has nothing to do with cycles.  It just so happens that in CPython,
that's the main thing the garbage collector collects.
msg91635 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2009-08-16 17:08
I still stand by my suggestion to free memory as a side effect of
garbage collection. It may well be that an application will start
re-allocating blocks that soon end up in the free list again. OTOH, it
may also be that releasing those free lists, along with the freeing that
the GC just did, can cause arenas to become completely free, and thus be
returned to the operating system. Users really really want Python to
return memory to the operating system whenever possible, and on its own;
those free lists can block memory from being returned, more or less
unreasonably.

So unless it can be demonstrated (preferably in a realistic application)
that clearing the free lists has a measurable negative impact, I propose
to keep things the way they are.

IMO, it would be best if we could eliminate the freelists altogether.
msg91637 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2009-08-16 18:23
+1 on the PyXXX_ClearFreeList patch and calling them from gc.collect()
as is done with the others.

I agree with Guido, don't add a tp_free_list slot as the common case
would be NULL.

Regarding gc clearing freelists: I agree with Antoine and Martin. 
Clearing free lists in the highest generation of GC is a very good
thing.  Rebuilding them infrequently should not have a significant
performance impact and makes long running python jobs better behaved by
releasing more memory when possible.
msg91659 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2009-08-17 11:43
To aid the discussion, I attach another quick patch which reports the
time spent on PyXXX_ClearFreeList calls during highest generation
garbage collection (including gc.collect()). For simplicity, the timer
uses clock() so the resolution is quite limited (appears to be 10ms on
my machine) and I don't claim that this is the best way of measuring
execution speed, but at least it gives some indication.

The patch also gives an indication at how frequently the highest
generation is collected.

Below is the result of the patch on the py3k-freelist_test.py test
script, on my machine (again, debug build). For reference, I've measured
the total time spent by the script as well, with the time command.

Summarizing, (30+70+30+30+50+70+20+110)/48420.0 = 0.0085 = 0.85% of time
is spent on freeing the freelists, in my test.

Another way to look at the data is that it roughly takes 10ms for each
100MB allocated (at least for the types of data in the script). Floats
seem to be an exception and take at least twice as long (not sure why).

Keep in mind that the test merely allocates and deallocates memory,
without doing much else, so it isn't a typical Python application.

$ time ./python py3k-freelist_test.py

Memory used (begin): 121Mb

memtest 2000000 int
====================
Memory used (peak): 297Mb
cleared free lists in 0 clock ticks (0.000000ms)
Memory used (end): 121Mb
Unfreed memory: 0Mb

memtest 2000000 str
====================
Memory used (peak): 455Mb
cleared free lists in 30000 clock ticks (30.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 float
====================
Memory used (peak): 236Mb
cleared free lists in 70000 clock ticks (70.000000ms)
Memory used (end): 127Mb
Unfreed memory: 6Mb

memtest 2000000 int
====================
Memory used (peak): 312Mb
cleared free lists in 0 clock ticks (0.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test
====================
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
Memory used (peak): 372Mb
cleared free lists in 30000 clock ticks (30.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test2
====================
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
Memory used (peak): 361Mb
cleared free lists in 30000 clock ticks (30.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _tuple
====================
Memory used (peak): 529Mb
cleared free lists in 50000 clock ticks (50.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 _set
====================
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
Memory used (peak): 846Mb
cleared free lists in 70000 clock ticks (70.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 200000 _dict
====================
Memory used (peak): 233Mb
cleared free lists in 20000 clock ticks (20.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb

memtest 2000000 Test3
====================
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
cleared free lists in 0 clock ticks (0.000000ms)
Memory used (peak): 999Mb
cleared free lists in 110000 clock ticks (110.000000ms)
Memory used (end): 123Mb
Unfreed memory: 2Mb
cleared free lists in 0 clock ticks (0.000000ms)
[40763 refs]

real	0m50.921s
user	0m48.420s
sys	0m2.330s
msg106231 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2010-05-21 12:30
I uploaded updates of the three relevant patches against the current revision of the py3k branch, as the old patches no longer applied cleanly due to whitespace changes.

To summarize:

* The first patch, py3k-rev81387-clearfreelist-dict_set_list.patch, simply adds freelist methods to the public API for dict, list, and set. No opposition has been expressed against this, so I hope this can be accepted.

* The second patch, py3k-rev81387-clearfreelist-gc_collect.patch, adds calls to these methods to gc.collect() - some opposition was expressed against the (already present before my patch!!) method of freeing lists during highest generation garbage collection. I attempted to measure the actual time spent on freeing the freelists in a simply python program which does a lot of allocation (attached as py3k-freelist_test.py). This apparently shows that clearing the freelists does not affect timing much at all.

* The third patch, file py3k-rev81387-clearfreelist-time_gc_collect.patch, causes estimates of the time spent on freeing the freelists to be printed to the console, and is obviously for testing/benchmarking purpose only.

* The tp_free_list patch is no longer relevant (see comment by Guido).

Hoping for a conclusion of this issue,
Matthias
msg141627 - (view) Author: Matthias Troffaes (matthiastroffaes) Date: 2011-08-04 13:44
Patch against current tip attached.

I can no longer reproduce the large memory leaks with the current tip (which is of course wonderful!), so I guess the second part of the patch (freeing the freelists during gc.collect) makes no longer sense.
msg149609 - (view) Author: Roundup Robot (python-dev) Date: 2011-12-16 10:25
New changeset 57f0af61da53 by Antoine Pitrou in branch 'default':
Issue #6695: Full garbage collection runs now clear the freelist of set objects.
http://hg.python.org/cpython/rev/57f0af61da53
msg149610 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-12-16 10:26
The clearing the dict and list freelists has been added independently, so I committed the part of the patch pertaining to sets. Thank you!
History
Date User Action Args
2011-12-16 10:26:26pitrousetstatus: open -> closed
versions: + Python 3.3, - Python 3.2
messages: + msg149610

resolution: fixed
stage: resolved
2011-12-16 10:25:43python-devsetnosy: + python-dev
messages: + msg149609
2011-08-04 13:44:05matthiastroffaessetfiles: + py3k-04082011-clearfreelist-dict_set_list.patch

messages: + msg141627
2010-05-21 15:26:04hayposetnosy: + haypo
2010-05-21 12:30:45matthiastroffaessetmessages: + msg106231
2010-05-21 12:12:56matthiastroffaessetfiles: + py3k-rev81387-clearfreelist-time_gc_collect.patch
2010-05-21 12:12:17matthiastroffaessetfiles: + py3k-rev81387-clearfreelist-gc_collect.patch
2010-05-21 12:11:40matthiastroffaessetfiles: + py3k-rev81387-clearfreelist-dict_set_list.patch
2010-05-20 20:31:21skip.montanarosetnosy: - skip.montanaro
2009-08-17 11:43:11matthiastroffaessetfiles: + py3k-clearfreelist-time_gc_collect.patch

messages: + msg91659
2009-08-16 18:23:47gregory.p.smithsetnosy: + gregory.p.smith
messages: + msg91637
2009-08-16 17:08:03loewissetmessages: + msg91635
2009-08-16 15:57:02rhettingersetresolution: accepted -> (no value)
2009-08-16 15:18:28r.david.murraysetnosy: + r.david.murray
messages: + msg91634
2009-08-16 15:13:55pitrousetnosy: + loewis, christian.heimes
messages: + msg91633
2009-08-16 14:43:54pitrousetmessages: + msg91631
2009-08-16 14:28:04skip.montanarosetmessages: + msg91630
2009-08-15 22:50:43pitrousetmessages: + msg91623
2009-08-15 22:06:15rhettingersetmessages: + msg91618
2009-08-15 17:35:02pitrousetnosy: + pitrou
messages: + msg91613
2009-08-15 05:37:49rhettingersetmessages: + msg91593
2009-08-14 18:19:58rhettingersetresolution: accepted
messages: + msg91565
2009-08-14 17:19:33gvanrossumsetassignee: gvanrossum ->
messages: + msg91563
2009-08-14 17:04:05rhettingersetassignee: gvanrossum

messages: + msg91561
nosy: + gvanrossum, rhettinger
2009-08-14 16:23:26matthiastroffaessetfiles: + py3k-tp_free_list-dict.patch

messages: + msg91558
2009-08-13 18:00:56skip.montanarosetnosy: + skip.montanaro
messages: + msg91526
2009-08-13 16:57:44matthiastroffaessetfiles: + py3k-freelist_test.py

messages: + msg91523
2009-08-13 16:49:07matthiastroffaessetfiles: + py3k-clearfreelist-gc_collect.patch

messages: + msg91521
2009-08-13 16:45:30matthiastroffaescreate