classification
Title: Garbage Collection makes some object live for very long
Type: resource usage
Stage:
Components: Interpreter Core
Versions:
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: mistasse, pablogsal
Priority: normal
Keywords:

Created on 2019-12-16 10:01 by mistasse, last changed 2019-12-16 19:06 by pablogsal.

Files
File name Uploaded Description
late_gc.py mistasse, 2019-12-16 10:01 Example program where the unnecessary growing of memory consumption is observable
Messages (4)
msg358469 - Author: Maxime Istasse (mistasse) Date: 2019-12-16 10:01
When working on a self-referencing object in the young generation and the middle-generation collection kicks in, that object is directly moved to the old generation (if I understood this well: https://github.com/python/cpython/blob/d68b592dd67cb87c4fa862a8d3b3fd0a7d05e113/Modules/gcmodule.c#L1192).
Then, it won't be freed until the old generation is collected, which happens much later (because of this: https://github.com/python/cpython/blob/d68b592dd67cb87c4fa862a8d3b3fd0a7d05e113/Modules/gcmodule.c#L1388).

It happens to cause huge memory leaks if the self-referencing objects occupy a lot of RAM, which should be expected.

This is of course the kind of problem that I would expect from garbage collection with badly chosen parameters.

However, I also expected that tuning threshold0 would be sufficient to solve it. It is not: because the object is moved to the old generation every time the middle collection kicks in, the problem still happens once in a while, and memory consumption ends up very high.

I think the best and simplest solution would be to move the objects one generation at a time. This would keep heavy but short-lived objects from making it to the old generation.
msg358486 - Author: Maxime Istasse (mistasse) Date: 2019-12-16 13:10
TL;DR: a short-lived object can go directly from the young generation to the old generation if a middle generation collection kicks in while it is not freeable yet. The old generation is very rarely collected. Several of those objects, if they are part of cyclic references, can therefore pile up there and use a lot of RAM if big objects are attached to them. (If there are no cyclic refs, the refcount goes to 0 and everything is OK.)

This seems to happen in 3.8 as well, and most likely in older versions too. To me, those conditions aren't exceptional enough to be ignored.
I'm beginning to work on a fix, no guarantee yet though...
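
To illustrate the remark above about cyclic references versus plain refcounting, here is a minimal sketch. The `Node` class and the `is_freed_without_gc` helper are hypothetical names made up for this illustration, and the 1 KiB `bytearray` only stands in for a large attached object. It relies on CPython's reference counting behaviour.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can optionally hold a reference to itself."""

    def __init__(self, self_referencing):
        self.payload = bytearray(1024)  # stand-in for a big attached object
        if self_referencing:
            self.me = self  # creates a reference cycle


def is_freed_without_gc(self_referencing):
    """Return True if dropping the last name frees the object right away."""
    obj = Node(self_referencing)
    probe = weakref.ref(obj)  # lets us observe whether the object still exists
    del obj
    return probe() is None


gc.disable()  # keep the cyclic collector out of the picture for the demo
try:
    plain_freed = is_freed_without_gc(False)   # refcount drops to 0
    cyclic_freed = is_freed_without_gc(True)   # the cycle keeps it alive
    unreachable = gc.collect()                 # a full collection reclaims it
finally:
    gc.enable()
```

Only the self-referencing object has to wait for the cyclic collector, which is why its generation matters.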
msg358488 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-12-16 13:30
> It happens to cause huge memory leaks if the self-referencing objects occupy a lot of RAM, which should be expected.

The fact that the GC takes longer to reclaim these objects does not qualify this as a memory leak. A memory leak is by definition memory that cannot be reclaimed, and in this case the memory will be reclaimed once the old generation is collected, so it is not a "leak" per se.

> I think the best and simplest solution would be to move the objects one generation at a time. This would keep heavy but short-lived objects from making it to the old generation.

This also has other downsides: objects that won't be collected will go through more traversals and collections, which can hurt performance, so it is not that simple.

I am currently working on an experiment to see if we can detect "nepotism" (check https://www.memorymanagement.org/glossary/n.html for a definition) and this will likely help with your problem.

In the meantime, I think the most portable option is forcing collections yourself or adjusting the gc parameters.
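
A minimal sketch of those two options, using only the standard `gc` module. The `Cyclic` class is a hypothetical stand-in for a self-referencing object, and the threshold values are purely illustrative assumptions, not a recommendation.

```python
import gc
import weakref


class Cyclic:
    """Hypothetical self-referencing object."""

    def __init__(self):
        self.me = self


# Option 1: force a full collection at a point where the garbage is known
# to be unreachable.
obj = Cyclic()
probe = weakref.ref(obj)
del obj
gc.collect()  # with no argument, every generation is collected
collected = probe() is None

# Option 2: adjust the thresholds so that older generations are examined
# more often (illustrative values only).
old_thresholds = gc.get_threshold()
gc.set_threshold(old_thresholds[0], 5, 5)
new_thresholds = gc.get_threshold()
gc.set_threshold(*old_thresholds)  # restore the previous settings
```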
msg358491 - Author: Maxime Istasse (mistasse) Date: 2019-12-16 14:31
> The fact that the GC takes longer to reclaim these objects does not qualify this as a memory leak. A memory leak is by definition memory that cannot be reclaimed, and in this case the memory will be reclaimed once the old generation is collected, so it is not a "leak" per se.

I completely agree, I did not know what to call that. I really believed I was writing GC-friendly code, but my assumptions were very wrong. The result of my investigation seems so counter-intuitive that I tend to believe this is a bug introduced by a quick implementation shortcut and an optimization.

I wouldn't have reported it if I had been able to mitigate it by setting the gc parameters. My first idea was to set threshold0 such that I'm certain I don't keep references for that long. But the object I'm using at the moment of a second generation collection will go to the old generation; it is the only one, but it will for sure. As those are the only "new" objects that go to the old generation, it really takes a long time for the old generation to grow enough to get collected.

> This also has other downsides: objects that won't be collected will go through more traversals and collections, which can hurt performance, so it is not that simple.

I don't think the extra traversals and collections will be that impactful. I think merging the generations in the "collect" function was mostly a convenience, and not merging them will just be a bit more tedious.

If there is a second gen collection every 10 young gen collections, then moving objects one generation at a time just adds a few more objects to the next second gen collection, rather than putting them in the old generation directly.

To me, it is unintended that all the objects that are reachable during a second gen collection are put in the old generation. There is a high probability we have some short-lived objects there. It wouldn't be as problematic to traverse them once more in the next second gen collection. I tend to believe the purpose of the old gen is not to be reachable in fewer than 2 passes.
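
The difference between the two promotion policies can be sketched with a toy model. This is not CPython's actual code: the `collect` function, its parameters and the object names below are all hypothetical, and generations are just lists of names.

```python
def collect(generations, gen, is_alive, one_step=False):
    """Toy model of a generational collection (NOT CPython's real code).

    generations: list of three lists of object names, youngest first.
    gen: index of the generation being collected.
    is_alive: predicate telling whether an object is still reachable.
    one_step: if False, every survivor of generations 0..gen lands in
              gen + 1 (the merge-then-promote behaviour discussed here);
              if True, each survivor moves up exactly one generation.
    """
    oldest = len(generations) - 1
    # Survivors of every generation up to and including the collected one.
    survivors = [
        [name for name in generations[i] if is_alive(name)]
        for i in range(gen + 1)
    ]
    for i in range(gen + 1):
        generations[i] = []
    for i, alive in enumerate(survivors):
        target = i + 1 if one_step else gen + 1
        generations[min(target, oldest)].extend(alive)
    return generations


def everything_alive(name):
    return True


# "tmp" is a short-lived object that happens to be in use when a
# middle-generation collection (gen 1) runs.
merged = collect([["tmp"], ["cache"], ["config"]], 1, everything_alive)
stepwise = collect([["tmp"], ["cache"], ["config"]], 1, everything_alive,
                   one_step=True)
```

In this model the merged policy puts `tmp` straight into the oldest list, while the one-step policy leaves it in the middle list, where the next middle-generation collection would look at it again.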

Fortunately, for my personal, non-artificial case, there are other assumptions I can make, so I used weakrefs. I have one parent with children pointing to it; they just point with weakrefs now, and I know I either always keep a reference to the parent or am OK with letting them all go once refcount(parent) == 0. I'm very grateful I don't have to call gc.collect, which indeed was the next option.
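
A minimal sketch of that weakref layout, assuming hypothetical `Parent` and `Child` classes (the real code is not shown in this thread). The back-reference is weak, so no cycle is created and plain refcounting frees the parent.

```python
import gc
import weakref


class Child:
    """Hypothetical child that points back to its parent weakly."""

    def __init__(self, parent):
        self._parent = weakref.ref(parent)  # weak back-reference: no cycle

    @property
    def parent(self):
        return self._parent()  # None once the parent has been freed


class Parent:
    """Hypothetical parent that owns its children strongly."""

    def __init__(self):
        self.children = []
        self.payload = bytearray(1024)  # stand-in for a big attached object

    def add_child(self):
        child = Child(self)
        self.children.append(child)
        return child


gc.disable()  # show that the cyclic collector is not needed here
try:
    parent = Parent()
    child = parent.add_child()
    parent_seen = child.parent is parent
    del parent  # last strong reference: freed immediately by refcounting
    parent_gone = child.parent is None
finally:
    gc.enable()
```

The trade-off is that a child must cope with `parent` being `None` once nothing else holds the parent.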

Thank you for taking the problem seriously, and for the time you may dedicate to it. No need to be quick, I just wanted to raise that question. I am of course interested in the bad consequences it could bring (at the core of such a broadly used language, I would expect there are some), but at the same time, it is such a rare event and very localized and counter-intuitive in the implementation that it would surprise me.
History
Date User Action Args
2019-12-16 19:06:33 pablogsal set title: Garbage Collection optimizations cause "memory leak" -> Garbage Collection makes some object live for very long
2019-12-16 14:31:32 mistasse set messages: + msg358491
2019-12-16 13:30:56 pablogsal set messages: + msg358488
2019-12-16 13:14:55 xtreak set nosy: + pablogsal
2019-12-16 13:10:29 mistasse set messages: + msg358486; versions: - Python 3.7
2019-12-16 10:01:34 mistasse create