Title: Garbage Collector Ignoring Some (Not All) Circular References of Identical Type
Type: resource usage
Stage: resolved
Components:
Versions: Python 3.7
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: ian_osh, tim.peters
Priority: normal
Keywords:

Created on 2020-07-24 19:56 by ian_osh, last changed 2020-07-27 23:59 by tim.peters. This issue is now closed.

File: (attachment) uploaded by ian_osh, 2020-07-24 19:56. Description: Sample script that causes the issue.
Messages (8)
msg374210 - (view) Author: Ian O'Shaughnessy (ian_osh) Date: 2020-07-24 19:56
Using a script with two classes, A and B, whose instances hold references to each other (a circular reference), it is possible to cause a memory leak that is not captured by default gc collection. Only by running gc.collect() manually do the circular references get collected.

Attached is a sample script that replicates the issue.

Output starts:

Ram used: 152.17 MB - A: Active(125) / Total(2485) - B: Active(124) / Total(2484)
Ram used: 148.17 MB - A: Active(121) / Total(12375) - B: Active(120) / Total(12374)
Ram used: 65.88 MB - A: Active(23) / Total(22190) - B: Active(22) / Total(22189)
Ram used: 77.92 MB - A: Active(35) / Total(31935) - B: Active(34) / Total(31934)

After 1,000,000 cycles 1GB of ram is being consumed:

Ram used: 1049.68 MB - A: Active(1019) / Total(975133) - B: Active(1018) / Total(975132)
Ram used: 1037.64 MB - A: Active(1007) / Total(984859) - B: Active(1006) / Total(984858)
Ram used: 952.34 MB - A: Active(922) / Total(994727) - B: Active(921) / Total(994726)
Ram used: 970.41 MB - A: Active(940) / Total(1000000) - B: Active(940) / Total(1000000)
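The attached script isn't reproduced in this report, but its shape can be sketched from the description: two classes whose instances refer to each other, a large string payload per A, and __del__ methods maintaining the Active/Total counters. The following is a hypothetical reconstruction, not the attached file; the class layout and attribute names are assumptions, the payload is shrunk to ~100 KB (the report used ~1 MB), and the loop is shortened to 2,000 iterations.

```python
import gc

class A:
    active = 0
    total = 0

    def __init__(self):
        A.active += 1
        A.total += 1
        self.payload = "x" * (100 * 1024)  # ~100 KB (the report used ~1 MB)
        self.b = B(self)                   # B keeps a reference back to us

    def __del__(self):
        A.active -= 1

class B:
    active = 0
    total = 0

    def __init__(self, a):
        B.active += 1
        B.total += 1
        self.a = a                         # completes the A <-> B cycle

    def __del__(self):
        B.active -= 1

for _ in range(2000):
    a = A()  # each iteration orphans the previous A/B cycle

# Cycles are only reclaimed when the cyclic collector runs, so some
# "Active" pairs usually remain pending at this point.
print(A.active, A.total, B.active, B.total)

gc.collect()
# Only the last A (still bound to `a`) and its B survive a full pass.
print(A.active, B.active)
```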
msg374213 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-07-24 20:48
I see no evidence of a bug here. To the contrary, the output proves that __del__ methods are getting called all along. And if garbage weren't being collected, after allocating a million objects each with its own megabyte string object, memory use at the end would be a terabyte, not a comparatively measly ;-) gigabyte.

Note that Python's cyclic gc is NOT asynchronous. It only runs when you call it directly, or when an internal count of allocations exceeds an internal count of deallocations. When your loop ends, your output shows that 940 A and B objects remain to be collected, spread across some number of the gc's "generations". That's where your gigabyte lives (about a thousand A objects each with its own megabyte of string data). It will remain in use until gc is forced to run again. But 99.9% of the A objects have already been collected.
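The trigger Tim describes is observable through the gc module's own counters. A small sketch (the threshold values shown are the usual CPython defaults and may differ on a given build):

```python
import gc

# Collection is driven by allocation/deallocation counts, not a timer.
print(gc.get_threshold())  # typically (700, 10, 10)
print(gc.get_count())      # pending-allocation counts per generation

# Make plenty of cyclic garbage; a generation-0 pass fires automatically
# once net allocations exceed the first threshold.
for _ in range(5000):
    x = []
    x.append(x)  # self-referencing cycle, orphaned on the next line
    del x

# Whatever is still pending can be forced out synchronously:
unreachable = gc.collect()
print(unreachable)
```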
msg374214 - (view) Author: Ian O'Shaughnessy (ian_osh) Date: 2020-07-24 20:52
For a long running process (greatly exceeding a million iterations) the uncollected garbage will become too large for the system (many gigabytes). A manual execution of the gc would be required.

That seems flawed given that Python is a garbage collected language, no?
msg374215 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-07-24 21:13
What makes you think that? Your own output shows that the number of "Active" objects does NOT monotonically increase across output lines. It goes up sometimes, and down sometimes.  Whether it goes up or down is entirely due to accidents of when your monitoring thread happens to wake up during the lifetime of the program's gc history.

I boosted the loop count to 10 million on my box just now. It had no significant effect on peak memory use. At the end:

(298, 10000000, 298, 10000000)
>>> gc.collect()
(1, 10000000, 1, 10000000)

There is no leak. An A and B object survive collect() because the last A object created remains bound to the variable `a` used in the loop (so is still reachable).

So I thank you for creating a nice test program, but I'm closing this since it doesn't demonstrate a real problem.
msg374422 - (view) Author: Ian O'Shaughnessy (ian_osh) Date: 2020-07-27 20:38
"Leak" was likely the wrong word.

It does appear problematic though.

The loop uses a fixed number of variables (yes, there are repeated dynamic allocations, but they fall out of scope with each iteration), and only one of those variables occupies 1 MB of RAM (aside from the static variable).

The problem: There's only really one variable occupying 1MB of in-scope memory, yet the app's memory usage can/will exceed 1GB after extended use.

At the very least, this is confusing, especially given the lack of user control to prevent it from happening once it's discovered as a problem.
msg374424 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-07-27 20:58
It's impossible for any implementation to know that cyclic trash _is_ trash without, in some way, traversing the object graph. This is expensive, so CPython (like any other implementation) does not incur that expense after every single decref that leaves a non-zero refcount (the one and only cheap clue that cyclic trash _may_ have just been created).

If you want/need synchronous behavior, avoid cycles. CPython's refcounting does dispose of trash the instant an object (not involved in a cycle) becomes trash. That behavior cannot be extended to cyclic trash short of (as above) running a cyclic gc pass extremely frequently.
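One way to follow the "avoid cycles" advice is to make the back-reference weak, so plain refcounting frees both objects the instant they become unreachable. A sketch under two assumptions: the Node class and its attributes are illustrative, and the immediate deallocation relies on CPython's refcounting.

```python
import weakref

class Node:
    def __init__(self, parent=None):
        # A weak reference back to the parent avoids creating a
        # reference cycle, so refcounting alone frees both objects.
        self._parent = weakref.ref(parent) if parent is not None else None
        self.payload = "x" * (1024 * 1024)  # ~1 MB, as in the report

    @property
    def parent(self):
        return self._parent() if self._parent is not None else None

root = Node()
child = Node(parent=root)
assert child.parent is root

del root                      # refcount hits zero; freed with no gc pass
assert child.parent is None   # the weak reference now returns None
```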

I don't know of any language that guarantees all garbage will be collected "right away". Do you?  CPython does much more in that respect (due to primarily relying on refcounting) than most.
msg374427 - (view) Author: Ian O'Shaughnessy (ian_osh) Date: 2020-07-27 22:19
>I don't know of any language that guarantees all garbage will be collected "right away". Do you?

I'm not an expert in this domain, so, no. I am however attempting to find a way to mitigate this issue. Do you have any suggestions how I can avoid these memory spikes? Weak references? Calling gc.collect() on regular intervals doesn't seem to work consistently.
msg374445 - (view) Author: Tim Peters (tim.peters) * (Python committer) Date: 2020-07-27 23:59
Well, this isn't a help desk ;-) You may want instead to detail your problem on, say, StackOverflow, or the general Python mailing list.

Please note that I don't know what your "problem" _is_:  you haven't said. You posted some numbers that didn't make sense to you, and made unwarranted extrapolations from those (for example, no, those numbers won't get worse if you let the program run a million times longer).

So you should spell out what the "real" problem is. This shows signs of being an "XY problem":
For example, now you say:

> Calling gc.collect() on regular intervals doesn't seem
> to work consistently

That's news to me. The code you posted shows quite different behavior when FORCE_GC is set.

But if it's true that calling gc.collect() regularly doesn't alleviate "the real problem" (whatever that may be!), then that shows the opposite of what you appear to be assuming: that Python's cyclic gc is the root of the cause. collect() _will_ reclaim every scrap of RAM that's actually trash at the time it's called. So if calling that doesn't help, the problem is almost certainly NOT that trash isn't getting reclaimed. Something else is the cause.
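That claim is easy to check directly: an explicit gc.collect() pass finds cyclic trash even with automatic collection switched off. A small sketch:

```python
import gc

class C:
    pass

gc.disable()                 # keep an automatic pass from running first
a, b = C(), C()
a.partner, b.partner = b, a  # a two-object reference cycle
del a, b                     # unreachable now, but refcounts are non-zero

found = gc.collect()         # an explicit full pass finds and frees it
gc.enable()
print(found)                 # counts the cycle's objects, at minimum
```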

Examples: it's not actually trash. Or it is trash, and gc collects it, but the allocator is unable to return the memory to the C library it came from. Or it is returned to the C library, but that in turn is unable to return the memory to the OS. Or it is returned to the OS, but the OS decides to leave its virtual address space mapped to the process for now.

Details not only matter, they _can_ be everything when dealing with the multiple layers of memory management on modern machines. Waiting for a "general insight" is probably futile here :-(
Date User Action Args
2020-07-27 23:59:15  tim.peters  set     messages: + msg374445
2020-07-27 22:19:59  ian_osh     set     messages: + msg374427
2020-07-27 20:58:43  tim.peters  set     messages: + msg374424
2020-07-27 20:38:05  ian_osh     set     messages: + msg374422
2020-07-24 21:13:11  tim.peters  set     status: open -> closed
                                         resolution: not a bug
                                         messages: + msg374215
                                         stage: resolved
2020-07-24 20:52:20  ian_osh     set     messages: + msg374214
2020-07-24 20:48:32  tim.peters  set     nosy: + tim.peters
                                         messages: + msg374213
2020-07-24 19:56:23  ian_osh     create