classification
Title: Empty __slots__ can create untracked reference cycles
Type: resource usage Stage: resolved
Components: Interpreter Core Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: brandtbucher Nosy List: brandtbucher, miss-islington, nascheme, pablogsal, tim.peters
Priority: low Keywords: patch

Created on 2020-10-09 19:17 by brandtbucher, last changed 2020-10-15 15:51 by brandtbucher. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 22701 merged brandtbucher, 2020-10-14 23:36
PR 22702 merged miss-islington, 2020-10-15 01:44
PR 22703 closed miss-islington, 2020-10-15 01:44
PR 22707 merged miss-islington, 2020-10-15 03:41
Messages (16)
msg378337 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-09 19:17
Currently, we don't track instances of certain heap types based on the assumption that "no members" == "no reference cycles".

Unfortunately, it's still possible to create untracked reference cycles with one's parents. The following program leaks memory:

while True:
    class C:
        __slots__ = ()
    C.i = C()
    del C

The fix is simple: track all instances of user-defined classes, no exceptions. I'm not sure we were actually getting any real wins from the old behavior anyways.
msg378339 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-09 19:46
Thanks for noticing this, Brandt!

> The fix is simple: track all instances of user-defined classes, no exceptions

I have the fear that this may increase the time of collections substantially because even if many of them won't have references at all, they still net to be visited and they count towards the time complexity of the algorithm that detects isolated sub-cycles. I mention this because this could affect an unbounded number of instances per type. For example, some of the types that are untracked.

Maybe I am missing something but we could mark them as having GC support unconditionally but still leave them untracking and unconditionally add a tracking call on setattribute.
msg378356 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-09 23:19
> Maybe I am missing something but we could mark them as having GC support unconditionally but still leave them untracking and unconditionally add a tracking call on setattribute.

Hm, I’m not sure that would be enough. Consider the case of a class that registers its instances in a collection of some sort:

while True:
    class D:
        __slots__ = ()
        registry = []
        def __init__(self):
            self.registry.append(self)
    for i in range(100):
        D() and None  # Suppress REPL output.
    del D

This is probably more common (and problematic) than my example above.

At the same time, I agree that it’s not *ideal* to track all of these objects automatically. Anyone setting __slots__ is probably planning on creating lots of “cheap” instances. If they do accidentally create cycles, though, I feel the memory hit then would be worse than any collection overhead.

I’m just not sure I see a way to fix this without tracking them all.
msg378358 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-10 00:40
> I’m just not sure I see a way to fix this without tracking them all.

IIRC we do skip the GC flags for user-created types only when the subtype is not adding new variables *and* the base class is not a GC class by itself. This includes the case with __slots__ but is not limited to. If I recall correctly, there is a bunch of metaclasses that fall into this category and some other minor things so maybe is not that bad to unconditionally make all user objects tracked.

In any case I think is prudent to run the performance test suite with PGO/LTO + CPU isolation to get an idea. Unfortinately we already have some unwanted 3.9 performance regressions of unknown origin and I would like to not add to it if we can.
msg378637 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-14 17:17
Using the following patch:

https://github.com/python/cpython/compare/master...brandtbucher:track-all-heap-types

I got the following pyperformance results (with PGO/LTO and CPU isolation, insignificant rows omitted):

2020-10-13_20-04-master-7992579cd27f.json.gz
============================================

Performance version: 1.0.0
Report on Linux-4.4.0-190-generic-x86_64-with-glibc2.23
Number of logical CPUs: 8
Start date: 2020-10-14 07:49:46.738847
End date: 2020-10-14 08:12:58.050557

2020-10-13_20-37-track-all-heap-types-63d8d25867b5.json.gz
==========================================================

Performance version: 1.0.0
Report on Linux-4.4.0-190-generic-x86_64-with-glibc2.23
Number of logical CPUs: 8
Start date: 2020-10-14 08:29:52.009796
End date: 2020-10-14 08:52:49.230214

+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| Benchmark               | 2020-10-13_20-04-master-7992579cd27f.json.gz | 2020-10-13_20-37-track-all-heap-types-63d8d25867b5.json.gz | Change       | Significance          |
+=========================+==============================================+============================================================+==============+=======================+
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| chameleon               | 13.7 ms                                      | 13.1 ms                                                    | 1.04x faster | Significant (t=7.27)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| chaos                   | 163 ms                                       | 159 ms                                                     | 1.03x faster | Significant (t=5.39)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| crypto_pyaes            | 160 ms                                       | 156 ms                                                     | 1.02x faster | Significant (t=7.29)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| deltablue               | 10.9 ms                                      | 10.7 ms                                                    | 1.03x faster | Significant (t=4.36)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| django_template         | 73.1 ms                                      | 71.2 ms                                                    | 1.03x faster | Significant (t=4.42)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| go                      | 373 ms                                       | 381 ms                                                     | 1.02x slower | Significant (t=-4.71) |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| nbody                   | 206 ms                                       | 200 ms                                                     | 1.03x faster | Significant (t=7.52)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| regex_dna               | 295 ms                                       | 288 ms                                                     | 1.02x faster | Significant (t=5.26)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| richards                | 103 ms                                       | 101 ms                                                     | 1.03x faster | Significant (t=5.40)  |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| scimark_sparse_mat_mult | 6.96 ms                                      | 7.17 ms                                                    | 1.03x slower | Significant (t=-4.93) |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| spectral_norm           | 218 ms                                       | 207 ms                                                     | 1.05x faster | Significant (t=10.29) |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
| unpack_sequence         | 101 ns                                       | 84.1 ns                                                    | 1.20x faster | Significant (t=36.84) |
+-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+

Note that I used pyperformance==1.0.0, since 1.0.1 won't run on 3.10.

In general, it looks like the patch has no real performance impact. The result for unpack_sequence is *extremely* surprising... I'm quite skeptical of it, although it looks like it is one of the more "micro" benchmarks in the suite.

All things considered, I believe we should fix this... I personally view memory leaks as second only to crashes and incorrect behavior in terms of severity.
msg378638 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-14 17:30
Thanks a lot, Brandt for doing the benchmarking!

I think that unpack_sequence is **very** affected by code locality. I was debugging it recently with Linux perf (https://perf.wiki.kernel.org/index.php/Main_Page) and it shows that is extremely affected by L1-L2 cache performance. I still need to understand why, but yeah, I am fine ignoring it.

>All things considered, I believe we should fix this... I personally view memory leaks as second only to crashes and incorrect behavior in terms of severity.

Yeah, I very much agree with you. Could you convert your patch into a PR?
msg378639 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-14 17:55
No problem. I just need to rework some hacks in test_finalization where we use empty __slots__ to create non-GC types. I think I can just create a simple utility in _testcapi to untrack instances for this purpose.
msg378640 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-14 17:57
>  I think I can just create a simple utility in _testcapi to untrack instances for this purpose.

I have the feeling that I am misunderstanding what you refer to, but if is **untrack** what you need, there is gc.untrack()
msg378641 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-14 18:03
> there is gc.untrack()

I'm not sure that's true... :)

Although perhaps track()/untrack() functions *could* be useful to add to the gc module... but that's a separate conversation.
msg378642 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-14 18:05
> I'm not sure that's true... :)

Oh boy, I need to....ehem....yeah I will be back in....one second...
msg378643 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-14 18:06
In any case, you can expose:

https://docs.python.org/3/c-api/gcsupport.html#c.PyObject_GC_UnTrack

in the testcapi module
msg378648 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-14 18:48
I'll hold off until https://bugs.python.org/issue42039 is resolved.
msg378653 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-14 23:00
Actually, never mind. Because of the way finalization works, we need to create untracked *types* for these tests, not untracked *instances*.

PR up soon.
msg378655 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-15 01:44
New changeset c13b847a6f913b72eeb71651ff626390b738d973 by Brandt Bucher in branch 'master':
bpo-41984: GC track all user classes (GH-22701)
https://github.com/python/cpython/commit/c13b847a6f913b72eeb71651ff626390b738d973
msg378661 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-15 03:38
New changeset d197b2bb3e401bed53987b65a7ceb6c712c4f5bd by Miss Skeleton (bot) in branch '3.9':
bpo-41984: GC track all user classes (GH-22701/GH-22702)
https://github.com/python/cpython/commit/d197b2bb3e401bed53987b65a7ceb6c712c4f5bd
msg378688 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-10-15 15:51
New changeset aeb66c1abbf4ec214e2e80eb972546996d1a1571 by Miss Skeleton (bot) in branch '3.8':
bpo-41984: GC track all user classes (GH-22701/GH-22707)
https://github.com/python/cpython/commit/aeb66c1abbf4ec214e2e80eb972546996d1a1571
History
Date User Action Args
2020-10-15 15:51:56brandtbuchersetmessages: + msg378688
2020-10-15 03:41:40miss-islingtonsetpull_requests: + pull_request21676
2020-10-15 03:38:32brandtbuchersetmessages: + msg378661
2020-10-15 01:46:42brandtbuchersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-10-15 01:44:44miss-islingtonsetpull_requests: + pull_request21673
2020-10-15 01:44:37miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request21672
2020-10-15 01:44:15brandtbuchersetmessages: + msg378655
2020-10-14 23:36:51brandtbuchersetkeywords: + patch
stage: patch review
pull_requests: + pull_request21671
2020-10-14 23:00:53brandtbuchersetmessages: + msg378653
2020-10-14 18:48:42brandtbuchersetmessages: + msg378648
2020-10-14 18:06:31pablogsalsetmessages: + msg378643
2020-10-14 18:05:32pablogsalsetmessages: + msg378642
2020-10-14 18:03:44brandtbuchersetmessages: + msg378641
2020-10-14 17:57:20pablogsalsetmessages: + msg378640
2020-10-14 17:55:05brandtbuchersetmessages: + msg378639
2020-10-14 17:30:44pablogsalsetmessages: + msg378638
2020-10-14 17:17:36brandtbuchersetmessages: + msg378637
2020-10-10 00:40:21pablogsalsetmessages: + msg378358
2020-10-09 23:19:36brandtbuchersetmessages: + msg378356
2020-10-09 19:46:26pablogsalsetnosy: + tim.peters, nascheme
messages: + msg378339
2020-10-09 19:17:48brandtbuchercreate