Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty __slots__ can create untracked reference cycles #86150

Closed
brandtbucher opened this issue Oct 9, 2020 · 16 comments
Closed

Empty __slots__ can create untracked reference cycles #86150

brandtbucher opened this issue Oct 9, 2020 · 16 comments
Assignees
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@brandtbucher
Copy link
Member

BPO 41984
Nosy @tim-one, @nascheme, @pablogsal, @miss-islington, @brandtbucher
PRs
  • bpo-41984: GC track all user classes #22701
  • [3.9] bpo-41984: GC track all user classes (GH-22701) #22702
  • [3.8] bpo-41984: GC track all user classes (GH-22701) #22703
  • [3.8] bpo-41984: GC track all user classes (GH-22701) #22707
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brandtbucher'
    closed_at = <Date 2020-10-15.01:46:42.602>
    created_at = <Date 2020-10-09.19:17:48.839>
    labels = ['interpreter-core', '3.8', '3.9', '3.10', 'performance']
    title = 'Empty __slots__ can create untracked reference cycles'
    updated_at = <Date 2020-10-15.15:51:56.002>
    user = 'https://github.com/brandtbucher'

    bugs.python.org fields:

    activity = <Date 2020-10-15.15:51:56.002>
    actor = 'brandtbucher'
    assignee = 'brandtbucher'
    closed = True
    closed_date = <Date 2020-10-15.01:46:42.602>
    closer = 'brandtbucher'
    components = ['Interpreter Core']
    creation = <Date 2020-10-09.19:17:48.839>
    creator = 'brandtbucher'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41984
    keywords = ['patch']
    message_count = 16.0
    messages = ['378337', '378339', '378356', '378358', '378637', '378638', '378639', '378640', '378641', '378642', '378643', '378648', '378653', '378655', '378661', '378688']
    nosy_count = 5.0
    nosy_names = ['tim.peters', 'nascheme', 'pablogsal', 'miss-islington', 'brandtbucher']
    pr_nums = ['22701', '22702', '22703', '22707']
    priority = 'low'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue41984'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @brandtbucher
    Copy link
    Member Author

    Currently, we don't track instances of certain heap types based on the assumption that "no members" == "no reference cycles".

    Unfortunately, it's still possible to create untracked reference cycles with one's parents. The following program leaks memory:

    while True:
        class C:
            __slots__ = ()
        C.i = C()
        del C

    The fix is simple: track all instances of user-defined classes, no exceptions. I'm not sure we were actually getting any real wins from the old behavior anyways.

    @brandtbucher brandtbucher added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Oct 9, 2020
    @brandtbucher brandtbucher self-assigned this Oct 9, 2020
    @brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Oct 9, 2020
    @brandtbucher brandtbucher self-assigned this Oct 9, 2020
    @brandtbucher brandtbucher added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Oct 9, 2020
    @pablogsal
    Copy link
    Member

    Thanks for noticing this, Brandt!

    The fix is simple: track all instances of user-defined classes, no exceptions

    I have the fear that this may increase the time of collections substantially because even if many of them won't have references at all, they still net to be visited and they count towards the time complexity of the algorithm that detects isolated sub-cycles. I mention this because this could affect an unbounded number of instances per type. For example, some of the types that are untracked.

    Maybe I am missing something but we could mark them as having GC support unconditionally but still leave them untracking and unconditionally add a tracking call on setattribute.

    @brandtbucher
    Copy link
    Member Author

    Maybe I am missing something but we could mark them as having GC support unconditionally but still leave them untracking and unconditionally add a tracking call on setattribute.

    Hm, I’m not sure that would be enough. Consider the case of a class that registers its instances in a collection of some sort:

    while True:
        class D:
            __slots__ = ()
            registry = []
            def __init__(self):
                self.registry.append(self)
        for i in range(100):
            D() and None  # Suppress REPL output.
        del D

    This is probably more common (and problematic) than my example above.

    At the same time, I agree that it’s not *ideal* to track all of these objects automatically. Anyone setting __slots__ is probably planning on creating lots of “cheap” instances. If they do accidentally create cycles, though, I feel the memory hit then would be worse than any collection overhead.

    I’m just not sure I see a way to fix this without tracking them all.

    @pablogsal
    Copy link
    Member

    I’m just not sure I see a way to fix this without tracking them all.

    IIRC we do skip the GC flags for user-created types only when the subtype is not adding new variables *and* the base class is not a GC class by itself. This includes the case with __slots__ but is not limited to. If I recall correctly, there is a bunch of metaclasses that fall into this category and some other minor things so maybe is not that bad to unconditionally make all user objects tracked.

    In any case I think is prudent to run the performance test suite with PGO/LTO + CPU isolation to get an idea. Unfortinately we already have some unwanted 3.9 performance regressions of unknown origin and I would like to not add to it if we can.

    @brandtbucher
    Copy link
    Member Author

    Using the following patch:

    master...brandtbucher:track-all-heap-types

    I got the following pyperformance results (with PGO/LTO and CPU isolation, insignificant rows omitted):

    2020-10-13_20-04-master-7992579cd27f.json.gz
    ============================================

    Performance version: 1.0.0
    Report on Linux-4.4.0-190-generic-x86_64-with-glibc2.23
    Number of logical CPUs: 8
    Start date: 2020-10-14 07:49:46.738847
    End date: 2020-10-14 08:12:58.050557

    2020-10-13_20-37-track-all-heap-types-63d8d25867b5.json.gz
    ==========================================================

    Performance version: 1.0.0
    Report on Linux-4.4.0-190-generic-x86_64-with-glibc2.23
    Number of logical CPUs: 8
    Start date: 2020-10-14 08:29:52.009796
    End date: 2020-10-14 08:52:49.230214

    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | Benchmark | 2020-10-13_20-04-master-7992579cd27f.json.gz | 2020-10-13_20-37-track-all-heap-types-63d8d25867b5.json.gz | Change | Significance |
    +=========================+==============================================+============================================================+==============+=======================+
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | chameleon | 13.7 ms | 13.1 ms | 1.04x faster | Significant (t=7.27) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | chaos | 163 ms | 159 ms | 1.03x faster | Significant (t=5.39) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | crypto_pyaes | 160 ms | 156 ms | 1.02x faster | Significant (t=7.29) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | deltablue | 10.9 ms | 10.7 ms | 1.03x faster | Significant (t=4.36) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | django_template | 73.1 ms | 71.2 ms | 1.03x faster | Significant (t=4.42) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | go | 373 ms | 381 ms | 1.02x slower | Significant (t=-4.71) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | nbody | 206 ms | 200 ms | 1.03x faster | Significant (t=7.52) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | regex_dna | 295 ms | 288 ms | 1.02x faster | Significant (t=5.26) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | richards | 103 ms | 101 ms | 1.03x faster | Significant (t=5.40) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | scimark_sparse_mat_mult | 6.96 ms | 7.17 ms | 1.03x slower | Significant (t=-4.93) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | spectral_norm | 218 ms | 207 ms | 1.05x faster | Significant (t=10.29) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+
    | unpack_sequence | 101 ns | 84.1 ns | 1.20x faster | Significant (t=36.84) |
    +-------------------------+----------------------------------------------+------------------------------------------------------------+--------------+-----------------------+

    Note that I used pyperformance==1.0.0, since 1.0.1 won't run on 3.10.

    In general, it looks like the patch has no real performance impact. The result for unpack_sequence is *extremely* surprising... I'm quite skeptical of it, although it looks like it is one of the more "micro" benchmarks in the suite.

    All things considered, I believe we should fix this... I personally view memory leaks as second only to crashes and incorrect behavior in terms of severity.

    @pablogsal
    Copy link
    Member

    Thanks a lot, Brandt for doing the benchmarking!

    I think that unpack_sequence is **very** affected by code locality. I was debugging it recently with Linux perf (https://perf.wiki.kernel.org/index.php/Main_Page) and it shows that is extremely affected by L1-L2 cache performance. I still need to understand why, but yeah, I am fine ignoring it.

    All things considered, I believe we should fix this... I personally view memory leaks as second only to crashes and incorrect behavior in terms of severity.

    Yeah, I very much agree with you. Could you convert your patch into a PR?

    @brandtbucher
    Copy link
    Member Author

    No problem. I just need to rework some hacks in test_finalization where we use empty __slots__ to create non-GC types. I think I can just create a simple utility in _testcapi to untrack instances for this purpose.

    @pablogsal
    Copy link
    Member

    I think I can just create a simple utility in _testcapi to untrack instances for this purpose.

    I have the feeling that I am misunderstanding what you refer to, but if is **untrack** what you need, there is gc.untrack()

    @brandtbucher
    Copy link
    Member Author

    there is gc.untrack()

    I'm not sure that's true... :)

    Although perhaps track()/untrack() functions *could* be useful to add to the gc module... but that's a separate conversation.

    @pablogsal
    Copy link
    Member

    I'm not sure that's true... :)

    Oh boy, I need to....ehem....yeah I will be back in....one second...

    @pablogsal
    Copy link
    Member

    In any case, you can expose:

    https://docs.python.org/3/c-api/gcsupport.html#c.PyObject_GC_UnTrack

    in the testcapi module

    @brandtbucher
    Copy link
    Member Author

    I'll hold off until https://bugs.python.org/issue42039 is resolved.

    @brandtbucher
    Copy link
    Member Author

    Actually, never mind. Because of the way finalization works, we need to create untracked *types* for these tests, not untracked *instances*.

    PR up soon.

    @brandtbucher
    Copy link
    Member Author

    New changeset c13b847 by Brandt Bucher in branch 'master':
    bpo-41984: GC track all user classes (GH-22701)
    c13b847

    @brandtbucher
    Copy link
    Member Author

    New changeset d197b2b by Miss Skeleton (bot) in branch '3.9':
    bpo-41984: GC track all user classes (GH-22701/GH-22702)
    d197b2b

    @brandtbucher
    Copy link
    Member Author

    New changeset aeb66c1 by Miss Skeleton (bot) in branch '3.8':
    bpo-41984: GC track all user classes (GH-22701/GH-22707)
    aeb66c1

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants