Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

finalizer resurrection in gc #82560

Closed
tim-one opened this issue Oct 5, 2019 · 13 comments
Closed

finalizer resurrection in gc #82560

tim-one opened this issue Oct 5, 2019 · 13 comments
Assignees
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@tim-one
Copy link
Member

tim-one commented Oct 5, 2019

BPO 38379
PRs
  • bpo-38379: don't claim objects are collected when they aren't #16658
  • [3.8] bpo-38379: don't claim objects are collected when they aren't (GH-16658) #16683
  • [3.7] bpo-38379: don't claim objects are collected when they aren't (GH-16658) #16685
  • bpo-38379: Don't block collection of unreachable objects when some objects resurrect #16687
  • Files
  • zleak.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/terryjreedy'
    closed_at = <Date 2019-10-13.22:17:33.437>
    created_at = <Date 2019-10-05.17:31:26.101>
    labels = ['interpreter-core', '3.8', 'type-bug', '3.7', '3.9']
    title = 'finalizer resurrection in gc'
    updated_at = <Date 2021-11-04.14:28:28.082>
    user = 'https://github.com/tim-one'

    bugs.python.org fields:

    activity = <Date 2021-11-04.14:28:28.082>
    actor = 'eryksun'
    assignee = 'terry.reedy'
    closed = True
    closed_date = <Date 2019-10-13.22:17:33.437>
    closer = 'tim.peters'
    components = ['Interpreter Core']
    creation = <Date 2019-10-05.17:31:26.101>
    creator = 'tim.peters'
    dependencies = []
    files = ['48644']
    hgrepos = []
    issue_num = 38379
    keywords = ['patch']
    message_count = 13.0
    messages = ['354020', '354041', '354213', '354290', '354291', '354296', '354297', '354299', '354589', '354605', '375518', '375520', '375521']
    nosy_count = 0.0
    nosy_names = []
    pr_nums = ['16658', '16683', '16685', '16687']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue38379'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    @tim-one
    Copy link
    Member Author

    tim-one commented Oct 5, 2019

    While people are thinking about gc, zleak.py shows a small bug, and a possible opportunity for improvement, in the way gc treats finalizers that resurrect objects.

    The bug: the stats keep claiming gc is collecting an enormous number of objects, but in fact it's not collecting any. Objects in the unreachable set shouldn't add to the "collected" count unless they _are_ collected.

    Output:

    resurrecting
    collect 2000002
    gen 2 stats {'collections': 2, 'collected': 2000002, 'uncollectable': 0}
    resurrecting
    collect 4000004
    gen 2 stats {'collections': 3, 'collected': 6000006, 'uncollectable': 0}
    resurrecting
    collect 6000006
    gen 2 stats {'collections': 4, 'collected': 12000012, 'uncollectable': 0}
    resurrecting
    collect 8000008
    gen 2 stats {'collections': 5, 'collected': 20000020, 'uncollectable': 0}
    resurrecting
    collect 10000010
    gen 2 stats {'collections': 6, 'collected': 30000030, 'uncollectable': 0}
    ...

    Memory use grows without bound, and collections take ever longer.

    The opportunity: if any finalizer resurrects anything, gc gives up. But the process of computing whether anything was resurrected also determines which initially-trash objects are reachable from the risen dead. Offhand I don't see why we couldn't proceed collecting what remains trash. Then zleak.py would reclaim everything instead of nothing.

    Sketch: rather than just set a flag, check_garbage() could move now-reachable objects to the old generation (and, for each one moved, decrement the count of collected objects). Then delete_garbage() could proceed on what remains in the unreachable list.

    @tim-one tim-one added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.9 only security fixes type-bug An unexpected behavior, bug, or error labels Oct 5, 2019
    @tim-one
    Copy link
    Member Author

    tim-one commented Oct 6, 2019

    Just noting that check_garbage() currently only determines which trash objects are now directly reachable from outside. To be usable for the intended purpose, it would need to go on to compute which trash objects are reachable from those too.

    Maybe a new function

    static void
    deduce_unreachable(PyGC_Head *base, PyGC_Head *unreachable)

    that packaged the steps through move_unreachable(). The main function would use that on young, and check_garbage() on unreachable to find the still unreachable objects.

    I don't care about the expense. Outside of zleak.py, the number of unreachable objects is usually small, so it's usually cheap to make another pass over just them.

    @tim-one
    Copy link
    Member Author

    tim-one commented Oct 8, 2019

    PR 16658 aims to repair the stats reported.

    @tim-one
    Copy link
    Member Author

    tim-one commented Oct 9, 2019

    New changeset ecbf35f by Tim Peters in branch 'master':
    bpo-38379: don't claim objects are collected when they aren't (bpo-16658)
    ecbf35f

    @tim-one
    Copy link
    Member Author

    tim-one commented Oct 9, 2019

    I checked the stat fix into master, but GH failed to backport to 3.7 or 3.8 and I'm clueless. More info in the PR. Does someone else here know how to get a backport done?

    @tim-one tim-one added 3.7 (EOL) end of life 3.8 only security fixes labels Oct 9, 2019
    @pablogsal
    Copy link
    Member

    Tim, I have created backports for 3.8 and 3.7 (PR 16683, PR 16685). In my case cherry_picker ecbf35f 3.7 and cherry_picker ecbf35f 3.8 works after fixing merge conflicts. Maybe there is something going on in your git repo (maybe you didn't fetch from master first and that is why git log fails?).

    @miss-islington
    Copy link
    Contributor

    New changeset 0bd9fac by Miss Islington (bot) (Pablo Galindo) in branch '3.8':
    [3.8] bpo-38379: don't claim objects are collected when they aren't (GH-16658) (GH-16683)
    0bd9fac

    @miss-islington
    Copy link
    Contributor

    New changeset a081e93 by Miss Islington (bot) (Pablo Galindo) in branch '3.7':
    [3.7] bpo-38379: don't claim objects are collected when they aren't (GH-16658) (GH-16685)
    a081e93

    @pablogsal
    Copy link
    Member

    New changeset 466326d by Pablo Galindo in branch 'master':
    bpo-38379: Don't block collection of unreachable objects when some objects resurrect (GH-16687)
    466326d

    @tim-one
    Copy link
    Member Author

    tim-one commented Oct 13, 2019

    Everything here has been addressed, so closing this. zleak.py can apparently run forever now without leaking a byte :-)

    @tim-one tim-one closed this as completed Oct 13, 2019
    @LewisGaul
    Copy link
    Mannequin

    LewisGaul mannequin commented Aug 16, 2020

    I noticed this bug is mentioned in the 3.9 release notes with a note similar to the title of the 4th PR: "garbage collection does not block on resurrected objects".

    I can't see any mention of a blocking issue here on the issue:

    The bug: the stats keep claiming gc is collecting an enormous number of objects, but in fact it's not collecting any. Objects in the unreachable set shouldn't add to the "collected" count unless they _are_ collected.

    Would someone be able to elaborate on the blocking issue that was fixed as part of this BPO?

    @tim-one
    Copy link
    Member Author

    tim-one commented Aug 16, 2020

    I suspect you're reading some specific technical meaning into the word "block" that the PR and release note didn't intend by their informal use of the word. But I'm unclear on what technical meaning you have in mind.

    Before the change, gc "just gave up" after seeing a resurrection, ending the then-current cyclic gc run. It that sense, yes, resurrection "blocked" gc from making progress. It did not, e.g., "block" the interpreter in the sense of deadlock, or of waiting for some lock to be released, or of waiting for a network request to respond, or ...

    @LewisGaul
    Copy link
    Mannequin

    LewisGaul mannequin commented Aug 16, 2020

    You're right that's how I had interpreted it, thanks for clarifying.

    I was wondering if this could be related to an issue I've hit with gc.collect() getting slower and slower in a test suite, but that now seems unlikely, so I won't go into that here.

    @ahmedsayeed1982 ahmedsayeed1982 mannequin added topic-IDLE and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes labels Nov 4, 2021
    @eryksun eryksun added interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes and removed topic-IDLE labels Nov 4, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants