defining persistent_id in _pickle.Pickler subclass causes reference cycle #72602

Closed
cwitty mannequin opened this issue Oct 11, 2016 · 4 comments
Labels: 3.7 (EOL), extension-modules, performance

Comments

@cwitty
Mannequin

cwitty mannequin commented Oct 11, 2016

BPO 28416
Nosy @avassalotti, @serhiy-storchaka
PRs
  • bpo-28416: Break reference cycles in Pickler and Unpickler subclasses #4080
  • [3.6] bpo-28416: Break reference cycles in Pickler and Unpickler subclasses (GH-4080) #4653
Files
  • pickle_reference_cycle.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2018-01-05.17:34:05.340>
    created_at = <Date 2016-10-11.16:14:11.736>
    labels = ['extension-modules', '3.7', 'performance']
    title = 'defining persistent_id in _pickle.Pickler subclass causes reference cycle'
    updated_at = <Date 2018-01-05.17:34:05.339>
    user = 'https://bugs.python.org/cwitty'

    bugs.python.org fields:

    activity = <Date 2018-01-05.17:34:05.339>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2018-01-05.17:34:05.340>
    closer = 'serhiy.storchaka'
    components = ['Extension Modules']
    creation = <Date 2016-10-11.16:14:11.736>
    creator = 'cwitty'
    dependencies = []
    files = ['45058']
    hgrepos = []
    issue_num = 28416
    keywords = ['patch']
    message_count = 4.0
    messages = ['278495', '304786', '307341', '307345']
    nosy_count = 3.0
    nosy_names = ['alexandre.vassalotti', 'cwitty', 'serhiy.storchaka']
    pr_nums = ['4080', '4653']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue28416'
    versions = ['Python 3.6', 'Python 3.7']

    @cwitty
    Mannequin Author

    cwitty mannequin commented Oct 11, 2016

    On creation, _pickle.Pickler caches any .persistent_id() method defined by a subclass (in the pers_func field of PicklerObject). This causes a reference cycle (pickler -> bound method of pickler -> pickler), so the pickler is held in memory until the next cycle collection. (Then, because of the pickler's memo table, any objects that this pickler has pickled are also held until the next cycle collection.)
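The cycle described above can be illustrated without the C module. The following is a minimal sketch, assuming a hypothetical pure-Python stand-in (`CachingPickler` is not the real `_pickle.Pickler`) that caches its own bound method at construction time, as the report says the C code does with `pers_func`:

```python
import gc
import weakref


class CachingPickler:
    """Hypothetical pure-Python stand-in for the C pickler (not the real class)."""

    def __init__(self):
        # Caching the bound method stores a reference back to self:
        # self -> self.pers_func (bound method) -> self
        self.pers_func = self.persistent_id
        self.memo = {}

    def persistent_id(self, obj):
        return None


def lifetime_after_del():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()  # rule out an automatic collection during the check
    try:
        pickler = CachingPickler()
        ref = weakref.ref(pickler)
        del pickler  # reference counting alone cannot free the cycle
        alive_before = ref() is not None
        gc.collect()  # the cycle collector reclaims the object
        alive_after = ref() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()
```

On CPython, `lifetime_after_del()` returns `(True, False)`: the object survives the loss of its last external reference and is reclaimed only by the cycle collector. Anything stored in `memo` would stay alive for the same period.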

    Looking at the source code, it looks like the same thing would happen with _pickle.Unpickler and .persistent_load(), but I haven't tested it. Any fix should be applied to both classes.

    I've attached a test file; when I run it with "python3 pickle_reference_cycle.py", all 3 print statements are executed. I would prefer it if "Oops, still here" was not printed. (I'm using Debian's python3.5 package, version 3.5.2-4 for amd64, but I believe the problem occurs across many versions of python3, looking at the history of _pickle.c.)
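The attached file is not reproduced on this page. A reproduction along the same lines might look like the sketch below; it is an assumption about what such a check could contain, not the contents of pickle_reference_cycle.py. The names `PersistentIdPickler` and `pickler_outlives_last_reference` are hypothetical.

```python
import gc
import io
import pickle
import weakref


class PersistentIdPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning None tells pickle to serialize the object normally.
        return None


def pickler_outlives_last_reference():
    """Report whether the pickler is still alive after its last reference is dropped."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running during the check
    try:
        pickler = PersistentIdPickler(io.BytesIO())
        pickler.dump([1, 2, 3])
        ref = weakref.ref(pickler)
        del pickler
        return ref() is not None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up anything left behind by a cycle


if __name__ == "__main__":
    if pickler_outlives_last_reference():
        print("pickler is still alive after del")
    else:
        print("pickler was freed immediately")
```

On an affected interpreter the function would be expected to return `True`; on versions that include the fix referenced later in this issue, `False`.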

    I don't see how to fix the problem with no performance impact. (Setting pers_func at the beginning of dump() and clearing it at the end would have approximately the same performance in the common case that only one object was dumped per pickler, but would be slower when dumping multiple objects.) If you decide not to fix the problem, could you at least describe the problem and a workaround in the documentation?
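The per-`dump()` approach mentioned in the parenthetical can be sketched in pure Python. This is an illustration under stated assumptions: `PerDumpPickler` is a hypothetical stand-in that models only when the bound method is cached, and the real pickling work is reduced to a single call.

```python
import gc
import weakref


class PerDumpPickler:
    """Hypothetical stand-in that models only the caching strategy."""

    def __init__(self):
        self.pers_func = None  # nothing is cached at construction time

    def persistent_id(self, obj):
        return None

    def dump(self, obj):
        # The bound method (and therefore the cycle) exists only while dump() runs.
        self.pers_func = self.persistent_id
        try:
            return self.pers_func(obj)  # placeholder for the real pickling work
        finally:
            self.pers_func = None  # break the cycle before returning


def freed_by_refcount_alone():
    """Check that the object is freed without help from the cycle collector."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        pickler = PerDumpPickler()
        pickler.dump(object())
        ref = weakref.ref(pickler)
        del pickler
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
```

The trade-off is the one the report names: the attribute lookup is repeated on every `dump()` call, so a pickler that dumps many objects pays the lookup cost each time.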

    cwitty mannequin added the type-bug and extension-modules labels Oct 11, 2016
    serhiy-storchaka added the 3.7 (EOL) and performance labels and removed the type-bug label Oct 11, 2016
    serhiy-storchaka self-assigned this Oct 22, 2017
    @serhiy-storchaka
    Member

    PR 4080 converts bound methods into unbound methods if possible.
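The idea of converting a bound method into an unbound one can be sketched in pure Python. The actual change is in C and may differ in detail; `SketchPickler`, `TaggingPickler`, and the attribute names below are hypothetical and only illustrate the technique.

```python
import gc
import weakref


class SketchPickler:
    """Hypothetical stand-in; the real change is implemented in C."""

    def __init__(self):
        method = getattr(self, "persistent_id", None)
        if getattr(method, "__self__", None) is self:
            # Bound to this very instance: keep only the plain function,
            # which holds no reference back to self.
            self._pers_func = method.__func__
            self._pass_self = True
        else:
            # Absent, or not bound to self: keep it unchanged.
            self._pers_func = method
            self._pass_self = False

    def call_persistent_id(self, obj):
        if self._pers_func is None:
            return None
        if self._pass_self:
            return self._pers_func(self, obj)  # supply self explicitly
        return self._pers_func(obj)


class TaggingPickler(SketchPickler):
    def persistent_id(self, obj):
        return ("tag", obj) if isinstance(obj, str) else None


def freed_without_collector():
    """Check that no cycle keeps the pickler alive after its last reference."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        pickler = TaggingPickler()
        ref = weakref.ref(pickler)
        del pickler
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
```

Because the lookup still happens once at construction, this keeps the caching benefit while removing the reference from the cached callable back to the instance.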

    @serhiy-storchaka
    Member

    New changeset 986375e by Serhiy Storchaka in branch 'master':
    bpo-28416: Break reference cycles in Pickler and Unpickler subclasses (#4080)
    986375e

    @serhiy-storchaka
    Member

    New changeset c91bf74 by Serhiy Storchaka (Miss Islington (bot)) in branch '3.6':
    bpo-28416: Break reference cycles in Pickler and Unpickler subclasses (GH-4080) (#4653)
    c91bf74

    ezio-melotti transferred this issue from another repository Apr 10, 2022