getsizeof() on code objects is wrong #56623

Closed
benjaminp opened this issue Jun 26, 2011 · 11 comments
Labels
3.7 (EOL) end of life easy interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@benjaminp
Contributor

BPO 12414
Nosy @loewis, @rhettinger, @pitrou, @vstinner, @benjaminp, @serhiy-storchaka, @corona10
PRs
  • bpo-12414: Update code_sizeof() to take in account co_extras #1168
  • [3.6] bpo-12414: code_sizeof() update. (GH-1168) #1198
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-04-20.08:27:07.286>
    created_at = <Date 2011-06-26.03:53:21.319>
    labels = ['interpreter-core', 'easy', 'type-bug', '3.7']
    title = 'getsizeof() on code objects is wrong'
    updated_at = <Date 2017-04-20.08:27:07.286>
    user = 'https://github.com/benjaminp'

    bugs.python.org fields:

    activity = <Date 2017-04-20.08:27:07.286>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-04-20.08:27:07.286>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2011-06-26.03:53:21.319>
    creator = 'benjamin.peterson'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 12414
    keywords = ['easy (C)']
    message_count = 11.0
    messages = ['139142', '139207', '290529', '290536', '290546', '290578', '291821', '291829', '291934', '291952', '291954']
    nosy_count = 7.0
    nosy_names = ['loewis', 'rhettinger', 'pitrou', 'vstinner', 'benjamin.peterson', 'serhiy.storchaka', 'corona10']
    pr_nums = ['1168', '1198']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue12414'
    versions = ['Python 3.6', 'Python 3.7']

    @benjaminp
    Contributor Author

    sys.getsizeof() on a code object returns only the size of the code struct, not the arrays and tuples which it references.

    @benjaminp benjaminp added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Jun 26, 2011
    @loewis
    Mannequin

    loewis mannequin commented Jun 26, 2011

    For composite objects, getsizeof should only return the memory of the object itself, and not that of other objects it refers to. "The object itself" definitely includes the struct of the object, and also definitely includes non-PyObject blocks uniquely referred to by the object. It definitely should not include objects that are reported by gc.get_referents. It probably should include PyObjects not shared with any other object, and not accessible from outside of the object.

    There are boundary cases, such as memory blocks which are not PyObject, but may be shared across objects, and PyObjects not reported in get_referents.

    It seems this case is the latter: the PyObjects are not returned from get_referents, but are certainly available to Python, e.g. through co_code and co_consts.

    I don't think their sizes should be added to the size of the PyObject, since otherwise accounting algorithms may account for them twice.

    What's your use case for including it in the total size?
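
To make the distinction above concrete, here is a minimal sketch that prints the shallow size of a code object next to the sizes of a few objects reachable through its attributes. The function `sample` and the choice of attributes are illustrative assumptions; the exact numbers vary by Python version and platform, and on some versions `co_code` may be built on access rather than stored as a separate object.

```python
import sys


def sample(a, b):
    # A small function whose code object references several sub-objects.
    label = "sum"
    return label, a + b


code = sample.__code__

# Shallow size: what sys.getsizeof() reports for the code object itself.
shallow = sys.getsizeof(code)

# Sizes of a few sub-objects exposed as attributes of the code object.
# These are separate Python objects and are not part of the shallow size.
parts = {
    name: sys.getsizeof(getattr(code, name))
    for name in ("co_code", "co_consts", "co_names", "co_varnames")
}

print("shallow:", shallow)
for name, size in parts.items():
    print(f"{name}: {size}")
```

Adding these numbers together blindly is exactly the double-accounting risk described above, since tuples and strings may be shared with other objects.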

    @serhiy-storchaka
    Member

    I concur with Martin. sys.getsizeof() should only count the memory that is not exposed as separate Python objects. In the case of a code object, this is the memory of the PyCodeObject structure and the memory of the dynamic array co_cellvars (bpo-15456). Other subobjects are exposed as code object attributes and by gc.get_referents(). To compute the total size, you should recursively call sys.getsizeof() on the objects returned by gc.get_referents(). But be aware that some subobjects (for example, interned strings) can be shared between different code objects, so the average memory consumption is less than the simple sum.
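
The recursive approach described above can be sketched as follows. This is not the script attached to bpo-19048; `total_size` is a hypothetical helper, and the rule of not descending into types and modules is an assumed stopping criterion chosen for illustration. Whether gc.get_referents() reports the sub-objects of a code object may depend on the interpreter build, so the sketch is exercised on plain containers only.

```python
import gc
import sys
from types import ModuleType


def total_size(obj):
    """Sum sys.getsizeof() over obj and everything reachable from it
    through gc.get_referents(), counting each object once."""
    seen = set()      # ids of objects already counted
    stack = [obj]
    total = 0
    while stack:
        current = stack.pop()
        # Assumed stopping rule: skip types and modules, otherwise the
        # walk tends to reach a large part of the interpreter state.
        if isinstance(current, (type, ModuleType)):
            continue
        if id(current) in seen:
            continue
        seen.add(id(current))
        total += sys.getsizeof(current)
        stack.extend(gc.get_referents(current))
    return total


shared = "x" * 1000
pair = [shared, shared]

# The shared string is counted once, not twice.
print(total_size(pair), sys.getsizeof(pair) + sys.getsizeof(shared))
```

Tracking object ids is what keeps a shared subobject from being counted twice within one walk; it does not help when two separate walks (for example, over two code objects) both reach the same interned string.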

    @rhettinger
    Contributor

    Not including the Python accessible referred-to objects is consistent with how sys.getsizeof() works elsewhere (i.e. for object instances, the size of __dict__ is not included).

        >>> import sys
        >>> class A:
        ...     pass
        ...
        >>> a = A()
        >>> sys.getsizeof(a)
        56
        >>> sys.getsizeof(a.__dict__)
        112

    The result is easily misleading, but this seems to have been an early design decision about the semantics of __sizeof__.

    @serhiy-storchaka
    Member

    See gettotalsizeof.py attached to bpo-19048. It contains two functions that calculate the total size of an object together with its subobjects, recursively. The problem is that virtually all objects are linked; the two functions use slightly different criteria for stopping.

    @vstinner
    Member

    code_sizeof() must be updated to take into account co_extra memory: co_extra.ce_size * sizeof(co_extra.ce_extras[0]) bytes.
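
The arithmetic in the comment above can be modeled in Python as a rough sketch. It only mirrors the formula as stated, not the actual C change in code_sizeof(). The helper names are hypothetical, and the slot size assumes each ce_extras entry is a plain pointer.

```python
import ctypes

# Assumption: each ce_extras slot holds one pointer.
SLOT_SIZE = ctypes.sizeof(ctypes.c_void_p)


def co_extra_bytes(ce_size):
    """Bytes used by the ce_extras array: ce_size * sizeof(ce_extras[0])."""
    return ce_size * SLOT_SIZE


def code_sizeof_sketch(base_size, ce_size=0):
    """Base struct size plus the co_extra array, per the formula above."""
    return base_size + co_extra_bytes(ce_size)


# With no extras registered, the reported size is unchanged.
print(code_sizeof_sketch(100), code_sizeof_sketch(100, ce_size=2))
```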

    @corona10
    Member

    I am a newcomer who is interested in contributing to the CPython project.
    This issue could be a good first challenge for me.
    It looks like we need to multiply the number of co_extra entries by the size of one entry and add that to the returned size.

    Is this approach right, and can I proceed with this issue?

    @serhiy-storchaka
    Member

    This is an easy issue. The only tricky part is testing. AFAIK we have no control over co_extra from Python, so we can't create a new test. But the existing test should perhaps be weakened for the case when it is run on an interpreter that sets co_extra.

    @corona10
    Member

    I created PR 1168 and PR 1198 for the master branch and the 3.6 branch, respectively.

    @serhiy-storchaka
    Member

    New changeset b4dc6af by Serhiy Storchaka (Dong-hee Na) in branch 'master':
    bpo-12414: Update code_sizeof() to take in account co_extra memory. (GH-1168)
    b4dc6af

    @serhiy-storchaka
    Member

    New changeset df5df13 by Serhiy Storchaka (Dong-hee Na) in branch '3.6':
    [3.6] bpo-12414: Update code_sizeof() to take in account co_extra memory. (GH-1168) (GH-1198)
    df5df13

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022