
classification
Title: getsizeof() on code objects is wrong
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: benjamin.peterson, corona10, loewis, pitrou, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: easy (C)

Created on 2011-06-26 03:53 by benjamin.peterson, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1168 merged corona10, 2017-04-18 08:47
PR 1198 merged corona10, 2017-04-20 02:32
Messages (11)
msg139142 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-06-26 03:53
sys.getsizeof() on a code object returns only the size of the code struct, not the arrays and tuples which it references.
msg139207 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-06-26 20:04
For composite objects, getsizeof should only return the memory of the object itself, and not that of other objects it refers to. "The object itself" definitely includes the struct of the object, and also definitely includes non-PyObject blocks uniquely referred to by the object. It definitely should not include objects that it reports in gc.get_referents. It probably should include PyObjects not shared with any other object, and not accessible from outside of the object.

There are boundary cases, such as memory blocks which are not PyObject, but may be shared across objects, and PyObjects not reported in get_referents.

It seems this case is the latter: the PyObjects are not returned from get_referents, but are certainly available to Python, e.g. through co_code and co_consts.

I don't think their sizes should be added to the size of the PyObject, since otherwise accounting algorithms may account for them twice.

What's your use case for including it in the total size?
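
The distinction discussed above can be observed from Python. The following is a minimal sketch, not part of the original report: the helper name own_and_referenced_sizes and the choice of attributes are illustrative assumptions. It reports the size sys.getsizeof() gives for a code object next to the sizes of a few objects reachable through its attributes. Exact numbers vary between CPython versions and builds, so none are shown.

```python
import sys


def own_and_referenced_sizes(code):
    """Return (own_size, {attribute_name: size}) for a code object.

    Hypothetical helper for illustration only. own_size is what
    sys.getsizeof() reports for the code object; the dict holds the
    sizes of separately exposed objects, which are not included in it.
    """
    own_size = sys.getsizeof(code)
    # Attribute selection is an assumption made for this sketch.
    names = ("co_code", "co_consts", "co_names", "co_varnames")
    referenced = {name: sys.getsizeof(getattr(code, name)) for name in names}
    return own_size, referenced


def sample(x):
    return x + len("some constant")


own, parts = own_and_referenced_sizes(sample.__code__)
print("code object itself:", own)
for name, size in parts.items():
    print(f"  {name}: {size}")
```

Adding the referenced sizes to the first number would count shared objects more than once, which is the double-accounting concern raised above.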
msg290529 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-26 13:49
I concur with Martin. sys.getsizeof() should only count the memory that is not exposed as separate Python objects. In the case of a code object this is the memory of the PyCodeObject structure and the memory of the dynamic array co_cellvars (issue15456). Other subobjects are exposed as code object attributes and by gc.get_referents(). To compute the total size, you should recursively call sys.getsizeof() on the objects returned by gc.get_referents(). But be aware that some subobjects (for example interned strings) can be shared between different code objects, so the average memory consumption is less than the simple sum.
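
A recursive walk of this kind might look like the sketch below. It is an illustration under stated assumptions, not the gettotalsizeof.py script from issue19048: the name total_size is hypothetical, and the stopping rule (do not descend into classes or modules) is an arbitrary choice. Objects are counted once by identity, so shared subobjects are not double-counted. Objects that are not tracked by the garbage collector report no referents, so the walk does not look inside them.

```python
import gc
import sys
from types import ModuleType


def total_size(root):
    """Sum sys.getsizeof() over objects reachable via gc.get_referents().

    Hypothetical helper. Each object is counted once (by id), and
    classes and modules are skipped so the walk does not spread to
    most of the interpreter state. That stopping rule is an assumption.
    """
    seen = set()
    stack = [root]
    total = 0
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if isinstance(obj, (type, ModuleType)):
            continue  # stopping rule: neither counted nor followed
        total += sys.getsizeof(obj)
        stack.extend(gc.get_referents(obj))
    return total


payload = bytes(10_000)
container = [payload, payload]  # the same object referenced twice
print(sys.getsizeof(container), total_size(container))
```

Because the payload is shared, it contributes to the total only once even though the list refers to it twice.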
msg290536 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-03-26 16:02
Not including the Python-accessible referred-to objects is consistent with how sys.getsizeof() works elsewhere (e.g. for object instances, the size of __dict__ is not included).

    >>> import sys
    >>> class A:
    ...     pass
    ...
    >>> a = A()
    >>> sys.getsizeof(a)
    56
    >>> sys.getsizeof(a.__dict__)
    112

The result is easily misleading, but this seems to have been an early design decision about the semantics of __sizeof__.
msg290546 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-26 18:32
See gettotalsizeof.py attached to issue19048. It contains two functions that calculate the total size of an object together with its subobjects, recursively. The problem is that virtually all objects are linked, so the two functions use slightly different criteria for stopping.
msg290578 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2017-03-27 10:48
code_sizeof() must be updated to take co_extra memory into account: co_extra.ce_size * sizeof(co_extra.ce_extras[0]) bytes.
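
As a rough worked example of that formula, written in Python rather than C: the sketch below assumes that each ce_extras slot holds a single pointer, so sizeof(co_extra.ce_extras[0]) equals the platform pointer size. The function name co_extra_bytes is hypothetical and only mirrors the arithmetic; it does not read any real code object.

```python
import ctypes


def co_extra_bytes(ce_size):
    """Bytes used by ce_size extra slots, assuming one pointer per slot.

    Hypothetical helper that mirrors
    co_extra.ce_size * sizeof(co_extra.ce_extras[0]).
    """
    slot_size = ctypes.sizeof(ctypes.c_void_p)  # pointer size on this platform
    return ce_size * slot_size


for slots in (0, 1, 4):
    print(slots, "slot(s):", co_extra_bytes(slots), "bytes")
```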
msg291821 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2017-04-18 05:10
I am a newcomer who is interested in contributing to the CPython project.
This issue could be a good first challenge for me.
It looks like we need to multiply the number of co_extra entries by the size of one entry and add the result to the returned size.

Is this approach right, and can I proceed with this issue?
msg291829 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-18 07:38
This is an easy issue. The only tricky part is testing. AFAIK we have no control over co_extra from Python, so we can't create a new test. But the existing test should perhaps be weakened for the case when it is run on an interpreter that sets co_extra.
msg291934 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2017-04-20 02:41
I created PR 1168 for the master branch and PR 1198 for the 3.6 branch.
msg291952 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-20 07:31
New changeset b4dc6af7a7862a8996cffed30d39d6add5ee58a3 by Serhiy Storchaka (Dong-hee Na) in branch 'master':
bpo-12414: Update code_sizeof() to take in account co_extra memory. (#1168)
https://github.com/python/cpython/commit/b4dc6af7a7862a8996cffed30d39d6add5ee58a3
msg291954 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-04-20 08:26
New changeset df5df13fdc3a71bcf2295acc2cba7f22cfe2d669 by Serhiy Storchaka (Dong-hee Na) in branch '3.6':
[3.6] bpo-12414: Update code_sizeof() to take in account co_extra memory. (#1168) (#1198)
https://github.com/python/cpython/commit/df5df13fdc3a71bcf2295acc2cba7f22cfe2d669
History
Date User Action Args
2022-04-11 14:57:19 admin set github: 56623
2017-04-20 08:27:07 serhiy.storchaka set status: open -> closed
resolution: fixed
stage: needs patch -> resolved
2017-04-20 08:26:28 serhiy.storchaka set messages: + msg291954
2017-04-20 07:31:19 serhiy.storchaka set messages: + msg291952
2017-04-20 02:41:30 corona10 set messages: + msg291934
2017-04-20 02:32:19 corona10 set pull_requests: + pull_request1323
2017-04-18 08:47:26 corona10 set pull_requests: + pull_request1298
2017-04-18 07:38:59 serhiy.storchaka set keywords: + easy (C)
stage: needs patch
messages: + msg291829
versions: + Python 3.6, Python 3.7, - Python 3.2, Python 3.3
2017-04-18 05:10:26 corona10 set nosy: + corona10
messages: + msg291821
2017-03-27 10:48:10 vstinner set nosy: + vstinner
messages: + msg290578
2017-03-26 18:32:36 serhiy.storchaka set messages: + msg290546
2017-03-26 16:02:43 rhettinger set nosy: + pitrou
messages: + msg290536
2017-03-26 13:49:22 serhiy.storchaka set nosy: + serhiy.storchaka
messages: + msg290529
2011-07-02 13:57:42 eric.araujo set nosy: + rhettinger
2011-06-26 20:04:02 loewis set nosy: + loewis
messages: + msg139207
2011-06-26 03:53:21 benjamin.peterson create