getsizeof() on code objects is wrong #56623
Comments
sys.getsizeof() on a code object returns only the size of the code struct, not the arrays and tuples which it references.
For composite objects, getsizeof should only return the memory of the object itself, and not that of other objects it refers to. "The object itself" definitely includes the struct of the object, and also definitely includes non-PyObject blocks uniquely referred to by the object. It definitely should not include objects that are reported by gc.get_referents. It probably should include PyObjects not shared with any other object, and not accessible from outside of the object. There are boundary cases, such as memory blocks which are not PyObjects but may be shared across objects, and PyObjects not reported in get_referents. It seems this case is the latter: the PyObjects are not returned from get_referents, but are certainly available to Python, e.g. through co_code and co_consts. I don't think their sizes should be added to the size of the PyObject, since otherwise accounting algorithms may account for them twice. What's your use case for including it in the total size?
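To make the objects under discussion concrete, here is a minimal sketch that reports the size of a code object next to the sizes of a few attributes reachable from Python. The helper name code_size_breakdown and the choice of attributes are assumptions made for illustration; on some CPython versions certain attributes (co_code, for example) may be computed on demand, so the numbers are only indicative.

```python
import sys

# Attributes of a code object that are visible from Python as separate objects.
# This selection is illustrative, not exhaustive.
CODE_ATTRS = ("co_code", "co_consts", "co_names",
              "co_varnames", "co_freevars", "co_cellvars")


def code_size_breakdown(code):
    """Return a dict mapping a label to sys.getsizeof() of that object.

    Hypothetical helper: it does not recurse into the tuples, so the items
    inside co_consts and friends are not counted here.
    """
    sizes = {"<code object>": sys.getsizeof(code)}
    for name in CODE_ATTRS:
        sizes[name] = sys.getsizeof(getattr(code, name))
    return sizes


def sample(a, b):
    return a + b + 1


for label, size in code_size_breakdown(sample.__code__).items():
    print(label, size)
```

The first entry is what sys.getsizeof() reports for the code object alone; the remaining entries are separate Python objects, which is why adding them into the first number would risk counting them twice.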
I concur with Martin. sys.getsizeof() should only count the memory that is not exposed as separate Python objects. In the case of a code object this is the memory of the PyCodeObject structure and the memory of the dynamic array co_cellvars (bpo-15456). Other subobjects are exposed as code object attributes and by gc.get_referents(). To compute the total size you should recursively call sys.getsizeof() on the objects returned by gc.get_referents(). But be aware that some subobjects (for example interned strings) can be shared between different code objects, so the average memory consumption is less than the simple sum.
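The recursive approach described above could be sketched roughly as follows. The function name total_size and its stop criteria (skipping types, modules and functions) are assumptions for illustration, not something prescribed in this issue; visited objects are tracked by id() so that a shared subobject is counted only once.

```python
import gc
import sys
from types import FunctionType, ModuleType


def total_size(root):
    """Sum sys.getsizeof() over root and what gc.get_referents() reaches.

    Hypothetical helper. Types, modules and functions are skipped (an
    assumed stop criterion), otherwise the walk tends to reach most of
    the interpreter. Passing one of those as root therefore returns 0.
    """
    seen = set()      # ids of objects already counted
    stack = [root]
    total = 0
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        if isinstance(obj, (type, ModuleType, FunctionType)):
            continue
        seen.add(id(obj))
        total += sys.getsizeof(obj)
        # Only objects that the type exposes to the GC are returned here.
        stack.extend(gc.get_referents(obj))
    return total


shared = "x" * 100
# The shared string is reachable twice but counted once.
print(total_size([shared, shared]))
```

Whether gc.get_referents() returns anything for a code object depends on whether the type is exposed to the garbage collector in the running build, so for code objects it may be necessary to walk attributes such as co_consts explicitly instead.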
Not including the Python-accessible referred-to objects is consistent with how sys.getsizeof() works elsewhere (i.e. for object instances, the size of __dict__ is not included).
>>> import sys
>>> class A:
...     pass
...
>>> a = A()
>>> sys.getsizeof(a)
56
>>> sys.getsizeof(a.__dict__)
112
The result is easily misleading, but this seems to have been an early design decision about the semantics of __sizeof__.
See gettotalsizeof.py attached to bpo-19048. It contains two functions that calculate the total size of an object together with its subobjects, recursively. The problem is that virtually all objects are linked, so the two functions use slightly different criteria for stopping.
code_sizeof() must be updated to take into account co_extra memory: co_extra.ce_size * sizeof(co_extra.ce_extras[0]) bytes.
I am a newcomer who is interested in contributing to the CPython project. Is this approach right, and can I proceed with this issue?
This is an easy issue. The only tricky part is testing. AFAIK we have no control over co_extra from Python, so we can't create a new test. But the existing test should perhaps be weakened for the case when it is run on an interpreter that sets co_extra.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.