Author rhettinger
Recipients amaury.forgeotdarc, loewis, pitrou, rhettinger, serhiy.storchaka
Date 2013-09-21.00:12:25
Message-id <1379722346.42.0.89883008422.issue19048@psf.upfronthosting.co.za>
In-reply-to
Content
> getsizeof() is interesting only if it gives sensible results 
> when used correctly, especially if you want to sum these values
> and get a global memory usage.

If accounting for global memory usage is a goal, it needs a much more comprehensively thought-out, implementation-dependent approach.  There are many issues: memory fragmentation, key-sharing dictionaries, dummy objects, list over-allocation, the minsize dictionary that is part of the dict object in addition to its variable-sized portion, non-Python objects held by Python objects, the extra few bytes per object consumed by the freelisting scheme in Objects/obmalloc.c, etc.
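
Two of the issues listed above (list over-allocation and objects held by other objects) can be seen directly from sys.getsizeof(). The following is a minimal sketch, assuming CPython; the variable names are illustrative only and the exact byte counts vary by version and platform:

```python
import sys

# Same contents, built two ways. On CPython, [x] * n typically allocates
# exactly n slots, while repeated append() over-allocates spare slots.
exact = [None] * 10
grown = []
for _ in range(10):
    grown.append(None)

exact_size = sys.getsizeof(exact)
grown_size = sys.getsizeof(grown)

# getsizeof() is shallow: the list reports only its own header and
# pointer array, not the large string it refers to.
payload = "x" * 10_000
holder = [payload]
holder_size = sys.getsizeof(holder)
payload_size = sys.getsizeof(payload)

print("equal lists:", exact == grown)
print("exact:", exact_size, "grown:", grown_size)
print("holder:", holder_size, "payload:", payload_size)
```

Equal lists can report different sizes, and a container can report a size far smaller than the data reachable from it, so neither number alone says much about real memory consumption.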

> The thing is, "Total size" is generally meaningless. 

I concur.  This is a pipe dream without a serious investment of time and without creating a new and unnecessary maintenance burden.
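
One way a naive "total size" goes wrong, as a sketch assuming CPython: summing getsizeof() over a container's items counts a shared object once per reference. The names here (`shared`, `rows`, `naive_total`, `dedup_total`) are made up for the illustration:

```python
import sys

shared = "x" * 1_000
rows = [shared] * 100          # 100 references to one string object

# Naive total: the container plus every item, counted per reference.
naive_total = sys.getsizeof(rows) + sum(sys.getsizeof(item) for item in rows)

# Counting each distinct object once (by identity) gives a very
# different answer for the same data structure.
seen = set()
dedup_total = sys.getsizeof(rows)
for item in rows:
    if id(item) not in seen:
        seen.add(id(item))
        dedup_total += sys.getsizeof(item)

print("naive:", naive_total, "deduplicated:", dedup_total)
```

Even the deduplicated figure ignores allocator overhead and fragmentation, so it is still only an estimate.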

> (By the way, OrderedDict.__sizeof__ already breaks the
> rule you are trying to impose)

FWIW, the way OrderedDict computes sizeof is probably typical of how anyone is currently using sys.getsizeof().   If you change the premise of how it operates, you're probably going to break the code written by the very few people in the world who care about sys.getsizeof():

    def __sizeof__(self):
        sizeof = _sys.getsizeof
        n = len(self) + 1                       # number of links including root
        size = sizeof(self.__dict__)            # instance dictionary
        size += sizeof(self.__map) * 2          # internal dict and inherited dict
        size += sizeof(self.__hardroot) * n     # link objects
        size += sizeof(self.__root) * n         # proxy objects
        return size

I don't have any specific recommendation for itertools.tee other than that I think it doesn't really need a __sizeof__ method.  The typical uses of tee are transient: they temporarily use some memory and then disappear.  I'm not sure that any mid-stream sizeof checks reveal information of any worth.

Overall, this thread indicates that the entire concept of __sizeof__ has been poorly defined, unevenly implemented, and not really useful when aggregated.

For those who are interested in profiling and optimizing Python's memory usage, I think we would be much better off providing a memory allocator hook that can know about every memory allocation and how those allocations have been arranged (revealing the fragmentation of the unused memory in the spaces between).  Almost anything short of that will provide a grossly misleading picture of memory usage.
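
As a rough illustration of the allocator-hook idea, here is a minimal sketch that assumes a CPython version shipping the tracemalloc module (added in Python 3.4, after this message was written). It traces allocations made through Python's allocators; it does not report fragmentation, so it covers only part of what is described above:

```python
import tracemalloc

tracemalloc.start()                      # begin tracing Python allocations
before = tracemalloc.take_snapshot()

# Allocate roughly 1 MB across 1000 separate bytes objects.
data = [bytes(1000) for _ in range(1000)]

after = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Show the source lines responsible for the largest growth.
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
print("currently traced bytes:", current, "peak:", peak)
```

Unlike summing getsizeof() results, this attributes memory to the code that allocated it rather than to individual objects.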
History
Date                 User        Action  Args
2013-09-21 00:12:26  rhettinger  set     recipients: + rhettinger, loewis, amaury.forgeotdarc, pitrou, serhiy.storchaka
2013-09-21 00:12:26  rhettinger  set     messageid: <1379722346.42.0.89883008422.issue19048@psf.upfronthosting.co.za>
2013-09-21 00:12:26  rhettinger  link    issue19048 messages
2013-09-21 00:12:25  rhettinger  create