Classification
Title: Better cache instrumentation
Type: performance
Stage:
Components: Library (Lib)
Versions: Python 3.2

Process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To: ncoghlan
Nosy List: ncoghlan, pitrou, rhettinger
Priority: low
Keywords: patch

Created on 2010-12-17 09:48 by rhettinger, last changed 2010-12-18 07:04 by rhettinger. This issue is now closed.

Files
File name        Uploaded                     Description
sized_cache.diff rhettinger, 2010-12-17 09:48 size of cache in kilobytes
Messages (7)
msg124194 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-12-17 09:48
Nick, can you look at this?
msg124202 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-12-17 13:11
The _total_size thing looks like a wildly bad idea to me, since it's so poorly defined (and relying on a couple of special cases).

Also, "currsize" is quite bizarre. Why not simply "size"?
msg124236 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-12-17 17:47
Updated to use ABCs but still relies on user objects implementing __sizeof__.  So it is accurate whenever sys.getsizeof() is accurate.
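The approach described here (dispatch on ABCs, delegate to each object's __sizeof__) might be sketched roughly as follows. This is a hypothetical reconstruction, not the patch itself, and the name total_size is an assumption:

```python
import sys
from collections.abc import Mapping

def total_size(obj):
    """Rough size of obj plus its contents (hypothetical sketch).

    Carries the caveat stated above: the result is only as accurate
    as each object's own __sizeof__ / sys.getsizeof() report.
    """
    size = sys.getsizeof(obj)
    if isinstance(obj, Mapping):
        # Special-case mappings: add the sizes of keys and values.
        size += sum(total_size(k) + total_size(v) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        # Special-case common sequence/set containers: add item sizes.
        size += sum(total_size(item) for item in obj)
    return size
```

Note that strings are deliberately excluded from the container branch, since their own __sizeof__ already covers their character data.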
msg124241 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-12-17 18:02
Another thing to work out: not double-counting duplicate objects:
 
   [1000, 2000, 3000] is bigger than [None, None, None]
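One way the double-counting concern could be handled is by remembering the ids of objects already measured, so a shared object (such as the singleton None above) contributes its size only once. A hypothetical sketch, not from the patch:

```python
import sys

def total_size(obj, _seen=None):
    """Size estimate that counts each distinct object only once.

    Hypothetical sketch: shared objects (and reference cycles) are
    skipped by tracking the ids of objects already visited.
    """
    if _seen is None:
        _seen = set()
    if id(obj) in _seen:
        return 0  # already counted: shared or cyclic reference
    _seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, _seen) + total_size(v, _seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, _seen) for item in obj)
    return size
```

With this scheme, [None, None, None] counts the None singleton once, while [1000, 2000, 3000] counts three distinct ints, matching the intuition above.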
msg124242 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2010-12-17 18:06
> Updated to use ABCs but still relies on user objects implementing
> __sizeof__.  So it is accurate whenever sys.getsizeof() is accurate.

I'm really -1 on this. It's better to give no measurement than to give a
totally wrong indication. The fact that you had to special-case
containers shows that getsizeof() is *not* the right solution for this.

(because if it was, you could rely on the containers' __sizeof__ instead
of overriding it)

The "size of an object and its contents" generally doesn't make any
sense, because the object graph is massively connected. getsizeof()
gives you the basic internal footprint of the object, but that's all.
It's really not a high-level tool for the everyday programmer, and
relying on it to say how much memory would be saved by getting rid of
certain objects is bogus.
msg124274 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2010-12-18 06:11
Indeed, getsizeof() on containers only gives the amount of memory consumed by the container itself (this can be difficult to figure out externally for potentially sparse containers like dicts), but leaves the question of the size of the referenced objects open.

Exposing the result of sys.getsizeof() on the cache may not be a bad idea, but the question would be which use cases are handled by that additional information that aren't adequately covered by the simple count of the cache entries.
msg124275 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2010-12-18 07:04
Closed due to lack of interest.
History
Date                 User        Action  Args
2010-12-18 07:04:52  rhettinger  set     status: open -> closed; messages: + msg124275; resolution: rejected
2010-12-18 06:11:37  ncoghlan    set     messages: + msg124274
2010-12-17 19:39:28  rhettinger  set     files: - sized_cache2.diff
2010-12-17 18:06:54  pitrou      set     messages: + msg124242
2010-12-17 18:02:13  rhettinger  set     priority: normal -> low; messages: + msg124241
2010-12-17 17:47:21  rhettinger  set     files: + sized_cache2.diff; messages: + msg124236
2010-12-17 13:11:54  pitrou      set     nosy: + pitrou; messages: + msg124202
2010-12-17 09:48:31  rhettinger  create