Author serhiy.storchaka
Recipients christian.heimes, gregory.p.smith, kristjan.jonsson, loewis, mark.dickinson, pitrou, serhiy.storchaka
Date 2012-11-20.18:41:33
Content
Here are the statistics for all .pyc files (not only those in Lib/__pycache__). This includes encoding tables and tests. I also count memory usage for some types (for tuples, the shared size is an estimated upper limit).

type             count %          size   shared %

UNICODE         622812 58%    26105085 14885090 57%
TUPLE           224214 21%     8184848  3498300 43%
STRING           90992 8.4%    6931342   832224 12%
INT              52087 4.8%     715400    58666 8.2%
CODE             42147 3.9%    2865996        0 0%
NONE             39777 3.7%  
BINARY_FLOAT      3120 0.29% 
TRUE              2363 0.22% 
FALSE             1976 0.18% 
LONG              1012 0.094%
ELLIPSIS           528 0.049%
BINARY_COMPLEX     465 0.043%
FROZENSET           24 0.0022%

Total          1081517 100%   44802671 19274280 ~43%
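
Counts of this kind can be approximated by walking the objects reachable from an unmarshalled code object. The following is only a minimal sketch, not the script used to produce the table: the helper names `iter_objects` and `type_stats` are hypothetical, it counts objects rather than bytes, and it treats equal values as shareable, so the repeat counts are an upper estimate.

```python
import marshal
import types
from collections import Counter


def iter_objects(obj):
    """Yield obj and the objects nested beneath it (a rough model of what marshal writes)."""
    yield obj
    if isinstance(obj, types.CodeType):
        # Fields of a code object that hold other marshalled objects.
        # co_code and the line table are deliberately ignored here.
        for field in (obj.co_consts, obj.co_names, obj.co_varnames,
                      obj.co_freevars, obj.co_cellvars,
                      obj.co_filename, obj.co_name):
            yield from iter_objects(field)
    elif isinstance(obj, (tuple, frozenset)):
        for item in obj:
            yield from iter_objects(item)


def type_stats(code):
    """Return (objects per type, repeated equal values per type)."""
    counts = Counter()
    repeats = Counter()
    seen = set()
    for obj in iter_objects(code):
        name = type(obj).__name__
        counts[name] += 1
        # Only these types are checked for repeats; comparison is by
        # equality, so this overestimates what could really be shared.
        if isinstance(obj, (str, bytes, tuple)):
            key = (name, obj)
            if key in seen:
                repeats[name] += 1
            seen.add(key)
    return counts, repeats


source = "a = 'spam'\ndef f():\n    return 'spam', 'eggs'\n"
code = compile(source, "<example>", "exec")
# Round-trip through marshal, as loading a .pyc file would do.
code = marshal.loads(marshal.dumps(code))
counts, repeats = type_stats(code)
print(counts)
print(repeats)
```

Even in this tiny module the file name and the string 'spam' are reached more than once, which is the kind of repetition the "shared" column measures.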


Therefore it makes sense to share unicode objects, tuples, and maybe bytes objects. Most integers (in the range -5..256) are already interned. None of the code objects can be shared (because a code object contains an almost unique first line number). Sharing floats, complex numbers, and frozensets is unlikely to save much memory.
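
The point about code objects can be illustrated with a small sketch. It assumes CPython, where code object equality takes `co_firstlineno` into account; the source string and variable names are made up for the example.

```python
import types

# Two textually identical function definitions on different lines.
source = "def f():\n    return 1\n\ndef f():\n    return 1\n"
module = compile(source, "<example>", "exec")

# Pick the nested code objects out of the module's constants.
funcs = [c for c in module.co_consts if isinstance(c, types.CodeType)]
first, second = funcs

print(first.co_firstlineno, second.co_firstlineno)
print(first == second)
```

The two code objects differ in their first line number, so they do not compare equal and cannot be collapsed into a single shared object.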