
Author goddard
Recipients goddard
Date 2009-03-24.19:48:41
SpamBayes Score 1.3535686e-07
Marked as misclassified No
Message-id <1237924124.82.0.478709056921.issue5557@psf.upfronthosting.co.za>
In-reply-to
Content
Bytecode compiling large Python files uses an unexpectedly large amount
of memory.  For example, compiling a file containing a list of 5 million
integers uses about 2 Gbytes of memory while the Python file size is
about 40 Mbytes.  The memory used is 50 times the file size.  The
resulting list in Python consumes about 400 Mbytes of memory, so
compiling the byte codes uses about 5 times the memory of the list
object.  Can the byte-code compilation be made more memory efficient?
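
For reference, the memory held by the list itself can be estimated by
comparing the process's peak resident size before and after building it.
A rough sketch using the standard resource module (ru_maxrss is reported
in bytes on Mac OS X and in kilobytes on Linux):

import resource

before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
x = list(range(5000000))    # build the 5-million-integer list
after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('peak RSS grew by about %d' % (after - before))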

The application that creates similarly large Python files is a
molecular graphics program called UCSF Chimera that my lab develops.  It
writes session files which are Python code.  Chimera session files of a
size that is reasonable for a machine's physical memory cannot be
byte-compiled without thrashing, which cripples the interactivity of all
software running on that machine.

Here is Python code to produce the test file test.py containing a list
of 5 million integers:

print >>open('test.py','w'), 'x = ', repr(range(5000000))
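
The one-liner above uses Python 2 print syntax.  An equivalent that also
runs under Python 3 would be, for example:

open('test.py', 'w').write('x = ' + repr(list(range(5000000))) + '\n')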

I tried importing the test.py file with Python 2.5, 2.6.1 and 3.0.1 on
Mac OS 10.5.6.  In each case, when the test.pyc file was not present,
the python process, as monitored by the unix "top" command, took about
1.7 Gb RSS and 2.2 Gb VSZ on a MacBook Pro which has 2 Gb of memory.
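
The compilation can also be reproduced without importing, by
byte-compiling the file directly; as a rough alternative to watching
top, the peak resident size can be printed afterwards with the standard
py_compile and resource modules:

import py_compile
import resource

py_compile.compile('test.py')   # byte-compile to test.pyc without importing
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print('peak RSS: %d (bytes on Mac OS X, Kbytes on Linux)' % peak)
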
History
Date                 User     Action  Args
2009-03-24 19:48:44  goddard  set     recipients: + goddard
2009-03-24 19:48:44  goddard  set     messageid: <1237924124.82.0.478709056921.issue5557@psf.upfronthosting.co.za>
2009-03-24 19:48:43  goddard  link    issue5557 messages
2009-03-24 19:48:41  goddard  create