
classification
Title: Memory Error
Type: compile error
Stage:
Components: None
Versions: Python 2.6

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: freakcycle, mark.dickinson
Priority: normal
Keywords:

Created on 2010-07-06 14:14 by freakcycle, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (3)
msg109392 - Author: Peter Wolf (freakcycle) Date: 2010-07-06 14:14
I am using Ubuntu 10.04 (32-bit) and Python 2.6. When I type the following line in a terminal:

>python mydatafile.py

I get the following error message on the next line:

MemoryError

That is all.


File details:

It is a 2D list of floating-point numbers, 86 MB in size.
Here is the start ->

mydata=[[1.51386,1.51399,1.51386,1.51399],
[1.51386,1.51401,1.51401,1.51386],
[1.51391,1.51406,1.51395,1.51401],
[1.51392,1.514,1.51397,1.51395],
[1.51377,1.5142,1.51387,1.51397],

Here is the end ->

[1.5631,1.5633,1.5631,1.5631],
[1.5631,1.5632,1.5631,1.5631],
[1.5631,1.5633,1.5631,1.5631],
[1.563,1.5631,1.5631,1.5631]]


I will add that a file of exactly the same type, but 49 MB in size, compiled with 1 GB of RAM, although there was a lot of disk activity and the CPU seemed to be working very hard. The 86 MB file produced the above error. I upgraded to 3.4 GB of RAM and still get the same error.
msg109394 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-06 14:30
Thanks for the extra information; that helps a lot.

I think this is expected behaviour: Python really does need that much memory to parse the file (as a Python file). Partly this is because Python objects actually do take up a fair amount of space: a length-4 list of floats on my (64-bit) machine takes 200 bytes, though on a 32-bit machine this number should be a bit smaller. But partly it's that the compilation stage itself uses a lot of memory: for example, each of the floats in your input gets put into a dict during compilation; this dict is used to recognize multiple references to the same float, so that only one float object needs to be created for each distinct float value. And those dicts are going to get pretty big.
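
As a rough way to check the per-row figure on your own machine, something like the sketch below adds up the size of one list and the float objects it holds. It assumes CPython and uses `sys.getsizeof`, which only reports the size of the object itself, so the floats are counted separately. The variable names are illustrative, and the exact numbers depend on the Python version and on whether the build is 32-bit or 64-bit.

```python
import sys

row = [1.51386, 1.51399, 1.51386, 1.51399]

# getsizeof(row) covers the list header and its pointer slots only,
# not the float objects the list refers to.
list_bytes = sys.getsizeof(row)

# Count each distinct float object once; identical constants in one
# source file may be shared, as described above.
distinct_floats = {id(x): x for x in row}.values()
float_bytes = sum(sys.getsizeof(x) for x in distinct_floats)

total_bytes = list_bytes + float_bytes

print("list object:   %d bytes" % list_bytes)
print("float objects: %d bytes" % float_bytes)
print("total per row: %d bytes" % total_bytes)
```

Multiplying the per-row total by the number of rows gives a lower bound on the memory needed for the finished list, before any of the compiler's own bookkeeping is counted.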

I don't think that storing huge amounts of data in a .py file like this is usual practice, so I'm not particularly worried that importing a huge .py file can cause a MemoryError.

For your case, I'd suggest parsing your datafile manually:  reading the file line by line from within Python.
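
A minimal sketch of what that line-by-line parsing could look like, assuming the file keeps exactly the layout shown in the report (one row per line, a `mydata=` prefix on the first line, square brackets and trailing commas around each row). The helper names `parse_rows` and `load_data` are made up for this example.

```python
def parse_rows(lines):
    """Yield one list of floats per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        # Drop the assignment prefix that appears on the first line.
        if line.startswith("mydata="):
            line = line[len("mydata="):]
        # Remove the list brackets and the comma that separates rows.
        line = line.strip("[], \t")
        if not line:
            continue
        yield [float(field) for field in line.split(",")]


def load_data(path):
    """Read the whole data file into a list of rows (hypothetical helper)."""
    with open(path) as f:
        return list(parse_rows(f))


# Small in-memory demonstration using two lines from the report.
sample = [
    "mydata=[[1.51386,1.51399,1.51386,1.51399],\n",
    "[1.563,1.5631,1.5631,1.5631]]\n",
]
rows = list(parse_rows(sample))
print(rows)
```

Because the file is read as text rather than compiled as Python source, the compiler never has to hold the whole literal at once; only the resulting list of floats stays in memory.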

Suggest closing this issue as "won't fix".
msg109395 - Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2010-07-06 14:51
Just an additional note:  have you considered using the pickle or json modules?
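
For illustration, here is a small sketch of storing the same kind of data with `json` and with `pickle` and reading it back. The file names and the temporary directory are assumptions made for the example, and the two-row list stands in for the real data.

```python
import json
import os
import pickle
import tempfile

# Two rows from the report stand in for the full data set.
mydata = [
    [1.51386, 1.51399, 1.51386, 1.51399],
    [1.563, 1.5631, 1.5631, 1.5631],
]

# Hypothetical file locations, used only for this demonstration.
tmpdir = tempfile.mkdtemp()
json_path = os.path.join(tmpdir, "mydata.json")
pickle_path = os.path.join(tmpdir, "mydata.pickle")

# JSON: a text format that other tools can also read.
with open(json_path, "w") as f:
    json.dump(mydata, f)
with open(json_path) as f:
    from_json = json.load(f)

# pickle: a Python-specific binary format; only load pickles you trust.
with open(pickle_path, "wb") as f:
    pickle.dump(mydata, f, pickle.HIGHEST_PROTOCOL)
with open(pickle_path, "rb") as f:
    from_pickle = pickle.load(f)

print(from_json == mydata, from_pickle == mydata)
```

Either way the data is loaded at run time instead of being compiled as source code, although the loaded list itself still has to fit in memory.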
History
Date                 User            Action  Args
2022-04-11 14:57:03  admin           set     github: 53426
2010-08-13 20:08:28  mark.dickinson  set     status: pending -> closed
2010-07-06 14:51:18  mark.dickinson  set     status: open -> pending; resolution: wont fix; messages: + msg109395
2010-07-06 14:30:16  mark.dickinson  set     nosy: + mark.dickinson; messages: + msg109394
2010-07-06 14:14:52  freakcycle      create