
Author Eric.Wolf
Recipients Eric.Wolf, niemeyer, wrobell
Date 2011-03-01.01:25:33
Message-id <1298942735.99.0.159879008174.issue10900@psf.upfronthosting.co.za>
In-reply-to
Content
I'm experiencing the same thing. My script works perfectly on a 165MB file but fails after reading 900,000 bytes on a 22GB file.

My script reads the data with buffered bz2file.read calls and does not depend on end-of-line handling. Opening the file with "rb" does not help. The script is specifically written to avoid reading too much into memory at once.
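For reference, the kind of chunked read described above can be sketched roughly as follows. This is not the actual script from this report: the function names `count_decompressed_bytes` and `make_sample` and the 64 KiB chunk size are assumptions made for illustration. Comparing the returned total against the known uncompressed size is one way to notice a read that stops early.

```python
import bz2
import os
import tempfile


def count_decompressed_bytes(path, chunk_size=64 * 1024):
    """Read a bz2 file in fixed-size chunks; return total decompressed bytes.

    Hypothetical helper: only one chunk is held in memory at a time.
    """
    total = 0
    f = bz2.BZ2File(path, "rb")
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # An empty result means the reader considers the data finished.
                break
            total += len(chunk)
    finally:
        f.close()
    return total


def make_sample(payload):
    """Hypothetical helper: write payload to a temporary .bz2 file."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    with os.fdopen(fd, "wb") as out:
        out.write(bz2.compress(payload))
    return path
```

With a small sample file the total should match the payload length; on a file that triggers the problem, the total would fall well short of the real uncompressed size.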

I have tested this script on:
Python 2.5.1 (r251:54863) (ESRI ArcGIS version) (WinXP 64-bit)
Python 2.7.1.4 (r271:86832) (64-bit ActiveState version) (WinXP 64-bit)
Python 2.6.4 (r264:75706) (Ubuntu 9.10 64-bit)

Check here for some really big BZ2 files:

http://planet.openstreetmap.org/full-experimental/
History
Date User Action Args
2011-03-01 01:25:36 Eric.Wolf set recipients: + Eric.Wolf, niemeyer, wrobell
2011-03-01 01:25:35 Eric.Wolf set messageid: <1298942735.99.0.159879008174.issue10900@psf.upfronthosting.co.za>
2011-03-01 01:25:35 Eric.Wolf link issue10900 messages
2011-03-01 01:25:35 Eric.Wolf create