Author Eric.Wolf
Recipients Eric.Wolf, niemeyer, wrobell
Date 2011-03-01.01:25:33
I'm experiencing the same thing. My script works perfectly on a 165MB file but fails after reading 900,000 bytes on a 22GB file.

My script uses buffered reads and is agnostic about line endings. Opening the file with "rb" does not help. The script is specifically written to avoid reading too much into memory at once.
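
For illustration, here is a minimal sketch of the kind of chunked read described above. It is not the original script: the function name `count_decompressed_bytes` and the 1 MiB `CHUNK_SIZE` are hypothetical, and it assumes the file is opened with the standard `bz2.BZ2File`.

```python
import bz2

# Hypothetical buffer size; the original script's value is not known.
CHUNK_SIZE = 1024 * 1024


def count_decompressed_bytes(path, chunk_size=CHUNK_SIZE):
    """Read a .bz2 file in fixed-size chunks; return the decompressed byte count."""
    total = 0
    f = bz2.BZ2File(path, "rb")
    try:
        while True:
            # At most chunk_size decompressed bytes are held in memory at once.
            chunk = f.read(chunk_size)
            if not chunk:  # an empty result means the reader reports end of data
                break
            total += len(chunk)
    finally:
        f.close()
    return total
```

Comparing the returned total with the known uncompressed size is one way to show how far the reader gets before it stops.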

I have tested this script on:
Python 2.5.1 (r251:54863) (ESRI ArcGIS version) (WinXP 64-bit)
Python 2.7.1 (r271:86832) (64-bit ActiveState version) (WinXP 64-bit)
Python 2.6.4 (r264:75706) (Ubuntu 9.10 64-bit)

Check here for some really big BZ2 files:
Date                 User       Action  Args
2011-03-01 01:25:36  Eric.Wolf  set     recipients: + Eric.Wolf, niemeyer, wrobell
2011-03-01 01:25:35  Eric.Wolf  set     messageid: <>
2011-03-01 01:25:35  Eric.Wolf  link    issue10900 messages
2011-03-01 01:25:35  Eric.Wolf  create