Message55785
Error reading >4 GB files under Windows
Try this:
import sys
print(sys.version_info)
import time
print (time.strftime('%Y-%m-%d %H:%M:%S'))
liste=[]
start = time.time()
# Write 85014961 lines, each padded with 59 spaces, to get a >4 GB file
fichout=open('test.txt','w')
for i in xrange(85014961):
    if i%5000000==0 and i>0:
        print (i,time.time()-start)
    fichout.write(str(i)+' '*59+'\n')
fichout.close()
print ('total lines written ',i)
print (i,time.time()-start)
print ('*'*50)
# Read the file back line by line and count the lines
fichin=open('test.txt')
start3 = time.time()
for i,li in enumerate(fichin):
    if i%5000000==0 and i>0:
        print (i,time.time()-start3)
fichin.close()
print ('total lines read ',i)
print(time.time()-start)
It generates a >4 GB file, and not all lines are read!
Example output:
('total lines written ', 85014960)
('total lines read ', 85014950)
10 lines are missing
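One way to check whether the lines are missing from the file on disk or are lost while reading is to count newline bytes in binary mode, which bypasses text-mode line iteration. The following is only a minimal sketch of such a cross-check, not a fix; the helper name `count_newlines` and the chunk size are assumptions made for illustration.

```python
def count_newlines(path, chunk_size=1 << 20):
    """Count b'\\n' bytes in a file by reading fixed-size binary chunks.

    Hypothetical helper for cross-checking a line count; it never builds
    line objects, so it does not depend on text-mode line iteration.
    """
    count = 0
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # An empty read means end of file
                break
            count += chunk.count(b'\n')
    return count
```

If `count_newlines('test.txt')` reports the number of lines written while the text-mode loop reports fewer, the file itself is complete and the loss happens during reading.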
If you replace the write call with
fichout.write(str(i)+' '*59+'\n')
the file is now under 4 GB and is read properly.
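For reference, the size of the file written by the script at the top of this message (59 spaces of padding per line) can be estimated without generating it. This is a small arithmetic sketch; the function names are made up for illustration, and `newline_len=2` assumes that text mode on Windows writes each '\n' as '\r\n'.

```python
def total_digits(n):
    """Return the sum of len(str(i)) for i in range(n), in closed form."""
    total = 0
    width = 1
    start = 0  # first number with `width` digits (0 counts as one digit)
    while start < n:
        end = min(n, 10 ** width)  # exclusive upper bound for this width
        total += (end - start) * width
        start = end
        width += 1
    return total


def expected_size(n_lines, pad, newline_len):
    """Bytes written by n_lines lines of str(i) + pad spaces + newline."""
    return total_digits(n_lines) + n_lines * (pad + newline_len)


FOUR_GIB = 2 ** 32

# Original script: 85014961 lines, 59 spaces, CRLF newlines on Windows
size = expected_size(85014961, 59, 2)
print(size, size > FOUR_GIB)
```

Passing a different `pad` value to `expected_size` shows how much padding keeps the output below the 2**32-byte mark.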
Tested on both 32-bit and 64-bit Windows XP machines.
It seems to work on Linux and BSD (I did not try this example there, but had no
problems with my home-made big files).
Problem: there are many examples of >4 GB files for the human genome and other
biological applications. I am almost sure that people are making mistakes,
because it took me a while to discover this...
Note: this does not happen with py3k :-)
| Date | User | Action | Args |
| 2007-09-10 15:52:42 | Richard.Christen@unice.fr | set | spambayes_score: 0.0019012 -> 0.0019012 recipients: + Richard.Christen@unice.fr |
| 2007-09-10 15:52:42 | Richard.Christen@unice.fr | set | spambayes_score: 0.0019012 -> 0.0019012 messageid: <1189439562.54.0.204300359031.issue1142@psf.upfronthosting.co.za> |
| 2007-09-10 15:52:42 | Richard.Christen@unice.fr | link | issue1142 messages |
| 2007-09-10 15:52:41 | Richard.Christen@unice.fr | create | |