Message120385
It turns out this isn't just a problem with array.array. It's a problem with Python's file.write() as well. Here's my test code:
# file.write() test:
FOURGBMINUS = 2**32 - 16
s = '0123456789012345' # 16 bytes
longs = ''.join([s for i in xrange(FOURGBMINUS//len(s))])
assert len(longs) == FOURGBMINUS
f = open('test.txt', 'w')
f.write(longs) # completes successfully
f.close()
FOURGB = 2**32
s = '0123456789012345' # 16 bytes
longs = ''.join([s for i in xrange(FOURGB//len(s))])
assert len(longs) == FOURGB
f = open('test.txt', 'w')
f.write(longs) # hangs with 100% CPU, file is 0 bytes
f.close()
SIXGB = 2**32 + 2**31
s = '0123456789012345' # 16 bytes
longs = ''.join([s for i in xrange(SIXGB//len(s))])
assert len(longs) == SIXGB
f = open('test.txt', 'w')
f.write(longs) # hangs with 100% CPU, file is 2**31 bytes
f.close()
# file.read test:
TWOGB = 2**31
TWOGBPLUS = TWOGB + 16
s = '0123456789012345' # 16 bytes
longs = ''.join([s for i in xrange(TWOGBPLUS//len(s))])
assert len(longs) == TWOGBPLUS
f = open('test.txt', 'w')
f.write(longs) # completes successfully
f.close()
f = open('test.txt', 'r')
longs = f.read() # works, but takes >30 min, memory usage keeps jumping around
f.close()
del longs
# maybe f.read() reads 1 char at a time until it hits EOF. try this instead:
f = open('test.txt', 'r')
longs = f.read(TWOGBPLUS) # OverflowError: long int too large to convert to int
longs = f.read(TWOGB) # OverflowError: long int too large to convert to int
longs = f.read(TWOGB - 1) # works, takes only seconds
f.close()
So I guess that on Windows (I've only tested 64-bit Windows 7 with Python 2.6.6 amd64), file.write() should call fwrite multiple times, in chunks no larger than 2**31 bytes or so. Also, calling f.read(nbytes) with nbytes >= 2**31 raises "OverflowError: long int too large to convert to int". I don't have either of these problems in 64-bit Linux (Ubuntu 10.10) on the same machine (i7, 12GB).
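As a user-level workaround until file.write() itself is fixed, the chunking idea can be sketched roughly like this. The helper names (write_in_chunks, read_in_chunks) and the 1 GiB CHUNK_SIZE are my own assumptions, not anything in Python's file API, and I have not run this against the multi-GB cases on Windows; the demo only does a small round trip:

```python
import os
import tempfile

# Assumed chunk size: 1 GiB, comfortably below the 2**31 limit observed above.
CHUNK_SIZE = 2**30


def write_in_chunks(f, data, chunk_size=CHUNK_SIZE):
    """Write data to the open file f in pieces of at most chunk_size bytes."""
    for start in range(0, len(data), chunk_size):
        f.write(data[start:start + chunk_size])


def read_in_chunks(f, nbytes, chunk_size=CHUNK_SIZE):
    """Read up to nbytes from f, never requesting more than chunk_size at once."""
    pieces = []
    remaining = nbytes
    while remaining > 0:
        piece = f.read(min(remaining, chunk_size))
        if not piece:  # EOF reached before nbytes were read
            break
        pieces.append(piece)
        remaining -= len(piece)
    return b''.join(pieces)


def demo():
    """Small round trip with a tiny chunk size, to exercise the chunk boundaries."""
    data = b'0123456789012345' * 4  # 64 bytes
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wb') as f:
            write_in_chunks(f, data, chunk_size=10)
        with open(path, 'rb') as f:
            result = read_in_chunks(f, len(data), chunk_size=10)
    finally:
        os.remove(path)
    return result == data
```

The files are opened in binary mode here so that no newline translation gets in the way on Windows. Each slice copies up to chunk_size bytes, so peak memory is the original string plus one chunk.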
Date | User | Action | Args
2010-11-04 08:49:52 | mspacek | set | recipients: + mspacek, cgohlke, Bill.Steinmetz
2010-11-04 08:49:52 | mspacek | set | messageid: <1288860592.3.0.670484449031.issue9015@psf.upfronthosting.co.za>
2010-11-04 08:49:50 | mspacek | link | issue9015 messages
2010-11-04 08:49:49 | mspacek | create |