Message91211
The new io.BufferedRandom implementation in Python 3.1 has a broken seek:
it does not properly handle the case where the target of the seek lies
inside the data currently held in the file buffer. The seek leaves the
file object in a confused state, such that the next write operation lands
after the end of the buffered data(!) instead of at the specified target.
I could reproduce the following symptoms on both Debian Lenny and Mac OS
X Leopard. I downloaded the Python 3.1 tarball from python.org, and
built it by hand using './configure && make'.
$ ./python.exe
Python 3.1 (r31:73572, Aug 3 2009, 02:32:10)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> open("test", "wb").write(b"A" * 10000)
10000
>>> file = open("test", "rb+")
>>> file.read(10) # Reads 4096 bytes into file buffer
b'AAAAAAAAAA'
>>> file.tell()
10
>>> file.seek(0)
0
>>> file.tell()
0
>>> file.write(b"B" * 10000) # This should overwrite the whole file
10000
>>> file.tell()
14096 # Hmm, 0 + 10000 == 14096?
>>> file.close()
>>> d = open("test", "rb").read()
>>> len(d)
14096 # ?!
>>> d[0:10] # The file should now consist of 10000 Bs...
b'AAAAAAAAAA'
>>> d[4090:4100]
b'AAAAAABBBB' # ... but the Bs only start after a buffer's worth of As.
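For regression testing, the session above can be condensed into a standalone script. This is only a sketch: the function name `check_seek_inside_buffer` and the use of a temporary file are illustrative choices, not part of the original session. On an unaffected interpreter the position and file length should both come out as 10000; on an affected 3.1 build the session above suggests 14096 for both.

```python
import os
import tempfile


def check_seek_inside_buffer(size=10000):
    """Replay the session above.

    Returns (position_after_write, file_length, file_contains_only_Bs).
    The function name and the temporary file are illustrative choices.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"A" * size)
        with open(path, "rb+") as f:
            f.read(10)             # pulls a buffer's worth of data into memory
            f.seek(0)              # target lies inside the buffered data
            f.write(b"B" * size)   # should overwrite the whole file
            pos = f.tell()
        with open(path, "rb") as f:
            data = f.read()
        return pos, len(data), data == b"B" * size
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(check_seek_inside_buffer())
```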
This bug has actually caused me some subtle, silent data corruption that
went undetected for quite a while. Hurray for backups!
The above code works fine in Python 3.0, and the equivalent code ported
to Python 2.5 also produces correct results.
A workaround for 3.1 is to call flush before every seek.
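A minimal sketch of that workaround, assuming a small helper is used in place of direct seek calls. The names `flush_and_seek` and `overwrite_from_start` are hypothetical; the sketch relies only on the standard `flush()` and `seek()` methods of buffered file objects.

```python
import os


def flush_and_seek(f, offset, whence=os.SEEK_SET):
    """Flush the file's buffer, then seek; returns the new absolute position.

    Hypothetical helper illustrating the flush-before-seek workaround.
    """
    f.flush()
    return f.seek(offset, whence)


def overwrite_from_start(path, payload):
    """Same read/seek/write sequence as in the report, using the helper."""
    with open(path, "rb+") as f:
        f.read(10)             # fill the buffer, as in the report
        flush_and_seek(f, 0)   # flush first, then seek into the buffered range
        f.write(payload)
        return f.tell()
```

Routing every seek through one helper keeps the workaround in a single place, so it can be dropped once a fixed interpreter is in use.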
Date | User | Action | Args
2009-08-03 02:00:41 | lorentey | set | recipients: + lorentey
2009-08-03 02:00:41 | lorentey | set | messageid: <1249264841.18.0.232875994092.issue6629@psf.upfronthosting.co.za>
2009-08-03 02:00:39 | lorentey | link | issue6629 messages
2009-08-03 02:00:38 | lorentey | create |