msg85286 - (view) |
Author: Robert Xiao (nneonneo) * |
Date: 2009-04-03 01:24 |
(tested and verified on Windows and Solaris SPARC)
Running this code in Python 2.4, 2.5 or 2.6 (all minor versions)
produces garbage.
f=open("anyfile","w")
f.write("garbage")
f.readline()
Mac OS X and Linux appear to simply throw an "IOError: [Errno 9] Bad
file descriptor" exception, while using a read method without writing
first produces the same exception on Windows and certain versions under
Solaris.
Under Solaris, it is further possible to segfault the interpreter with
this code:
f=open("anyfile","w")
f.read()
In the former case, it appears as if the data is simply read from the
disk block containing the file. In the latter, I have no clue what is
going on.
In Python 3.0, file objects opened with "w" don't even support any .read
methods, so this does not affect Py3k.
|
msg85287 - (view) |
Author: Robert Xiao (nneonneo) * |
Date: 2009-04-03 01:34 |
OK, it's not a memory leak, rather, it's something leaking from the
files. However, this still winds up filling the affected file full of
garbage, which is quite undesirable (and, I think, quite likely to be
unwanted behaviour, considering that *some* read methods throw IOError 9
and some others don't).
|
msg98577 - (view) |
Author: pmp-p (pmpp) * |
Date: 2010-01-30 22:24 |
related:
f=open('test','w+b')
f.write('123456789ABCDEF')
#f.seek(0)
print "position",f.tell()
print '[',len(f.read()),']'
f.close()
windows: 2.6.2 mingw/ 2.6.4 msvc builds
len(f.read()) > 0 ( 4081 when i did the test ) but should be 0
printing the buffer show binary garbage.
linux is ok f.read() return '' as expected
|
msg98582 - (view) |
Author: Robert Xiao (nneonneo) * |
Date: 2010-01-30 23:40 |
It seems like this is actually a problem in Windows libc or something (tested using MinGW on Windows XP):
#include <stdio.h>
main() {
FILE *f = fopen("test", "wb");
fwrite("test", 1, 4, f);
char buf[2048];
size_t k = fread(buf, 1, 2048, f);
printf("%d\n", k);
int i=0;
for(; i<k; i++) printf("%02x", buf[i]);
}
This causes a lot of garbage to be printed out. Removing the fwrite causes "0" to be printed with no further output.
The garbage is not from the uninitialized buffer, since I've verified that the original contents of the buffer are not what is being printed out. Furthermore, adjusting 2048 produces a maximum output of 4092 bytes (even with 9999 in place of 2048).
Short of simply denying read() on writable files, I don't really see an easy way around this libc bug.
|
msg98613 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-01-31 13:17 |
I wonder if this is technically a bug. The stream is not opened for reading and yet you do an fread. I quickly glanced through the
C-Standard and I did not find an _explicit_ paragraph that this is
undefined behavior, but personally I'd expect it to be.
Regarding the last C example. Also here, you do not open using "w+".
Now, even if you used "w+", by the standard you'd have to do an fflush,
fseek, fsetpos or rewind before reading. I don't see a libc bug.
|
msg98617 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-01-31 15:22 |
Bug or not, this can be handled since fread sets EBADF on Windows.
|
msg98619 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-01-31 16:18 |
The patch should come with a test.
|
msg98663 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-02-01 14:15 |
I see that a complete errno based solution would get messy. To avoid
interfering with EAGAIN special cases, this would be needed:
#if defined(EBADF)
#define ERRNO_EBADF(x) ((x) == EBADF)
#else
#define ERRNO_EBADF(x) 0
#endif
Then, additional checks would need to be added to get_line,
getline_via_fgets and a couple of other places.
I think the right thing is to add a field f_modeflags to the file
object, to be initialized on creation. Or use f->readable, f->writable
like py3k. Thoughts?
|
msg98664 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-02-01 14:37 |
I think checking errno would be fine (provided it eliminates all variations ot this bug).
|
msg98740 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-02-02 14:41 |
An errno+ferror approach works except for two cases:
(1) MS fgets() does not consider reading from a write-only stream
an error. While annoying, I think this is standard conforming.
(2) OpenSolaris sets errno unreliably after getc.
I do not know how to get around those without using OS specifics.
|
msg98741 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-02-02 14:51 |
> An errno+ferror approach works except for two cases:
>
> (1) MS fgets() does not consider reading from a write-only stream
> an error. While annoying, I think this is standard conforming.
>
> (2) OpenSolaris sets errno unreliably after getc.
>
>
> I do not know how to get around those without using OS specifics.
Ok, then perhaps it's better to have some internal {readable, writable}
flags based on the original mode string.
|
msg98782 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-02-03 13:52 |
The new patch takes over the logic from fileio.c. Tested on Linux, Windows x86/x64 and OpenSolaris.
|
msg98831 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-02-04 12:41 |
Hmm, test_file.py is for the new IO library (backported from 3.x). You should use test_file2k.py for tests against the 2.x file object.
Also, readinto() should take something such as a bytearray() as argument, not a string.
|
msg98870 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-02-05 13:31 |
Thanks Antoine, I fixed both issues.
|
msg98871 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-02-05 13:55 |
Oh and the following line shouldn't be needed:
data = b'xxx' if 'b' in mode else u'xxx'
Since old-style file objects always take bytestrings (it's harmless, though, because the unicode string will be implicitly converted).
|
msg98874 - (view) |
Author: Stefan Krah (skrah) * |
Date: 2010-02-05 14:35 |
Ok, I'll fix that. Perhaps I should also remove the comment in err_mode(). I wonder if the ValueError in fileio.c is intentional:
>>> import io
>>> f = io.open("testfile", "w")
>>> f.write("xxx")
3
>>> f.read()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: not readable
>>>
>>> f = io.FileIO("testfile", "w")
>>> f.write("xxx")
3
>>> f.read()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: File not open for reading
|
msg98875 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-02-05 14:48 |
> I wonder if the ValueError in fileio.c is intentional:
I don't know, but this would be the subject of a separate issue anyway.
|
msg98885 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2010-02-05 17:13 |
The patch was committed in r77989 (trunk) and r77990 (2.6). Thank you Stefan!
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:47 | admin | set | github: 49927 |
2010-02-05 17:13:40 | pitrou | set | status: open -> closed resolution: fixed messages:
+ msg98885
stage: test needed -> resolved |
2010-02-05 16:38:23 | skrah | set | files:
+ add-readable-writable3.patch |
2010-02-05 14:48:00 | pitrou | set | messages:
+ msg98875 |
2010-02-05 14:35:22 | skrah | set | messages:
+ msg98874 |
2010-02-05 13:55:41 | pitrou | set | messages:
+ msg98871 |
2010-02-05 13:31:11 | skrah | set | files:
+ add-readable-writable2.patch
messages:
+ msg98870 |
2010-02-04 12:41:23 | pitrou | set | messages:
+ msg98831 versions:
+ Python 2.7 |
2010-02-04 12:29:41 | pitrou | set | nosy:
+ loewis
|
2010-02-03 13:52:16 | skrah | set | files:
+ add-readable-writable.patch
messages:
+ msg98782 |
2010-02-02 14:51:49 | pitrou | set | messages:
+ msg98741 |
2010-02-02 14:41:41 | skrah | set | files:
+ errno-ebadf-tests.patch |
2010-02-02 14:41:17 | skrah | set | files:
+ errno-ebadf.patch
messages:
+ msg98740 |
2010-02-01 14:37:12 | pitrou | set | messages:
+ msg98664 |
2010-02-01 14:15:57 | skrah | set | messages:
+ msg98663 |
2010-01-31 16:18:50 | pitrou | set | messages:
+ msg98619 |
2010-01-31 15:22:30 | skrah | set | files:
+ fileread-errno.patch keywords:
+ patch messages:
+ msg98617
|
2010-01-31 13:17:38 | skrah | set | nosy:
+ skrah messages:
+ msg98613
|
2010-01-30 23:40:20 | nneonneo | set | messages:
+ msg98582 |
2010-01-30 22:24:46 | pmpp | set | nosy:
+ pmpp
messages:
+ msg98577 versions:
- Python 2.5 |
2009-05-16 01:13:42 | ajaksu2 | set | versions:
- Python 2.4 nosy:
+ pitrou, benjamin.peterson
priority: high components:
+ IO stage: test needed |
2009-04-03 01:34:48 | nneonneo | set | messages:
+ msg85287 |
2009-04-03 01:24:42 | nneonneo | create | |