Issue1706039
Created on 2007-04-23 17:21 by josm, last changed 2009-05-15 12:55 by pitrou.
| File name |
Uploaded |
Description |
Edit |
Remove |
|
fileobject.diff
|
josm,
2007-04-23 17:21
|
Objects/fileobject.c patch against trunk |
|
|
|
eof.diff
|
josm,
2007-04-28 17:03
|
revised patch and its test cases |
|
|
|
eof2.diff
|
josm,
2007-05-03 10:03
|
Yet another patch |
|
|
|
fileobject.diff
|
scott.dial,
2008-12-22 14:42
|
Patch against the r67739 on release25-maint |
|
|
|
issue1706039.diff
|
belopolsky,
2008-12-23 15:47
|
|
|
|
|
msg52505 - (view) |
Author: John Smith (josm) |
Date: 2007-04-23 17:21 |
|
This patch is to fix bug 1523853.
Now file.read() works as written in the doc, which says
"An empty string is returned when EOF is encountered immediately."
Before this fix, An empty string was returned even when the last
read() encountered EOF.
|
|
msg52506 - (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2007-04-23 20:22 |
|
I was not able to reproduce the problem. I tried to read through the end of the file, append to the file from an external process and read again. It worked as expected: more data was read.
The proposed patch seems to be redundant: once ferror(f->f_fp) is known to be 0, clearerr will do nothing and in the alternative branch clearerr is already there.
|
|
msg52507 - (view) |
Author: John Smith (josm) |
Date: 2007-04-23 23:27 |
|
I can easily reproduce this problem on OS X 10.4.9 using Python 2.4.3 and Python 2.5.
Here's the code
#################
import os, sys
filename = "spam"
f = open(filename, "w+")
f.seek(0, 2)
line = f.read() # EOF
# Writing the same file using another fd
f2 = open(filename, "a+")
f2.write("Spam")
f2.flush()
f2.close()
statinfo = os.stat(filename)
print "file size: %d" % statinfo.st_size
print "position : %d" % f.tell()
line = f.read() # read() returns emtpy!! readlines?() works ok
print "line : [%s]" % line
#################
On my machine, it outputs the following
###
file size: 4
position : 0
line : []
###
And just now I found that on ubuntu on the same machine (vmware),
it works collect.
###
file size: 4
position : 0
line : [Spam]
###
My patched version python outputs the same result as above on OS X.
We need clearerr() there, too because clearerr()'s job is not only only clearing error indicators
but EOF indicators also.
|
|
msg52508 - (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2007-04-24 04:28 |
|
Hmm. It looks like glibc's fread will attempt reading even if EOF flag is already set.
I've reproduced the problem on OS X 10.4.9 and Python 2.6a0 (trunk:54926M).
This patch fixes it.
|
|
msg52509 - (view) |
Author: John Smith (josm) |
Date: 2007-04-24 17:15 |
|
I confirmed that fread can read data from fp in EOF state on Linux. On OS X, freading from fp in EOF state doesn't work,
just return 0.
I tested this behavior using the following code.
----------------------------
#include <stdio.h>
#include <stdlib.h>
main(void)
{
FILE *fp1, *fp2;
int n;
char buf[128];
fp1 = fopen("spam.txt", "w+");
fputs("foo", fp1);
fseek(fp1, 0L, SEEK_END);
if (fread(buf, 1, sizeof(buf), fp1) == 0)
fprintf(stderr, "first fread failed\n");
fp2 = fopen("spam.txt", "a+");
fputs("bar", fp2);
fclose(fp2);
if (feof(fp1))
fprintf(stderr, "after appending some text, fp1 is still eof\n");
if (fread(buf, 1, sizeof(buf), fp1) == 0)
fprintf(stderr, "second fread failed\n");
printf("buf: %s\n", buf);
fclose(fp1);
return 0;
}
----------------------------
=============
On Linux
=============
first fread failed
after appending some text, fp1 is still eof
buf: bar
=============
On OS X
=============
first fread failed
after appending some text, fp1 is still eof
second fread failed
buf:
Anyway, I think it's safe and preferable to clear EOF indicator in this case.
|
|
msg52510 - (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2007-04-25 03:14 |
|
I have just verified that this issue affects readlines and readinto methods on Mac OS X as well. This tells me that rather than patching all the individual methods, maybe Py_UniversalNewlineFread should be fixed instead. I believe that the current situation where a function that looks like Python API reports errors by setting flags on a FILE object instead of settin the an exception is against python's design. I suggest changing Py_UniversalNewlineFgets so that if fread returns 0, it checks whether that was due to EOF or an error and sets an exception in a case of an error and always calls clearerr.
|
|
msg52511 - (view) |
Author: John Smith (josm) |
Date: 2007-04-28 17:03 |
|
New patch Attached.
I believe this one includes all belopolsky's suggestions.
Now all dirty jobs are done in Py_UniversalNewlineFread:
1. calls clearerr if ferror or feof
2. set IOErrorr exception and return
The caller just checks whether IOError is occurred or not
and does appropriate jobs.
The attachment also contains unittest for this problem.
File Added: eof.diff
|
|
msg52512 - (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2007-05-01 14:38 |
|
The latest patch works on Mac OS 10.4 . I would suggest a few cosmetic changes:
1. Put the flag checking logic in a separate function and call it instead of fread. This will eliminate code duplication.
2. "else" is redundant in
+ if (ferror(stream)) {
+ clearerr(stream);
+ PyErr_SetFromErrno(PyExc_IOError);
+ return 0;
+ } else if (feof(stream)) {
+ clearerr(stream);
+ }
The flow will be more obvious if you write
+ if (ferror(stream)) {
+ clearerr(stream);
+ PyErr_SetFromErrno(PyExc_IOError);
+ return 0;
+ }
+ if (feof(stream)) {
+ clearerr(stream);
+ }
3. Please mention bug 1523853 in a comment near the test.
otherwise, the patch looks fine.
|
|
msg52513 - (view) |
Author: John Smith (josm) |
Date: 2007-05-03 10:03 |
|
The flag checking is revised.
I removed feof() so the checking logic become shorter.
I didn't factor out that logic
because I think the logic is simple enough
Added a comment that mentions bug 1523853.
File Added: eof2.diff
|
|
msg52514 - (view) |
Author: John Smith (josm) |
Date: 2007-06-05 14:56 |
|
Could someone please review my patch and give me some feedback?
|
|
msg58043 - (view) |
Author: John Smith (josm) |
Date: 2007-12-01 12:19 |
|
Any chance to get this fix commmited in?
|
|
msg58134 - (view) |
Author: Raghuram Devarakonda (draghuram) |
Date: 2007-12-03 17:47 |
|
I get an error when I try to read 1523853. Is this not moved to the new
tracker or others are able to access it?
|
|
msg58137 - (view) |
Author: Martin v. Löwis (loewis) |
Date: 2007-12-03 18:05 |
|
Yes, a number of items were not moved, as SF failed to provide them on
export.
|
|
msg58181 - (view) |
Author: Martin v. Löwis (loewis) |
Date: 2007-12-04 09:02 |
|
jos, can you please provide a real name in "Your Details" of this
tracker? We cannot accept anonymous/pseudonymous patches.
|
|
msg58185 - (view) |
Author: John Smith (josm) |
Date: 2007-12-04 13:02 |
|
What's in a name? :p
Done, anyway.
|
|
msg76622 - (view) |
Author: John Smith (josm) |
Date: 2008-11-29 19:44 |
|
Hi, when will this patch be checkedin?
|
|
msg77748 - (view) |
Author: Martin v. Löwis (loewis) |
Date: 2008-12-13 15:37 |
|
Thanks for the patch. Committed into the 2.5 branch as r67740. This
still needs to be applied to the other branches.
|
|
msg78161 - (view) |
Author: Martin v. Löwis (loewis) |
Date: 2008-12-21 23:35 |
|
Unfortunately, the patch is broken. The program
fname='test123'
f=open(fname,'w')
f.read()
crashes with the patch applied.
I think I will revert the patch in 2.5.3, release 2.5.4, and reject the
patch.
|
|
msg78162 - (view) |
Author: Guilherme Polo (gpolo) |
Date: 2008-12-21 23:42 |
|
It isn't being careful when calling PyErr_SetFromErrno inside the
Py_UniversalNewlineFread function since this function is being called
all over fileobject after releasing the GIL.. so, isn't this just a
matter of adding pairs of PyGILState_Ensure/PyGILState_Release around
these calls to PyErr_SetFromErrno in this specific function ?
|
|
msg78163 - (view) |
Author: Martin v. Löwis (loewis) |
Date: 2008-12-21 23:52 |
|
> It isn't being careful when calling PyErr_SetFromErrno inside the
> Py_UniversalNewlineFread function since this function is being called
> all over fileobject after releasing the GIL.. so, isn't this just a
> matter of adding pairs of PyGILState_Ensure/PyGILState_Release around
> these calls to PyErr_SetFromErrno in this specific function ?
Perhaps that could fix this problem (or perhaps not - is
PyGILState_Ensure guaranteed to do the right thing, even in the
presence of multiple interpreters?)
My concern is that this patch may contain other bugs. I am apparently
incapable of reviewing it correctly, and nobody else has offered a
careful review.
|
|
msg78164 - (view) |
Author: Guilherme Polo (gpolo) |
Date: 2008-12-21 23:55 |
|
On Sun, Dec 21, 2008 at 9:52 PM, Martin v. Löwis <report@bugs.python.org> wrote:
>
> Martin v. Löwis <martin@v.loewis.de> added the comment:
>
>> It isn't being careful when calling PyErr_SetFromErrno inside the
>> Py_UniversalNewlineFread function since this function is being called
>> all over fileobject after releasing the GIL.. so, isn't this just a
>> matter of adding pairs of PyGILState_Ensure/PyGILState_Release around
>> these calls to PyErr_SetFromErrno in this specific function ?
>
> Perhaps that could fix this problem (or perhaps not - is
> PyGILState_Ensure guaranteed to do the right thing, even in the
> presence of multiple interpreters?)
It is said to be unsupported in that case, but I guess you knew that.
|
|
msg78187 - (view) |
Author: Scott Dial (scott.dial) |
Date: 2008-12-22 14:42 |
|
I believe the original patch is overreaching. I believe the changes to
fileobject.c should've been much simpler, just as the title describes
(adding clearerr() just before performing reads on the FILE handles).
I've attached a much simpler patch that I believe corrects the issue,
but I don't have an OS X platform to try it on.
|
|
msg78238 - (view) |
Author: Martin v. Löwis (loewis) |
Date: 2008-12-23 15:07 |
|
I have now reverted the patch in r67914. I won't reject the patch
because of Scott's alternative, but leave it open for review. Since
Scott's patch is not approved yet, this is not a release blocker anymore.
|
|
msg78239 - (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2008-12-23 15:47 |
|
I've combined Scott's patch with John's (?) test case in issue1706039.diff. Confirmed that the patch fixes the issue on Mac OS X
10.5.6 (Intel).
|
|
msg78240 - (view) |
Author: Alexander Belopolsky (belopolsky) |
Date: 2008-12-23 16:03 |
|
Maybe not a problem, but an inconsistency: in Py_UniversalNewlineFgets
clearerr is added inside univ_newline conditional and then again before
the loop over characters, but in Py_UniversalNewlineFread it is added only
once before "if (!f->f_univ_newline)". I am not sure which approach is
right, but I don't think there is a reason for the two functions to be
different.
|
|
msg78242 - (view) |
Author: John Smith (josm) |
Date: 2008-12-23 17:05 |
|
Sorry for inconvenience caused with the patch.
I should've got more reviews before asking it checked in.
I verified eof2.diff-applied-python gets "Bus error"
and Scott Dial's fileobject.diff fixes this.
(Test was on 9.6.0 Darwin Kernel Version 9.6.0)
|
|
msg78243 - (view) |
Author: Scott Dial (scott.dial) |
Date: 2008-12-23 17:13 |
|
They differ because in Py_UniversalNewlineFgets() there is a call to
FLOCKFILE(), and it seemed appropriate to ensure that clearerr() was
called after locking the FILE stream. I certainly pondered over whether
it made a difference to do it before or after, and it would seem to me
that if we are bothering to lock the FILE stream then we would care to
ensure that the first GETC() after locking reaps the benefit of having
called clearerr(). In the cases that we fall into fread and fgets, we
may actually need to be doing:
FLOCKFILE(stream)
clearerr(stream);
result = fgets(buf, n, stream);
FUNLOCKFILE(stream);
return result;
One nitpick is that the testcase was appended to the StdoutTests class
instead of OtherFileTests class, which would seem more appropriate.
|
|
msg87815 - (view) |
Author: Antoine Pitrou (pitrou) |
Date: 2009-05-15 12:55 |
|
Does it apply to 3.1?
|
|
| Date |
User |
Action |
Args |
| 2009-05-15 12:55:17 | pitrou | set | messages:
+ msg87815 |
| 2009-05-15 02:10:20 | ajaksu2 | set | versions:
+ Python 3.1, - Python 3.0 nosy:
+ benjamin.peterson, pitrou
priority: normal components:
+ IO stage: patch review |
| 2008-12-23 17:13:22 | scott.dial | set | messages:
+ msg78243 |
| 2008-12-23 17:05:12 | josm | set | messages:
+ msg78242 |
| 2008-12-23 16:03:25 | belopolsky | set | messages:
+ msg78240 |
| 2008-12-23 15:47:23 | belopolsky | set | files:
+ issue1706039.diff messages:
+ msg78239 |
| 2008-12-23 15:07:50 | loewis | set | priority: release blocker -> (no value) assignee: loewis -> messages:
+ msg78238 |
| 2008-12-22 14:42:40 | scott.dial | set | files:
+ fileobject.diff nosy:
+ scott.dial messages:
+ msg78187 |
| 2008-12-21 23:55:25 | gpolo | set | messages:
+ msg78164 |
| 2008-12-21 23:52:12 | loewis | set | messages:
+ msg78163 |
| 2008-12-21 23:42:45 | gpolo | set | nosy:
+ gpolo messages:
+ msg78162 |
| 2008-12-21 23:35:21 | loewis | set | messages:
+ msg78161 |
| 2008-12-20 14:37:04 | loewis | set | versions:
- Python 2.5.3 |
| 2008-12-20 02:41:45 | loewis | set | priority: deferred blocker -> release blocker |
| 2008-12-13 15:37:23 | loewis | set | priority: release blocker -> deferred blocker messages:
+ msg77748 |
| 2008-12-10 09:10:00 | loewis | set | priority: normal -> release blocker |
| 2008-11-29 19:44:48 | josm | set | messages:
+ msg76622 versions:
+ Python 2.6, Python 3.0, Python 2.5.3 |
| 2008-01-21 21:36:44 | draghuram | set | nosy:
- draghuram |
| 2007-12-04 13:02:06 | josm | set | messages:
+ msg58185 |
| 2007-12-04 09:02:08 | loewis | set | messages:
+ msg58181 |
| 2007-12-03 18:05:17 | loewis | set | nosy:
+ loewis messages:
+ msg58137 |
| 2007-12-03 17:47:56 | draghuram | set | nosy:
+ draghuram messages:
+ msg58134 |
| 2007-12-01 12:19:34 | josm | set | type: behavior messages:
+ msg58043 |
| 2007-08-24 19:39:00 | georg.brandl | set | assignee: loewis |
| 2007-04-23 17:21:41 | josm | create | |
|