Message 11992 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	jhylton
Recipients
Date	2002-08-15.16:34:16
SpamBayes Score
Marked as misclassified
Message-id
In-reply-to

Content
We recently found an assertion failure in the universal newline support when running a multithreaded program where two threads used the same Python file object. The assert(stream != NULL) test in Py_UniversalNewlineFread() fails once in a blue moon, where stream is the stdio FILE * that the fileobject wraps. Further analysis suggests that there is a race condition between checking FILE * and using FILE * that exists in at least Python 2.1 and up. I'll actually describe the problem as it exists in Python 2.2, because it is simpler to avoid the universal newline code. That code isn't the source of the problem, although it's assert() uncovers it in a clear way. In file_read() (rev 2.141.6.5), the first thing it does is check if f_fp (the FILE ) is NULL. If so it raises an IOError -- operation on closed file object. Later, file_read() enters a for loop that calls fread() until enough bytes have been read. for (;;) { Py_BEGIN_ALLOW_THREADS errno = 0; chunksize = fread(BUF(v) + bytesread, 1, buffersize - bytesread, f->f_fp); Py_END_ALLOW_THREADS if (chunksize == 0) { if (!ferror(f->f_fp)) break; PyErr_SetFromErrno(PyExc_IOError); clearerr(f->f_fp); Py_DECREF(v); return NULL; } The problem is that fread() is called after the global interpreter lock is released. Since the lock is released, another Python thread could run and modify the file object, changing the value of f->f_fp. Under the current interpreter lock scheme, it isn't safe to use f->f_fp without holding the interpreter lock. The current file_read() code can fail in a variety of ways. It's possible for a second thread to close the file, which will set f->f_fp to NULL. Who knows what fread() will do when NULL is passed. The universal newline code is squirrels the FILE in a local variable, which is worse. If it happens that another thread closes the file, at best the local points to a closed FILE *. But that memory could get recycled and then there's no way to know what it points to. socket I/O has a similar problem with unsafe sharing of the file descriptor. However, this problem seems less severe in general, because we'd just be passing a bogus file descriptor to a system call. We don't have to worry about whether stdio will dump core when passed a bogus pointer. There is a chance the a socket will be closed and its file descriptor used for a different socket. So a call to recv() with one socket ends up using a different socket. That will be a nightmare to debug, but it won't cause a segfault. (And, in general, files and sockets shouldn't be shared between application threads unless the application is going to make sure its safe.) The solution to this problem is to use a per-file-object lock to guard access to f->f_fp. No thread should read or right f->f_fp without holding the lock. To make sure that other threads get a chance to run when there is contention for the file, the file-object lock should never be held when the GIL is held.

We recently found an assertion failure in the universal
newline support when running a multithreaded program
where two threads used the same Python file object. 
The assert(stream != NULL) test in 
Py_UniversalNewlineFread() fails once in a blue moon,
where stream is the stdio FILE * that the fileobject
wraps.  Further analysis suggests that there is a race
condition between checking FILE * and using FILE * that
exists in at least Python 2.1 and up.

I'll actually describe the problem as it exists in
Python 2.2, because it is simpler to avoid the
universal newline code.  That code isn't the source of
the problem, although it's assert() uncovers it in a
clear way.

In file_read() (rev 2.141.6.5), the first thing it does
is check if f_fp (the FILE *) is NULL.  If so it raises
an IOError -- operation on closed file object.  Later,
file_read() enters a for loop that calls fread() until
enough bytes have been read.

	for (;;) {
		Py_BEGIN_ALLOW_THREADS
		errno = 0;
		chunksize = fread(BUF(v) + bytesread, 1,
				  buffersize - bytesread, f->f_fp);
		Py_END_ALLOW_THREADS
		if (chunksize == 0) {
			if (!ferror(f->f_fp))
				break;
			PyErr_SetFromErrno(PyExc_IOError);
			clearerr(f->f_fp);
			Py_DECREF(v);
			return NULL;
		}

The problem is that fread() is called after the global
interpreter lock is released.  Since the lock is
released, another Python thread could run and modify
the file object, changing the value of f->f_fp.  Under
the current interpreter lock scheme, it isn't safe to
use f->f_fp without holding the interpreter lock.

The current file_read() code can fail in a variety of
ways.  It's possible for a second thread to close the
file, which will set f->f_fp to NULL.  Who knows what
fread() will do when NULL is passed.

The universal newline code is squirrels the FILE * in a
local variable, which is worse.  If it happens that
another thread closes the file, at best the local
points to a closed FILE *.  But that memory could get
recycled and then there's no way to know what it points to.

socket I/O has a similar problem with unsafe sharing of
the file descriptor.  However, this problem seems less
severe in general, because we'd just be passing a bogus
file descriptor to a system call.  We don't have to
worry about whether stdio will dump core when passed a
bogus pointer.  There is a chance the a socket will be
closed and its file descriptor used for a different
socket.  So a call to recv() with one socket ends up
using a different socket.  That will be a nightmare to
debug, but it won't cause a segfault.  (And, in
general, files and sockets shouldn't be shared between
application threads unless the application is going to
make sure its safe.)

The solution to this problem is to use a
per-file-object lock to guard access to f->f_fp.  No
thread should read or right f->f_fp without holding the
lock.  To make sure that other threads get a chance to
run when there is contention for the file, the
file-object lock should never be held when the GIL is held.

History
Date	User	Action	Args
2007-08-23 14:04:57	admin	link	issue595601 messages
2007-08-23 14:04:57	admin	create