classification
Title: Threads using same stream blow up (Windows)
Type:
Stage:
Components: Windows
Versions:
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: tim.peters
Nosy List: gvanrossum, tim.peters
Priority: low
Keywords:

Created on 2001-01-09 21:10 by tim.peters, last changed 2001-01-12 03:48 by tim.peters. This issue is now closed.

Messages (7)
msg2869 - Author: Tim Peters (tim.peters) (Python committer) Date: 2001-01-09 21:10
Blows up under released Windows 2.0 and CVS Pythons (so it's not due to anything new):

import thread

def read(f):
    import time
    time.sleep(.01)
    n = 0
    while n < 1000000:
        x = f.readline()
        n += len(x)
        print "r",
    print "read " + `n`
    m.release()

m = thread.allocate_lock()
f = open("ga", "w+")
print "opened"
m.acquire()
thread.start_new_thread(read, (f,))
n = 0
x = "x" * 113 + "\n"
while n < 1000000:
    f.write(x)
    print "w",
    n += len(x)
m.acquire()
print "done"

Typical run:

C:\Python20>\code\python\dist\src\pcbuild\python temp.py
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w r w r
w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w
r r w r w r w r w r w r

and then it dies in msvcrt.dll with a bad pointer.  Also dies under the debugger (yay!) ... always dies like so:

+ We (Python) call the MS fwrite, from fileobject.c file_write.
+ MS fwrite succeeds with its _lock_str(stream) call.
+ MS fwrite then calls MS _fwrite_lk.
+ MS _fwrite_lk calls memcpy, which blows up for a non-obvious reason.

Looks like the stream's _cnt member has gone mildly negative, which _fwrite_lk casts to unsigned and so treats like a giant positive count, and so memcpy eventually runs off the end of the process address space.
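
The signed-to-unsigned reinterpretation described above can be shown in isolation. This is only a minimal sketch, assuming a 32-bit count; the helper name `as_unsigned32` is hypothetical and is not taken from the MS runtime source.

```python
def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned, the way a C cast would."""
    return value & 0xFFFFFFFF

# A count that has gone "mildly negative" ...
cnt = -3
# ... reads as an enormous number of available bytes once treated as unsigned.
print(as_unsigned32(cnt))  # → 4294967293
```

A copy loop that trusts such a count would try to move roughly 4 GB, which is consistent with memcpy walking off the end of the address space.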

Only thing I can conclude from this is that MS's internal stream-locking implementation is buggy.  At least on W98SE.  Other flavors of Windows?  Other platforms?

Note that I don't claim the program above is *sensible*, just that it shouldn't blow up.  Alas, short of adding a separate mutex in Python file objects-- or writing our own stdio --I don't believe I can fix this.
msg2870 - Author: Tim Peters (tim.peters) (Python committer) Date: 2001-01-09 22:41
Adding info communicated via email:

+ Also blows up under Windows 2000.

+ Yes, C doesn't define what happens here even if it weren't threaded (you can't mix reads and writes willy-nilly (without e.g. seeking before reading) and expect something sensible).  The program is not sensible.  But even silly programs shouldn't *blow up* in Python.
msg2871 - Author: Tim Peters (tim.peters) (Python committer) Date: 2001-01-10 08:37
Here's a standalone C program with the same symptom:

#include <process.h>
#include <stdio.h>

static FILE* fp;

void
reader(void* irrelevant)
{
	char buf[100];
	for (;;)  {
		char* p = fgets(buf, sizeof buf, fp);
		if (p) {
			putchar('r');
		}
	}
}

void
main()
{
	int i;
	char string[100];
	for (i = 0; i < sizeof(string) - 1; ++i) {
		string[i] = 'x';
	}
	string[sizeof(string) - 1] = '\n';

	fp = fopen("whatever", "w+");
	_beginthread(reader, 0, NULL);

	for (;;) {
		fwrite(string, 1, sizeof(string), fp);
		putchar('w');
	}
}
msg2872 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2001-01-10 14:48
Mixing reads and writes on the same stream is always a bad idea. I believe that the stdio docs prescribe that you must use fflush() whenever switching between reading and writing. We could enforce this by adding two flags, "reading" and "writing" to the file object.

(It's possible that using fseek() is also allowed to change directions.)

I'm not sure that this is worth fixing though -- no sensible programmer will do this.
msg2873 - Author: Tim Peters (tim.peters) (Python committer) Date: 2001-01-10 18:33
Guido, I've said multiple times (and in this very bug report!) that the program isn't sensible.  That isn't the point.  Having multiple threads even just reading from the same stream without explicit user locks isn't sensible either, and you can't have it both ways:  on Python-Dev you said the point to using FLOCKFILE in get_line was to prevent core dumps in case the user was doing multiple reads by *accident*.

That's fine:  I agree, and a file open for update is likely just as prone to mixed read/write "accidents".

As far as the std goes, input following output should have intervening fflush, fseek, fsetpos or rewind; while output following input should-- unless input hit EOF --have intervening fseek, fsetpos or rewind (not fflush, though!  fflush immediately after input is undefined).
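
For illustration, the calling discipline described above (a repositioning call between every change of direction) might look like the following in a single thread. This is a sketch under assumptions: it uses a present-day Python file object, whose I/O layer is not C stdio, so it shows only the ordering of calls and not the behavior of any particular C library. The function name `alternate_reads_and_writes` is hypothetical.

```python
import os
import tempfile

def alternate_reads_and_writes():
    """Switch between output and input on one stream, repositioning each time."""
    with tempfile.TemporaryFile("w+b") as f:
        f.write(b"first line\n")   # output
        f.seek(0)                  # reposition before input follows output
        first = f.readline()       # input
        f.seek(0, os.SEEK_END)     # reposition before output follows input
        f.write(b"second line\n")  # output again
        f.seek(0)                  # reposition before reading the whole file
        return first, f.read()

print(alternate_reads_and_writes())
```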

Adding flags to the file object doesn't seem to help much, because the combo of e.g. "I'm doing a read now" and "I'm setting the 'reading' flag" needs to be atomic in a threaded world else it's unreliable (and then you're back to Python-level locks to repair that).
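
To make the atomicity point concrete, here is a rough sketch in which the direction flag and the I/O call are guarded by the same Python-level lock. Everything in it is an assumption made for illustration: the class `DirectionTrackingFile`, its private read position, and the `run_demo` helper are hypothetical, and nothing like this exists in the file object implementation.

```python
import io
import os
import threading

class DirectionTrackingFile:
    """Hypothetical wrapper: flag update and I/O happen under one lock."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()
        self._direction = None  # None, "reading" or "writing"
        self._read_pos = 0      # where the next read should start

    def write(self, data):
        with self._lock:
            if self._direction != "writing":
                # Changing direction: reposition (to the end) before output.
                self._file.seek(0, os.SEEK_END)
                self._direction = "writing"
            return self._file.write(data)

    def readline(self):
        with self._lock:
            if self._direction != "reading":
                # Changing direction: reposition before input.
                self._file.seek(self._read_pos)
                self._direction = "reading"
            line = self._file.readline()
            self._read_pos = self._file.tell()
            return line

def run_demo(lines=1000):
    """One writer thread and one reader sharing a stream; returns bytes read."""
    shared = DirectionTrackingFile(io.BytesIO())
    record = b"x" * 113 + b"\n"

    def writer():
        for _ in range(lines):
            shared.write(record)

    t = threading.Thread(target=writer)
    t.start()
    seen = 0
    total = 0
    while seen < lines:
        line = shared.readline()  # b"" when the reader has caught up
        if line:
            seen += 1
            total += len(line)
    t.join()
    return total

print(run_demo())
```

Without the lock, another thread could run between the flag test and the I/O call, which is the unreliability described above. With it, every file operation pays for a lock acquisition.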

Does this blow up on Linux for you?  That's more interesting to me.  I've implemented stdio in the past, and I think what's happening here is quite likely what it looks like:  a bug in MS's file-locking (the _cnt member just doesn't go negative if the FILE* struct is properly locked during manipulations!).  If that's true, this is surely rare enough that we can wait for MS to fix it.  But in the meantime, it is a way to get Python to coredump on Windows.
msg2874 - Author: Guido van Rossum (gvanrossum) (Python committer) Date: 2001-01-11 14:59
I knew it wasn't sensible (I did read what you wrote).  I just don't like having bug reports that we can't do something about without a lot of hackery.  I've seen stdio implementations before that were totally naive about mixing reads and writes, even in a single-threaded program.

That said, I can't get this to crash on Linux.
msg2875 - Author: Tim Peters (tim.peters) (Python committer) Date: 2001-01-12 03:48
Since this appears to be unique to Windows, appears to be a bug in Windows stdio, has got to be exceedingly rare in practice, and unreasonably difficult to worm around in the Python implementation, closing this as WontFix.
History
Date User Action Args
2001-01-09 21:10:16 tim.peters create