This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: test_close_open_print_buffered(test_file) sometimes crashes
Type: crash Stage: resolved
Components: Interpreter Core, Windows Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: flox, ocean-city, pitrou
Priority: normal Keywords: buildbot, patch

Created on 2010-07-18 14:59 by ocean-city, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
py26_debug_threaded_close.patch ocean-city, 2010-07-18 15:13 patch to see what's happening
py26_experimental_patch_on_fileobject.patch ocean-city, 2010-07-18 15:31 Patch to workaround the crash
py27_fix_threaded_file_close.patch ocean-city, 2010-09-10 11:14 final proposal
Messages (18)
msg110652 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-18 14:59
I noticed test_close_open_print_buffered(test_file) sometimes crashes on release26-maint branch. This is most problamatic part in my python2.6 test survey on VC6 + Win2K.

Here is stacktrace.

_write_lk(int 6, const void * 0x00b49240, unsigned int 11) line 155 + 3 bytes
_write(int 6, const void * 0x00b49240, unsigned int 11) line 79 + 17 bytes
_flush(_iobuf * 0x102618e8) line 162 + 23 bytes
_fclose_lk(_iobuf * 0x102618e8) line 130 + 9 bytes
fclose(_iobuf * 0x102618e8) line 67 + 9 bytes
close_the_file(PyFileObject * 0x00a9e6b8) line 370 + 7 bytes
file_close(PyFileObject * 0x00a9e6b8) line 570 + 9 bytes
call_function(_object * * * 0x00fdf350, int 11134648) line 3734 + 204 bytes
PyEval_EvalFrameEx(_frame * 0x00b33d50, int 131) line 2414
(snip)

Python crashes when MSVCRT touches internal buffer of file object via flush() in close().

I think this is multi-threading bug related to internal buffer of file object. I'll post more.
msg110655 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-18 15:16
I saw value of local variable *thread* was 5384 and internal buffer's
address was 0x102618c8 (sorry, this is another stacktrace which differs
from previous one)

thread(5384) func(file_close): enter....
thread(5384) func(close_the_file): enter....
thread(5384) func(close_the_file): file(00A1FB18) address(00B18FD0): local_close enter...
thread(5376) func(file_close): enter....
thread(5376) func(close_the_file): enter....
thread(5376) func(close_the_file): leave....
thread(5376) func(file_close): file(00A1FB18) address(00B18FD0): PyMem_Free
thread(5376) func(file_close): leave....

Thread 5384 entered close_the_file() and ran local_close() and allowed
another thread to run. Before this, fp->f_fp was set to NULL. Thread 5376
entered close_the_file() and see fp->f_fp is NULL, and returned immediately
and freed fp->f_setbuf. Interesting point is, thread 5384 was still running
close(2), so flush(2) called by this function touched this buffer and crashed.
msg110656 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-18 15:31
I created a patch to workaround this crash. But I saw another problamatic phenomenon "Error without exception set".

I confirmed PyErr_Occurred() returns TRUE when leaving file_close() on error, but I gave up.

test_close_open_print_buffered (__main__.FileThreadingTests) ... close failed in
 file object destructor:
IOError: [Errno 0] Error
Exception in thread Thread-1:
Traceback (most recent call last):
  File "e:\python-dev\release26-maint\lib\threading.py", line 532, in __bootstra
p_inner
    self.run()
  File "e:\python-dev\release26-maint\lib\threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "test_custom.py", line 94, in worker
    f()
  File "test_custom.py", line 88, in <lambda>
    lambda: self._close_and_reopen_file(),
  File "test_custom.py", line 54, in _close_and_reopen_file
    self._close_file()
  File "test_custom.py", line 49, in _close_file
    self.f.close()
SystemError: error return without exception set

Qclose failed in file object destructor:

IOError: [Errno 0] Error
Exception in thread Thread-4:
Traceback (most recent call last):
  File "e:\python-dev\release26-maint\lib\threading.py", line 532, in __bootstra
p_inner
    self.run()
  File "e:\python-dev\release26-maint\lib\threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "test_custom.py", line 94, in worker
    f()
  File "test_custom.py", line 88, in <lambda>
    lambda: self._close_and_reopen_file(),
  File "test_custom.py", line 54, in _close_and_reopen_file
    self._close_file()
  File "test_custom.py", line 49, in _close_file
    self.f.close()
SystemError: error return without exception set

4271 88.3166 ok

----------------------------------------------------------------------
Ran 1 test in 6.579s

OK
msg110659 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-18 15:45
The proposed resolution is wrong. As the comment says, there is a specific reason why f->fp is NULLed at this point.

It appears that maybe MSVCRT is not entirely thread-safe with respect to buffered file I/O. Perhaps the test can simply be disabled under Windows. Is test_close_open_print_buffered the only test that fails?
msg110666 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-18 17:10
Maybe I am wrong, but when another thread calls File.close(), there is no system call on the code path, so I think still crash can happen nevertheless system call is thread safe.
msg110667 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-18 17:19
> Maybe I am wrong, but when another thread calls File.close(), there is
> no system call on the code path,

There is, and that's precisely what's the various tests test for.
(closing a file from a thread while it's being used from another thread)
msg110669 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-18 17:35
Hmm, correct me if following understanding is wrong somewhere...
1. File.close() is actually file_close(), and is calling close_the_file().
2. Returns immediately because local_fp == f->f_fp is already NULL.
   The return value is None.
3. sts is non-NULL, so PyMem_Free(f->f_setbuf) happens.
4. There is no system call for FILE object, so thread won't wait for close(2) completion.

Maybe can we fix this issue by the patch like this? I moved PyMem_Free
into close_the_file(), and called it only when close operation
succeeded.

Index: Objects/fileobject.c
===================================================================
--- Objects/fileobject.c	(revision 82910)
+++ Objects/fileobject.c	(working copy)
@@ -371,9 +371,14 @@
             Py_END_ALLOW_THREADS
             if (sts == EOF)
                 return PyErr_SetFromErrno(PyExc_IOError);
-            if (sts != 0)
+            if (sts != 0) {
+                PyMem_Free(f->f_setbuf);
+                f->f_setbuf = NULL;
                 return PyInt_FromLong((long)sts);
+            }
         }
+        PyMem_Free(f->f_setbuf);
+        f->f_setbuf = NULL;
     }
     Py_RETURN_NONE;
 }
@@ -567,12 +572,7 @@
 static PyObject *
 file_close(PyFileObject *f)
 {
-    PyObject *sts = close_the_file(f);
-    if (sts) {
-        PyMem_Free(f->f_setbuf);
-        f->f_setbuf = NULL;
-    }
-    return sts;
+    return close_the_file(f);
 }
msg110670 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-18 17:39
I forgot to mention that I didn't see any "Error without exception set" by my last patch. I agree f->f_fp = NULL; is needed there. ;-)
msg110671 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-18 18:33
> Hmm, correct me if following understanding is wrong somewhere...
> 1. File.close() is actually file_close(), and is calling close_the_file().
> 2. Returns immediately because local_fp == f->f_fp is already NULL.
>    The return value is None.

I don't understand why you say that. Why is f->f_fp already NULL while
close() still hasn't been called?

By the way, is it 2.6-specific or does it also happen on 2.7?
msg110672 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-07-18 19:09
Does the following patch solve your issue?


Index: Python/ceval.c
===================================================================
--- Python/ceval.c	(révision 82959)
+++ Python/ceval.c	(copie de travail)
@@ -1763,6 +1763,8 @@
                If __getattr__ raises an exception, w will
                be freed, so we need to prevent that temporarily. */
             Py_XINCREF(w);
+            if (PyFile_Check(w))
+                PyFile_IncUseCount((PyFileObject *) w);
             if (w != NULL && PyFile_SoftSpace(w, 0))
                 err = PyFile_WriteString(" ", w);
             if (err == 0)
@@ -1790,6 +1792,8 @@
                 else
                     PyFile_SoftSpace(w, 1);
             }
+            if (PyFile_Check(w))
+                PyFile_DecUseCount((PyFileObject *) w);
             Py_XDECREF(w);
             Py_DECREF(v);
             Py_XDECREF(stream);
msg110705 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-07-19 01:50
> Does the following patch solve your issue?

I tried, but it still crashed. I could reproduce the crash by following
script. It crashed also on Python2.7, but it doesn't crash on Python3.
And when I used io.open() instead of builtin open(), it doesn't crash
even on Python2.x. (Probably because io.open implements buffer protocol
by itself)

//////////////////////////////////////////////////////

import threading
import sys

if sys.version_info.major == 3:
    xrange = range

def main():
    size = 1024 * 1024 # large enough
    f = open("__temp__.tmp", "w", size)
    for _ in xrange(size):
        f.write("c")
    t1 = threading.Thread(target=f.close)
    t2 = threading.Thread(target=f.close)
    t1.start()
    t2.start()
    t1.join()
    t2.join()

if __name__ == '__main__':
    main()

//////////////////////////////////////////////////////

I lied a bit on msg110655. close(2) is not used, that is actually
fclose(). I didn't notice the variable *close* was declared as
the parameter of fill_file_fields().
msg113536 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-08-10 13:31
I cannot crash python by above patch anymore... Strange.

> Is test_close_open_print_buffered the only test that fails?

It seem to be.

Is freeing the memory allocated for setvbuf() while close() running in another thread really safe?
msg113537 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-08-10 13:40
- above patch
+ above script
msg113551 - (view) Author: Florent Xicluna (flox) * (Python committer) Date: 2010-08-10 17:28
See the transient failures on "x86 XP-4" buildbot (2.6 and 2.7 only).

Issue #9555 closed as duplicate:
http://bugs.python.org/issue9555#msg113522
msg113961 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-08-15 14:24
> Is freeing the memory allocated for setvbuf() while close() running in 
> another thread really safe?

I don't know. I guess nobody does that.
msg116005 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2010-09-10 11:14
I believe attached patch will fix this issue. Please forget
previous patch.
msg119792 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-28 14:53
Hirokazu, I've committed your 2.7 patch in r85892. Let's see what the buildbots say.
msg119799 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-28 16:13
Apparently it's fixed!
History
Date User Action Args
2022-04-11 14:57:03adminsetgithub: 53541
2010-10-28 16:13:16pitrousetstatus: pending -> closed

messages: + msg119799
stage: needs patch -> resolved
2010-10-28 14:53:24pitrousetstatus: open -> pending
resolution: fixed
messages: + msg119792

versions: - Python 2.6
2010-09-10 11:14:03ocean-citysetfiles: + py27_fix_threaded_file_close.patch

messages: + msg116005
2010-08-15 14:24:32pitrousetmessages: + msg113961
2010-08-10 17:28:46floxsetnosy: + flox
messages: + msg113551

components: + Windows
keywords: + buildbot
2010-08-10 17:26:55floxlinkissue9555 superseder
2010-08-10 13:40:59ocean-citysetmessages: + msg113537
2010-08-10 13:31:06ocean-citysetmessages: + msg113536
2010-07-19 01:50:42ocean-citysetmessages: + msg110705
versions: + Python 2.7
2010-07-18 19:09:42pitrousetmessages: + msg110672
2010-07-18 18:33:15pitrousetmessages: + msg110671
2010-07-18 17:39:54ocean-citysetmessages: + msg110670
2010-07-18 17:35:38ocean-citysetmessages: + msg110669
2010-07-18 17:19:48pitrousetmessages: + msg110667
2010-07-18 17:10:24ocean-citysetmessages: + msg110666
2010-07-18 15:45:08pitrousetnosy: + pitrou
messages: + msg110659
2010-07-18 15:31:31ocean-citysetfiles: + py26_experimental_patch_on_fileobject.patch

messages: + msg110656
2010-07-18 15:26:41ocean-citysetfiles: - py26_debug_threaded_close.patch
2010-07-18 15:16:51ocean-citysetfiles: + py26_debug_threaded_close.patch

messages: + msg110655
2010-07-18 15:16:20ocean-citysetmessages: - msg110654
2010-07-18 15:14:00ocean-citysetfiles: + py26_debug_threaded_close.patch
keywords: + patch
messages: + msg110654
2010-07-18 14:59:34ocean-citycreate