classification
Title: Fatal error on thread creation in low memory condition
Type: crash Stage: test needed
Components: Extension Modules Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, haypo, jnoller, nirai, pitrou
Priority: normal Keywords: patch

Created on 2009-12-19 12:49 by haypo, last changed 2010-03-21 13:33 by haypo. This issue is now closed.

Files
File name Uploaded Description Edit
thread_prealloc_pystate-4.patch haypo, 2010-01-31 23:50
Messages (20)
msg96602 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-12-19 12:49
Using my fuzzer (Fusil) on Python trunk, I got sometimes errors on
multiprocessing.Pool():

   Fatal Python error: PyEval_AcquireThread: NULL new thread state

I'm sorry but I don't have example script to reproduce the bug. I
suppose that the error depends on the system load. I have the error on
an Intel Core2 Quad Core Q9300 (CPU able to run 4 processes/threads at
the time same).

How can I get more information on the bug? How can I reproduce it?
msg96610 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-12-19 16:26
Enable core dumps (`ulimit -c unlimited`) and re-run your tests. The
fatal Python error will leave a core file so that you can find out the
context.

You can also try a debug build of Python which may give you more
information, assuming it can reproduce the bug.

(as for how to reproduce the bug, it's your task to find it out :-))
msg96658 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-12-20 01:09
Use a core dump: good idea!

haypo> Using my fuzzer (Fusil) on Python trunk, I got sometimes 
haypo> errors on multiprocessing.Pool():
haypo> 
haypo>    Fatal Python error: PyEval_AcquireThread: NULL new thread state

I read the source code of the thread module. This error means that 
PyThreadState_New() returns NULL which occurs if malloc() failed. I hit this 
error using my fuzzer because the fuzzer limits the total memory to something 
around 100 MB using setrlimit().

Said differently: in low memory condition, creating a new thread may exit the 
whole Python process if a memory allocation fail.

--

Sometimes, I get another error, similar to the "NULL new thread state" error:

   Fatal Python error: Couldn't create autoTLSkey mapping

I guess that the reason is the same: memory allocation failed. It should be 
the malloc() in find_key() (Python/thread.c).
msg96760 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009-12-21 16:46
We just had this error on one of the buildbots:

[...]
test_threadsignals
sem_wait: Unknown error 512
test_docxmlrpc
Fatal Python error: Invalid thread state for this thread
Fatal Python error: PyEval_AcquireThread: non-NULL old thread state
make: *** [buildbottest] Aborted


(see
http://www.python.org/dev/buildbot/all/builders/x86%20Ubuntu%20trunk/builds/161/steps/test/logs/stdio
)

I don't know it's related.
msg96778 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2009-12-21 22:37
No, I don't think that this issue is related, because it starts with
another message: "Invalid thread state for this thread".
msg96918 - (view) Author: Nir Aides (nirai) Date: 2009-12-27 19:19
Memory can be pre-allocated by thread_PyThread_start_new_thread() before 
thread is spawned.
msg97480 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-09 23:54
Here I come with a patch! nirai idea was the good one: prealloc PyThreadState before creating the thread. Raise a MemoryError if the allocation fail, instead of raising a *fatal* Python error.

Patch is quite simple and allow better error handling.
msg97567 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-01-11 01:07
The patch looks good; the line
    tstate->thread_id = PyThread_get_thread_ident();
is needed because the tstate is created in the main thread, but used in another thread.
msg97568 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-11 01:21
> The patch looks good

I'm not sure about autoTLSkey value. PyThreadState_New() calls _PyGILState_NoteThreadState() which checks that autoTLSkey is not zero, but I don't know if autoTLSkey have the right value with my preallocation patch.
msg97569 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-11 01:41
> I'm not sure about autoTLSkey value (...)

After some tests (printf fun), it looks ok.

--

There is another problem in _testcapi: test_thread_state() calls directly PyThread_start_new_thread() and the thread function calls PyGILState_Ensure(). PyGILState_Ensure() would then require to create a new thread state (call PyThreadState_New()) because it was not done by PyThread_start_new_thread(). On low memory condition, we hit the same bug here.

Only thread and _testcapi modules calls directly PyThread_start_new_thread(). _*test*capi is reserved to tests, so I consider that we don't care about bugs under low memory condition in this module.
msg97669 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-01-12 22:33
Running the tests in debug mode gives the following error:

test_3_join_in_forked_from_thread (test.test_threading.ThreadJoinOnShutdown) ... Fatal Python error: Invalid thread state for this thread
[21851 refs]
FAIL
[snip]

======================================================================
FAIL: test_3_join_in_forked_from_thread (test.test_threading.ThreadJoinOnShutdown)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/antoine/cpython/debug/Lib/test/test_threading.py", line 476, in test_3_join_in_forked_from_thread
    self._run_and_join(script)
  File "/home/antoine/cpython/debug/Lib/test/test_threading.py", line 412, in _run_and_join
    self.assertEqual(data, "end of main\nend of thread\n")
AssertionError: 'end of main\n' != 'end of main\nend of thread\n'

----------------------------------------------------------------------
Ran 78 tests in 3.739s
msg97678 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-13 01:45
> Running the tests in debug mode gives the following error:
> ... Fatal Python error: Invalid thread state for this thread

I tried all Lib/test/test_thread*py, but not in debug mode :-/

The problem is here: PyThreadState_New() -> _PyGILState_NoteThreadState() -> PyThread_set_key_value() -> find_key() and find_key() calls PyThread_get_thread_ident(), but the ident is not the right ident :-/

PyThreadState_New() should not call _PyGILState_NoteThreadState(), it should be done _in_ the thread.

I wrote a new patch fixing the unit test. PyThreadState_New() is part of the public API, so I can't change its prototype. And _PyGILState_NoteThreadState() is a private function (use the "static" keyword). Because of that, I introduced two new functions:

 - PyThreadState_Prealloc(): like PyThreadState_New() but for thread preallocation, can be called outside the new state (from another thread)

 - PyThreadState_Init(): have to be called in the new thread to finish the thread state initialization, only required for state created by PyThreadState_Prealloc() (not for PyThreadState_New())

I tried all Lib/test/test_thread*.py tests in debug mode and all test now pass ;-)
msg97714 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-13 13:45
Another problem with my patch! If initsite() calls PyGILState_Ensure(): assert(autoInterpreterState) fails because autoInterpreterState is NULL.

_PyGILState_Init() have to be called before initsite(). I don't know where it should be called exactly. Why not just after the PyThreadState_New() (in Py_InitializeEx())?
msg97715 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-13 13:47
> If initsite() calls PyGILState_Ensure() (...)

Note: I was using gdb to track a bug on a debug build (--with-pydebug). I used "pyo" macro which calls _PyObject_Dump(), and _PyObject_Dump() calls PyGILState_Ensure() => assertion error.
msg98106 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-21 11:36
Sum up of my patch:

 * it pass all test_thread*.py tests (tested with in pydebug mode)
 * it preallocates the thread state in the parent thread to be able to raise an error with PyErr_NoMemory() instead of Py_FatalError()
 * PyThreadState_Prealloc() doesn't call _PyGILState_NoteThreadState() because the thread ident is not correct in the parent thread
 * Call _PyGILState_NoteThreadState() in the new thread to finish the thread initialization
 * Py_InitializeEx() calls _PyGILState_Init() before initsite(), because initsite() may create a thread
msg98562 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-01-30 13:06
I didn't test the patch but some comments:
- PyThreadState_Prealloc and PyThreadState_Init should probably be prefixed with an underscore, because there's no use for them outside of the interpreter
- _PyThreadState_New should be static. Besides, static functions usually don't get the "Py" in their name and are all lowercased, so something like "new_threadstate"
- the last change ("Py_InitializeEx() calls _PyGILState_Init() before initsite()") should be part of a separate patch (and issue?)
msg98563 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-01-30 13:07
Oh, and besides, you can use the -R option to regrtest to find out if there are any reference leaks (e.g. "-R 3:2:").
msg98637 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-01-31 23:50
> PyThreadState_Prealloc and PyThreadState_Init should (...) be prefixed with an underscore (...)

done

> _PyThreadState_New should be static (...) so something like "new_threadstate"

done

> the last change ("Py_InitializeEx() calls _PyGILState_Init() before initsite()") should be part of a separate patch (and issue?)

I'm unable to reproduce the bug related to this patch. I keep the patch somewhere in my disk, and I will open a new issue if I'm able to reproduce it :-)

> you can use the -R option to regrtest to find out if there are any reference leaks (e.g. "-R 3:2:")

I tried all test*thread*py. test_threadsignals leaks a reference... even without my patch, so I opened a new issue: #7825.
msg100358 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-03-03 23:32
Commited: r78638 (trunk), r78639 (py3k), r78640 (3.1).

Keep the issue open to remember me that I have to backport it to 2.6 (the branch is frozen waiting for the 2.6.5 final version).
msg101422 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2010-03-21 13:33
> Commited: r78638 (trunk)

Backport done: r79199 (2.6).
History
Date User Action Args
2010-03-21 13:33:03hayposetstatus: open -> closed
resolution: fixed
messages: + msg101422
2010-03-03 23:32:53hayposetmessages: + msg100358
2010-01-31 23:51:06hayposetfiles: - thread_prealloc_pystate-3.patch
2010-01-31 23:50:58hayposetfiles: + thread_prealloc_pystate-4.patch

messages: + msg98637
2010-01-30 13:07:54pitrousetmessages: + msg98563
2010-01-30 13:06:50pitrousetmessages: + msg98562
2010-01-21 11:36:14hayposetmessages: + msg98106
2010-01-15 00:13:08hayposetfiles: - thread_prealloc_pystate-2.patch
2010-01-15 00:13:03hayposetfiles: - thread_prealloc_pystate.patch
2010-01-15 00:12:58hayposetfiles: + thread_prealloc_pystate-3.patch
2010-01-13 13:47:42hayposetmessages: + msg97715
2010-01-13 13:45:08hayposetmessages: + msg97714
2010-01-13 01:45:22hayposetfiles: + thread_prealloc_pystate-2.patch

messages: + msg97678
2010-01-12 22:33:18pitrousetmessages: + msg97669
2010-01-11 01:41:40hayposetmessages: + msg97569
2010-01-11 01:21:52hayposetmessages: + msg97568
2010-01-11 01:07:51amaury.forgeotdarcsetnosy: + amaury.forgeotdarc
messages: + msg97567
2010-01-09 23:54:25hayposetfiles: + thread_prealloc_pystate.patch
keywords: + patch
messages: + msg97480
2009-12-27 19:19:12niraisetmessages: + msg96918
2009-12-27 18:28:58niraisetnosy: + nirai
2009-12-23 23:06:15hayposettitle: multiprocessing.Pool(): Fatal Python error: PyEval_AcquireThread: NULL new thread state -> Fatal error on thread creation in low memory condition
2009-12-21 22:37:16hayposetmessages: + msg96778
2009-12-21 16:46:07pitrousetmessages: + msg96760
2009-12-20 01:09:03hayposetmessages: + msg96658
2009-12-19 16:26:06pitrousetnosy: + pitrou, jnoller

messages: + msg96610
stage: test needed
2009-12-19 12:49:32haypocreate