Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fatal error on thread creation in low memory condition #51793

Closed
vstinner opened this issue Dec 19, 2009 · 20 comments
Closed

Fatal error on thread creation in low memory condition #51793

vstinner opened this issue Dec 19, 2009 · 20 comments
Labels
extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@vstinner
Copy link
Member

BPO 7544
Nosy @amauryfa, @pitrou, @vstinner
Files
  • thread_prealloc_pystate-4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-03-21.13:33:03.216>
    created_at = <Date 2009-12-19.12:49:32.499>
    labels = ['extension-modules', 'type-crash']
    title = 'Fatal error on thread creation in low memory condition'
    updated_at = <Date 2010-03-21.13:33:03.215>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2010-03-21.13:33:03.215>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2010-03-21.13:33:03.216>
    closer = 'vstinner'
    components = ['Extension Modules']
    creation = <Date 2009-12-19.12:49:32.499>
    creator = 'vstinner'
    dependencies = []
    files = ['16074']
    hgrepos = []
    issue_num = 7544
    keywords = ['patch']
    message_count = 20.0
    messages = ['96602', '96610', '96658', '96760', '96778', '96918', '97480', '97567', '97568', '97569', '97669', '97678', '97714', '97715', '98106', '98562', '98563', '98637', '100358', '101422']
    nosy_count = 5.0
    nosy_names = ['amaury.forgeotdarc', 'pitrou', 'vstinner', 'jnoller', 'nirai']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'test needed'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue7544'
    versions = ['Python 2.7']

    @vstinner
    Copy link
    Member Author

    Using my fuzzer (Fusil) on Python trunk, I got sometimes errors on
    multiprocessing.Pool():

    Fatal Python error: PyEval_AcquireThread: NULL new thread state

    I'm sorry but I don't have example script to reproduce the bug. I
    suppose that the error depends on the system load. I have the error on
    an Intel Core2 Quad Core Q9300 (CPU able to run 4 processes/threads at
    the time same).

    How can I get more information on the bug? How can I reproduce it?

    @vstinner vstinner added extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump labels Dec 19, 2009
    @pitrou
    Copy link
    Member

    pitrou commented Dec 19, 2009

    Enable core dumps (ulimit -c unlimited) and re-run your tests. The
    fatal Python error will leave a core file so that you can find out the
    context.

    You can also try a debug build of Python which may give you more
    information, assuming it can reproduce the bug.

    (as for how to reproduce the bug, it's your task to find it out :-))

    @vstinner
    Copy link
    Member Author

    Use a core dump: good idea!

    haypo> Using my fuzzer (Fusil) on Python trunk, I got sometimes
    haypo> errors on multiprocessing.Pool():
    haypo>
    haypo> Fatal Python error: PyEval_AcquireThread: NULL new thread state

    I read the source code of the thread module. This error means that
    PyThreadState_New() returns NULL which occurs if malloc() failed. I hit this
    error using my fuzzer because the fuzzer limits the total memory to something
    around 100 MB using setrlimit().

    Said differently: in low memory condition, creating a new thread may exit the
    whole Python process if a memory allocation fail.

    --

    Sometimes, I get another error, similar to the "NULL new thread state" error:

    Fatal Python error: Couldn't create autoTLSkey mapping

    I guess that the reason is the same: memory allocation failed. It should be
    the malloc() in find_key() (Python/thread.c).

    @pitrou
    Copy link
    Member

    pitrou commented Dec 21, 2009

    We just had this error on one of the buildbots:

    [...]
    test_threadsignals
    sem_wait: Unknown error 512
    test_docxmlrpc
    Fatal Python error: Invalid thread state for this thread
    Fatal Python error: PyEval_AcquireThread: non-NULL old thread state
    make: *** [buildbottest] Aborted

    (see
    http://www.python.org/dev/buildbot/all/builders/x86%20Ubuntu%20trunk/builds/161/steps/test/logs/stdio
    )

    I don't know it's related.

    @vstinner
    Copy link
    Member Author

    No, I don't think that this issue is related, because it starts with
    another message: "Invalid thread state for this thread".

    @vstinner vstinner changed the title multiprocessing.Pool(): Fatal Python error: PyEval_AcquireThread: NULL new thread state Fatal error on thread creation in low memory condition Dec 23, 2009
    @nirai
    Copy link
    Mannequin

    nirai mannequin commented Dec 27, 2009

    Memory can be pre-allocated by thread_PyThread_start_new_thread() before
    thread is spawned.

    @vstinner
    Copy link
    Member Author

    vstinner commented Jan 9, 2010

    Here I come with a patch! nirai idea was the good one: prealloc PyThreadState before creating the thread. Raise a MemoryError if the allocation fail, instead of raising a *fatal* Python error.

    Patch is quite simple and allow better error handling.

    @amauryfa
    Copy link
    Member

    The patch looks good; the line
    tstate->thread_id = PyThread_get_thread_ident();
    is needed because the tstate is created in the main thread, but used in another thread.

    @vstinner
    Copy link
    Member Author

    The patch looks good

    I'm not sure about autoTLSkey value. PyThreadState_New() calls _PyGILState_NoteThreadState() which checks that autoTLSkey is not zero, but I don't know if autoTLSkey have the right value with my preallocation patch.

    @vstinner
    Copy link
    Member Author

    I'm not sure about autoTLSkey value (...)

    After some tests (printf fun), it looks ok.

    --

    There is another problem in _testcapi: test_thread_state() calls directly PyThread_start_new_thread() and the thread function calls PyGILState_Ensure(). PyGILState_Ensure() would then require to create a new thread state (call PyThreadState_New()) because it was not done by PyThread_start_new_thread(). On low memory condition, we hit the same bug here.

    Only thread and _testcapi modules calls directly PyThread_start_new_thread(). _*test*capi is reserved to tests, so I consider that we don't care about bugs under low memory condition in this module.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 12, 2010

    Running the tests in debug mode gives the following error:

    test_3_join_in_forked_from_thread (test.test_threading.ThreadJoinOnShutdown) ... Fatal Python error: Invalid thread state for this thread
    [21851 refs]
    FAIL
    [snip]

    ======================================================================
    FAIL: test_3_join_in_forked_from_thread (test.test_threading.ThreadJoinOnShutdown)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/home/antoine/cpython/debug/Lib/test/test_threading.py", line 476, in test_3_join_in_forked_from_thread
        self._run_and_join(script)
      File "/home/antoine/cpython/debug/Lib/test/test_threading.py", line 412, in _run_and_join
        self.assertEqual(data, "end of main\nend of thread\n")
    AssertionError: 'end of main\n' != 'end of main\nend of thread\n'

    Ran 78 tests in 3.739s

    @vstinner
    Copy link
    Member Author

    Running the tests in debug mode gives the following error:
    ... Fatal Python error: Invalid thread state for this thread

    I tried all Lib/test/test_thread*py, but not in debug mode :-/

    The problem is here: PyThreadState_New() -> _PyGILState_NoteThreadState() -> PyThread_set_key_value() -> find_key() and find_key() calls PyThread_get_thread_ident(), but the ident is not the right ident :-/

    PyThreadState_New() should not call _PyGILState_NoteThreadState(), it should be done _in_ the thread.

    I wrote a new patch fixing the unit test. PyThreadState_New() is part of the public API, so I can't change its prototype. And _PyGILState_NoteThreadState() is a private function (use the "static" keyword). Because of that, I introduced two new functions:

    • PyThreadState_Prealloc(): like PyThreadState_New() but for thread preallocation, can be called outside the new state (from another thread)

    • PyThreadState_Init(): have to be called in the new thread to finish the thread state initialization, only required for state created by PyThreadState_Prealloc() (not for PyThreadState_New())

    I tried all Lib/test/test_thread*.py tests in debug mode and all test now pass ;-)

    @vstinner
    Copy link
    Member Author

    Another problem with my patch! If initsite() calls PyGILState_Ensure(): assert(autoInterpreterState) fails because autoInterpreterState is NULL.

    _PyGILState_Init() have to be called before initsite(). I don't know where it should be called exactly. Why not just after the PyThreadState_New() (in Py_InitializeEx())?

    @vstinner
    Copy link
    Member Author

    If initsite() calls PyGILState_Ensure() (...)

    Note: I was using gdb to track a bug on a debug build (--with-pydebug). I used "pyo" macro which calls _PyObject_Dump(), and _PyObject_Dump() calls PyGILState_Ensure() => assertion error.

    @vstinner
    Copy link
    Member Author

    Sum up of my patch:

    • it pass all test_thread*.py tests (tested with in pydebug mode)
    • it preallocates the thread state in the parent thread to be able to raise an error with PyErr_NoMemory() instead of Py_FatalError()
    • PyThreadState_Prealloc() doesn't call _PyGILState_NoteThreadState() because the thread ident is not correct in the parent thread
    • Call _PyGILState_NoteThreadState() in the new thread to finish the thread initialization
    • Py_InitializeEx() calls _PyGILState_Init() before initsite(), because initsite() may create a thread

    @pitrou
    Copy link
    Member

    pitrou commented Jan 30, 2010

    I didn't test the patch but some comments:

    • PyThreadState_Prealloc and PyThreadState_Init should probably be prefixed with an underscore, because there's no use for them outside of the interpreter
    • _PyThreadState_New should be static. Besides, static functions usually don't get the "Py" in their name and are all lowercased, so something like "new_threadstate"
    • the last change ("Py_InitializeEx() calls _PyGILState_Init() before initsite()") should be part of a separate patch (and issue?)

    @pitrou
    Copy link
    Member

    pitrou commented Jan 30, 2010

    Oh, and besides, you can use the -R option to regrtest to find out if there are any reference leaks (e.g. "-R 3:2:").

    @vstinner
    Copy link
    Member Author

    PyThreadState_Prealloc and PyThreadState_Init should (...) be prefixed with an underscore (...)

    done

    _PyThreadState_New should be static (...) so something like "new_threadstate"

    done

    the last change ("Py_InitializeEx() calls _PyGILState_Init() before initsite()") should be part of a separate patch (and issue?)

    I'm unable to reproduce the bug related to this patch. I keep the patch somewhere in my disk, and I will open a new issue if I'm able to reproduce it :-)

    you can use the -R option to regrtest to find out if there are any reference leaks (e.g. "-R 3:2:")

    I tried all test*thread*py. test_threadsignals leaks a reference... even without my patch, so I opened a new issue: bpo-7825.

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 3, 2010

    Commited: r78638 (trunk), r78639 (py3k), r78640 (3.1).

    Keep the issue open to remember me that I have to backport it to 2.6 (the branch is frozen waiting for the 2.6.5 final version).

    @vstinner
    Copy link
    Member Author

    Commited: r78638 (trunk)

    Backport done: r79199 (2.6).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants