Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread local storage and PyGILState_* mucked up by os.fork() #46024

Closed
roudkerk mannequin opened this issue Dec 21, 2007 · 13 comments
Closed

Thread local storage and PyGILState_* mucked up by os.fork() #46024

roudkerk mannequin opened this issue Dec 21, 2007 · 13 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@roudkerk
Copy link
Mannequin

roudkerk mannequin commented Dec 21, 2007

BPO 1683
Nosy @loewis, @warsaw, @gpshead, @tiran, @benjaminp
Files
  • dbg2.py
  • fork-thread-patch
  • fork-thread-patch-2
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2008-12-13.14:59:36.810>
    created_at = <Date 2007-12-21.18:53:46.372>
    labels = ['interpreter-core', 'type-crash', 'release-blocker']
    title = 'Thread local storage and PyGILState_* mucked up by os.fork()'
    updated_at = <Date 2008-12-13.14:59:36.808>
    user = 'https://bugs.python.org/roudkerk'

    bugs.python.org fields:

    activity = <Date 2008-12-13.14:59:36.808>
    actor = 'loewis'
    assignee = 'none'
    closed = True
    closed_date = <Date 2008-12-13.14:59:36.810>
    closer = 'loewis'
    components = ['Interpreter Core']
    creation = <Date 2007-12-21.18:53:46.372>
    creator = 'roudkerk'
    dependencies = []
    files = ['9023', '9076', '10595']
    hgrepos = []
    issue_num = 1683
    keywords = []
    message_count = 13.0
    messages = ['58954', '59379', '59576', '68024', '68031', '68036', '68038', '68109', '76535', '76597', '76627', '76658', '77739']
    nosy_count = 9.0
    nosy_names = ['loewis', 'barry', 'gregory.p.smith', 'Rhamphoryncus', 'christian.heimes', 'schmir', 'roudkerk', 'benjamin.peterson', 'jnoller']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue1683'
    versions = ['Python 2.5.3']

    @roudkerk
    Copy link
    Mannequin Author

    roudkerk mannequin commented Dec 21, 2007

    I got a report that one of the tests for processing
    (http://cheeseshop.python.org/pypi/processing) was failing with

    Fatal Python error: Invalid thread state for this thread
    

    when run with a debug interpreter. This appears to be caused by the
    interaction between os.fork() and threads. The following attached
    program reliably reproduces the problem for me on Ubuntu edgy for x86.

    All that happens is that the parent process starts a subthread then
    forks, and then the child process starts a subthread.

    With the normal interpreter I get

    started thread -1211028576 in process 18683
    I am thread -1211028576 in process 18683
    started thread -1211028576 in process 18685
    I am thread -1211028576 in process 18685
    

    as expected, but with the debug interpreter I get

    started thread -1210782816 in process 18687
    I am thread -1210782816 in process 18687
    started thread -1210782816 in process 18689
    Fatal Python error: Invalid thread state for this thread
    [5817 refs]
    

    Notice that the child process is reusing a thread id that was
    being used by the parent process at the time of the fork.

    The code raising the error seems to be in pystate.c:

    PyThreadState *
    PyThreadState_Swap(PyThreadState *new)
    {
            PyThreadState *old = _PyThreadState_Current;
    
                _PyThreadState_Current = new;
                /* It should not be possible for more than one thread state
                   to be used for a thread.  Check this the best we can in
    debug 
                   builds.
                */
        #if defined(Py_DEBUG) && defined(WITH_THREAD)
                if (new) {
                        PyThreadState *check = PyGILState_GetThisThreadState();
                        if (check && check != new)
                                Py_FatalError("Invalid thread state for this
    thread");
                }
        #endif
                return old;
        }

    It looks as though PyGILState_GetThisThreadState() is returning the
    thread state of the thread from the parent process which has the same
    id as the current thread. Therefore the check fails.

    I think the thread local storage implementation in thread.c should
    provide a function _PyThread_ReInitTLS() which PyOS_AfterFork() can
    call. I think _PyThread_ReInitTLS() just needs to remove and free
    each "struct key" in the linked list which does not match the id of
    the calling thread.

    @roudkerk roudkerk mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Dec 21, 2007
    @roudkerk
    Copy link
    Mannequin Author

    roudkerk mannequin commented Jan 6, 2008

    The included patch against python2.51 fixes the problem for me.

    @tiran
    Copy link
    Member

    tiran commented Jan 9, 2008

    Bug day task

    @gpshead
    Copy link
    Member

    gpshead commented Jun 11, 2008

    we need this in before 2.6 is released.

    @gpshead gpshead self-assigned this Jun 11, 2008
    @benjaminp
    Copy link
    Contributor

    Gregory, go ahead and apply and see if can stop the hell in the buildbots.

    @Rhamphoryncus
    Copy link
    Mannequin

    Rhamphoryncus mannequin commented Jun 11, 2008

    Updated version of roudkerk's patch. Adds the new function to
    pythread.h and is based off of current trunk.

    Note that Parser/intrcheck.c isn't used on my box, so it's completely
    untested.

    roudkerk's original analysis is correct. The TLS is never informed that
    the old thread is gone, so when it seems the same id again it assumes it
    is the old thread, which PyThreadState_Swap doesn't like.

    @Rhamphoryncus
    Copy link
    Mannequin

    Rhamphoryncus mannequin commented Jun 11, 2008

    Incidentally, it doesn't seem necessary to reinitialize the lock. Posix
    duplicates the lock, so if you hold it when you fork your child will be
    able to unlock it and use it as normal. Maybe there's some non-Posix
    behaviour or something even more obscure when bpo-401226 was done?

    (reinitializing is essentially harmless though, so in no way should this
    hold up release.)

    @benjaminp
    Copy link
    Contributor

    I applied this in r64212.

    @tiran
    Copy link
    Member

    tiran commented Nov 28, 2008

    The fix is required to run multiprocessing on Python 2.4 and 2.5, see
    bpo-4451. I suggest we fix the issue in 2.5.3. The fork-thread-patch-2
    patch doesn't work on Python 2.5. I'm getting a segfault on my system:

    test_connection (multiprocessing.tests.WithProcessesTestConnection) ...
    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7fa2e999f6e0 (LWP 10594)]
    0x000000000052065f in PyCFunction_Call (func=Cannot access memory at
    address 0x7ffeffffffd8
    ) at Objects/methodobject.c:73
    73 return (*meth)(self, arg);

    Linux on AMD64 with Python 2.5 svn --with-pydebug.

    @tiran tiran reopened this Nov 28, 2008
    @gpshead gpshead assigned tiran and unassigned gpshead Nov 28, 2008
    @tiran
    Copy link
    Member

    tiran commented Nov 29, 2008

    I was wrong and the patch is right. Something is wrong in
    multiprocessings connection_recvbytes_into() function for the old buffer
    protocol. Somehow PyArg_ParseTuple(args, "w#|" ...) fucks up the
    process' memory.

    Martin, are you fine with the patch? fork-thread-patch-2 still applies
    with some fuzzying.

    @tiran tiran assigned loewis and unassigned tiran Nov 29, 2008
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Nov 29, 2008

    Looks fine to me, somebody please backport it.

    I'm concerned about the memory leak that this may cause, though (IIUC):
    all the values are garbage, provided that they are pointers to dynamic
    memory in the first place.

    I think in the long run, we should allow to store a pointer to a release
    function along with actual value to release.

    @loewis loewis mannequin added the release-blocker label Nov 29, 2008
    @loewis loewis mannequin removed their assignment Nov 29, 2008
    @warsaw
    Copy link
    Member

    warsaw commented Nov 30, 2008

    Since this is a Python 2.5.3 issue, I'm lowering to deferred blocker
    until after 3.0 and 2.6.1 are released.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Dec 13, 2008

    Committed fork-thread-patch-2 as r67736 into 2.5 branch.

    @loewis loewis mannequin closed this as completed Dec 13, 2008
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants