classification
Title: test_concurrent_futures crashes with "--with-pydebug" on RHEL5 with "Fatal Python error: Invalid thread state for this thread"
Type: crash Stage: resolved
Components: Extension Modules Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: bquinlan Nosy List: bquinlan, dmalcolm, grahamd, jcea, jnoller, kristjan.jonsson, lukasz.langa, neologix, pitrou, python-dev, sandro.tosi, ysj.ray
Priority: critical Keywords: patch

Created on 2010-11-24 00:24 by lukasz.langa, last changed 2013-02-04 19:18 by jcea. This issue is now closed.

Files
File name Uploaded Description
concurrent-futures-freeze.tar.bz2 lukasz.langa, 2010-11-24 00:24 Diagnostic logs
test_specific.c neologix, 2011-04-15 19:00 test program
tls_reinit.diff neologix, 2011-04-27 16:04
Messages (38)
msg122254 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-24 00:24
py3k built from trunk on Centos 5.5 freezes during regrtest on test_concurrent_futures with "Fatal Python error: Invalid thread state for this thread".

A set of hopefully useful diagnostic logs is attached.
msg122255 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2010-11-24 00:32
A colorful example: http://bpaste.net/show/11493/

(just in case downloading and extracting the logs is not feasible)

Some clarification: as in a typical concurrent problem, subsequent calls freeze in different test cases, but the freeze itself is always reproducible and always during this test.
msg122287 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-11-24 16:28
I'm able to reliably reproduce this on a RHEL 5 box (i386 in this case).

All of the "ProcessPool*" unittest subclasses within Lib/test/test_concurrent_futures.py exhibit this hang, each printing out just the name of the first test (so presumably either within the first test method, or in shared setup/teardown).

None of the other subclasses hang.

You need to build with --with-pydebug to see this: the error message is coming from this code in PyThreadState_Swap in Python/pystate.c:

   #if defined(Py_DEBUG) && defined(WITH_THREAD)
       if (newts) {
           /* This can be called from PyEval_RestoreThread(). Similar
              to it, we need to ensure errno doesn't change.
           */
           int err = errno;
           PyThreadState *check = PyGILState_GetThisThreadState();
           if (check && check->interp == newts->interp && check != newts)
               Py_FatalError("Invalid thread state for this thread");  /* <-- the error is raised here */
           errno = err;
       }
   #endif
msg122288 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-11-24 16:47
Minimal reproducer:
$ ./python -c "from concurrent.futures import * ; e = ProcessPoolExecutor() ; e.submit(pow, 2, 5)"
Fatal Python error: Invalid thread state for this thread
msg122290 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-11-24 16:54
Seems to be an issue within (or triggered by) multiprocessing (test_threads and test_threading pass OK, fwiw):
$ ./python -m test.test_multiprocessing
Fatal Python error: Invalid thread state for this thread
Traceback (most recent call last):
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/test/test_multiprocessing.py", line 2127, in <module>
    main()
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/test/test_multiprocessing.py", line 2124, in main
    test_main(unittest.TextTestRunner(verbosity=2).run)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/test/test_multiprocessing.py", line 2103, in test_main
    ManagerMixin.pool = ManagerMixin.manager.Pool(4)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/multiprocessing/managers.py", line 644, in temp
    token, exp = self._create(typeid, *args, **kwds)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/multiprocessing/managers.py", line 542, in _create
    conn = self._Client(self._address, authkey=self._authkey)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/multiprocessing/connection.py", line 149, in Client
    answer_challenge(c, authkey)
  File "/home/dmalcolm/coding/python-svn/py3k-clean/Lib/multiprocessing/connection.py", line 383, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
EOFError
msg122295 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-11-24 18:48
By strategically adding print() and input() calls, I was able to isolate the error to this line in test_multiprocessing.py's test_main:
   ManagerMixin.pool = ManagerMixin.manager.Pool(4)

specifically, to the construction:
  ManagerMixin.manager.Pool(4)

Minimal reproducer seems to be:
>>> import multiprocessing.managers
>>> mpp = multiprocessing.Pool(4)
>>> sm = multiprocessing.managers.SyncManager()
>>> sm.start()

i.e.:

$ ./python -c "import multiprocessing.managers ; mpp = multiprocessing.Pool(4); sm = multiprocessing.managers.SyncManager(); sm.start()"
Fatal Python error: Invalid thread state for this thread
msg122320 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2010-11-25 00:13
FWIW, I was able to do an almost full run of regrtest on this box with:
  -x test_multiprocessing test_concurrent_futures

Other than those two, all tests pass.

So _something_ is going wrong w.r.t. threads, though I'm not sure what at this stage.
msg123434 - (view) Author: Brian Quinlan (bquinlan) (Python committer) Date: 2010-12-05 19:00
I've filed a new bug (http://bugs.python.org/issue10632) against multiprocessing and made this bug dependent on it.

In the meantime, I can't repro this on ubuntu 10.04 LTS so I'm going to install Centos and give that a go.
msg123604 - (view) Author: ysj.ray (ysj.ray) Date: 2010-12-08 14:01
Couldn't repro this on my debian 5.
msg128329 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-02-10 19:01
I spent some time bisecting the SVN history in the py3k branch, and believe that r84914 is the commit that introduced this issue.

Details:
  
  Trying on 4-core i386 RHEL 5 box
  $ svn up -r REV
  $ make clean ; make
  (configured --with-pydebug)

Reproducer:

  ./python -c "import multiprocessing.managers ; mpp = multiprocessing.Pool(4); sm = multiprocessing.managers.SyncManager(); sm.start()"

  (running it 3 times; each time, at each build I see it either successfully completing 3 times, or having the Py_FatalError 3 times)

Bisecting:
  r86729: CRASHES with r86729 --with-pydebug (from when I was investigating this before)
  r73573: Runs OK with r73573 --with-pydebug (r73573 was origin for 3.1 tag)

    => somewhere in 73573..86729

  r76195: Runs OK with r76195 --with-pydebug (r76195 added the new GIL)

    => somewhere in 76195..86729

  r80724: Runs OK with r80724 --with-pydebug (changed threading code to "Make (most of) Python's tests pass under Thread Sanitizer.")

    => somewhere in 80724..86729

  r83722: Runs OK with r83722 --with-pydebug (touched multiprocessing)

    => somewhere in 83722..86729

  r85222: CRASHES with r85222 --with-pydebug (rough midpoint)

    => somewhere in 83722..85222

  r84472: Runs OK with r84472 --with-pydebug (arbitrary midpoint)

    => somewhere in 84472..85222

  r84847: Runs OK with r84847 --with-pydebug (arbitrary midpoint)

     => somewhere in 84847..85222

  r85033: CRASHES with r85033 --with-pydebug (arbitrary midpoint)

     => somewhere in 84847..85033

  r84938: CRASHES with r84938 --with-pydebug (arbitrary midpoint)

     => somewhere in 84847..84938

  r84892: Runs OK with r84892 --with-pydebug (arbitrary midpoint)

     => somewhere in 84892..84938

  r84915: CRASHES with r84915 --with-pydebug (arbitrary midpoint)

     => somewhere in 84892..84915

  r84903: Runs OK with r84903 --with-pydebug (arbitrary midpoint)

     => somewhere in 84903..84915

  r84914: CRASHES with r84914 --with-pydebug (affects threads)

     => somewhere in 84903..84914

  r84913: Runs OK with r84913 --with-pydebug (previous commit before 84914)

     => r84914 is the commit that triggered this issue
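
The manual search above is a binary search over revision numbers. A minimal Python sketch of the same procedure follows. The name `first_bad_revision` and the `crashes` predicate are hypothetical; in the real bisection each probe was "svn up -r REV, rebuild --with-pydebug, run the reproducer", and the stand-in below is hard-coded to the outcome reported in this message.

```python
def first_bad_revision(good, bad, crashes):
    """Return the first revision in (good, bad] for which crashes(rev) is true.

    Assumes crashes(good) is False, crashes(bad) is True, and that every
    revision after the first crashing one also crashes.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if crashes(mid):
            bad = mid      # the culprit is mid or earlier
        else:
            good = mid     # the culprit is after mid
    return bad


def crashes(rev):
    # Hypothetical stand-in for building and running the reproducer,
    # matching the reported results (r84913 OK, r84914 crashes).
    return rev >= 84914


print(first_bad_revision(73573, 86729, crashes))
```

With roughly 13,000 revisions in the starting range, this needs about 14 probes, which is in line with the number of builds listed above.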
msg128330 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-02-10 19:03
r84914 was the implementation of issue 9786 (Native TLS support for pthreads)
msg128343 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-02-10 21:59
This appears to be happening in a child process when the parent process is running:
  Lib/multiprocessing/util.py, line 255, in _exit_function ()

After liberally adding printf() and getpid() calls in various places, the error seems to always happen when the parent process is within "call_py_exitfuncs()" within Py_Finalize; the error comes from the child process that was created last.

Using gdb with a breakpoint on "call_py_exitfuncs" and single-stepping seems to confirm a single exitfunc:
  Lib/multiprocessing/util.py, line 255, in _exit_function ()
and that the child dies as that bytecode function is executed.

$ ./python -c "import multiprocessing.managers ; mpp = multiprocessing.Pool(4); sm = multiprocessing.managers.SyncManager(); sm.start()"
Py_InitializeEx called for PID 27824
posix_fork called by PID 27824
child of posix_fork has PID 27825
posix_fork called by PID 27824
child of posix_fork has PID 27826
posix_fork called by PID 27824
child of posix_fork has PID 27827
posix_fork called by PID 27824
child of posix_fork has PID 27828
posix_fork called by PID 27824
child of posix_fork has PID 27832
Py_Finalize called for PID 27824
wait_for_thread_shutdown() finished for PID 27824
Fatal Python error for PID 27832: Invalid thread state for this thread
call_py_exitfuncs() finished for PID 27824
PyOS_FiniInterrupts() finished for PID 27824
[64240 refs]
msg132428 - (view) Author: Sandro Tosi (sandro.tosi) * (Python committer) Date: 2011-03-28 21:58
Is someone still able to replicate this crash? I'm not, with freshly built 3.2 and default (3.3), --with-pydebug enabled. Brian confirmed in msg132418 that he can no longer replicate it.
msg132444 - (view) Author: Dave Malcolm (dmalcolm) (Python committer) Date: 2011-03-28 22:59
I tried again, and I'm still able to reproduce this bug on a RHEL5 box with cpython --with-pydebug as of a recent checkout (69030:00217100b9e7 as it happens):

$ ./python -c "import multiprocessing.managers ; mpp = multiprocessing.Pool(4); sm = multiprocessing.managers.SyncManager(); sm.start()"
Fatal Python error: Invalid thread state for this thread
[66448 refs]
msg132477 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2011-03-29 10:47
What is the line that the parent process is executing?  Line numbers don't seem to match any more.
And is it possible to set a breakpoint in the child process where the fatal error is triggered?  It would be good to know what is being run at that point.
msg133864 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-15 19:00
This is due to a bug in the TLS key management when mixed with fork.
Here's what happens:
When a thread is created, a tstate is allocated and stored in the thread's TLS:
thread_PyThread_start_new_thread -> t_bootstrap -> _PyThreadState_Init -> _PyGILState_NoteThreadState:

    if (PyThread_set_key_value(autoTLSkey, (void *)tstate) < 0)
        Py_FatalError("Couldn't create autoTLSkey mapping");

where 
int
PyThread_set_key_value(int key, void *value)
{
    int fail;
    void *oldValue = pthread_getspecific(key);
    if (oldValue != NULL)
        return 0;
    fail = pthread_setspecific(key, value);
    return fail;
}

A pthread_getspecific(key) is performed to see if there was already a value associated to this key.
The problem is this: suppose a process has a thread with a given thread ID (and a tstate stored in its TLS), and the process then forks from another thread. If a new thread is created in the child process with the same thread ID as that first thread, pthread_getspecific(key) will return the value stored by the parent's thread. In short, thread-specific values are inherited across fork, and if you're unlucky and create a thread with a thread ID that already existed in the parent process, you're screwed.
To conclude, PyGILState_GetThisThreadState, which calls PyThread_get_key_value(autoTLSkey), will return the other thread's tstate, which triggers this fatal error in PyThreadState_Swap.

The patch attached fixes this issue by removing the call to pthread_getspecific(key) from PyThread_set_key_value. This solves the problem and doesn't seem to cause any regression in test_threading and test_multiprocessing, and I think that if we were to call PyThread_set_key_value twice on the same key it's either an error, or we want the last version to be stored, not the old one.
test_threading and test_multiprocessing now run fine without any fatal error.

Note that this is probably a bug in RHEL's pthread implementation, but given how widespread RHEL and derived distros are, I think this should be fixed.
I've attached a patch and a small test program to check if thread-specific data is inherited across a fork.
Here's a sample run on a RHEL4U8 box:

$ /tmp/test
PID: 17922, TID: 3086187424, init value: (nil)
PID: 17924, TID: 3086187424, init value: 0xdeadbeef

The second thread has been created in the child process and inherited the first thread's (created by the parent) key's value (one condition for this to happen is of course that the second thread is allocated the same thread ID as the first one).
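
The failure mode described in this message can be illustrated with a toy model in Python. This is only a sketch: `ToyTLS` is a hypothetical class, and a real pthread TLS is not a dictionary keyed by thread ID. It only mimics the two ingredients named above, values surviving fork() and a setter that keeps any existing value.

```python
class ToyTLS:
    """Toy model of one TLS key whose values are looked up by thread ID."""

    def __init__(self):
        self.values = {}          # thread ID -> stored value

    def get(self, tid):
        return self.values.get(tid)

    def set_if_unset(self, tid, value):
        # Mirrors the PyThread_set_key_value shown above: an existing
        # value wins, and the call still reports success (0).
        if self.get(tid) is not None:
            return 0
        self.values[tid] = value
        return 0

    def fork(self):
        # Models the RHEL 4/5 behaviour described above: the child keeps
        # every thread's value although only the forking thread survives.
        child = ToyTLS()
        child.values = dict(self.values)
        return child


parent = ToyTLS()
parent.set_if_unset(tid=3086187424, value="tstate of the parent's thread")

child = parent.fork()
# A new thread in the child happens to be given the same thread ID.
child.set_if_unset(tid=3086187424, value="tstate of the child's new thread")

# The new thread reads back the stale value left behind by the parent.
print(child.get(3086187424))
```

In the model, the new thread's own tstate is silently dropped, so a later lookup returns a tstate that belongs to a thread that no longer exists, which is the mismatch the debug check in PyThreadState_Swap reports.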
msg133865 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-15 19:22
Note: this seems to be fixed in RHEL6.
(Sorry for the noise).
msg133866 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2011-04-15 19:23
Now, I'd be super happy to see these strange semantics of PyThread_set_key_value go away.  It's very non-standard and complicates the mapping from a native implementation to the Python one.
But I think I did once bring up this issue, and was told that it was a bad idea.
But your logic is sound.  Doing two sets is an error regardless.  Hiding the error by ignoring the second set is just as arbitrary, and just as bad, as ignoring the first.
So, if it is possible to fix this and remove this weird special case and cast it into the abyss, then by all means, you have my 10 thumbs up.  Not that it counts for much :)
msg134415 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-25 21:24
> So, if it is possible to fix this and remove this weird special case and cast it into the abyss, then by all means, you have my 10 thumbs up.  Not that it counts for much :)

Me too.
We still have a couple hundred RHEL4/5 boxes at work, and I guess we're not alone in this case. It's a really specific case, but I think it would be nice to fix it, especially since it also makes the code more understandable and less error-prone. Unless of course this special treatment is really necessary, in which case I'll have to think of another solution or just drop it.
I'm adding Antoine to the nosy list, since he's noted as thread expert in the Experts Index.
msg134418 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-25 21:39
> I think that if we were to call PyThread_set_key_value twice on the
> same key it's either an error, or we want the last version to be
> stored, not the old one.

Not necessarily. You can have several interpreters (and therefore several thread states) in a single thread, using Py_NewInterpreter(). It's used by mod_wsgi and probably other software. If you overwrite the old value with the new one, it may break such software.

Would it be possible to cleanup the autoTLS mappings in PyOS_AfterFork() instead?
msg134448 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-26 08:24
> Not necessarily. You can have several interpreters (and therefore several thread states) in a single thread, using Py_NewInterpreter(). It's used by mod_wsgi and probably other software. If you overwrite the old value with the new one, it may break such software.
>

OK, I didn't know. Better not to change that in that case.

> Would it be possible to cleanup the autoTLS mappings in PyOS_AfterFork() instead?
>

Well, after fork, all threads have exited, so you'll be running on the
behalf of the child process' main - and only - thread, so by
definition you can't access other threads' thread-specific data, no?
As an alternate solution, I was thinking of calling
PyThread_delete_key_value(autoTLSkey) in the path of thread bootstrap,
i.e. starting in Modules/_threadmodule.c t_bootstrap. Obviously, this
should be done before calling _PyThreadState_Init, since it can also
be called from Py_NewInterpreter.
The problem is that it would require exporting autoTLSkey whose scope
is now limited to pystate.c (we could also create a small wrapper
function in pystate.c to delete the autoTLSkey, since it's already
done in PyThreadState_DeleteCurrent and PyThreadState_Delete).
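
The point that only the forking thread survives in the child can be checked from Python on a POSIX system. The sketch below is illustrative only: `threads_seen_by_child` is a hypothetical helper, and it observes Python's own thread bookkeeping (threading.active_count()), not the pthread TLS discussed here.

```python
import os
import threading


def threads_seen_by_child():
    """Fork while a second thread is alive; return the child's thread count."""
    release = threading.Event()
    worker = threading.Thread(target=release.wait)
    worker.start()                      # the parent now has two threads

    read_fd, write_fd = os.pipe()
    pid = os.fork()                     # POSIX only
    if pid == 0:
        # Child: report how many Python threads still exist, then exit
        # immediately without running any cleanup handlers.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: collect the child's answer, reap it, and stop the worker.
    os.close(write_fd)
    count = int(os.read(read_fd, 16))
    os.close(read_fd)
    os.waitpid(pid, 0)
    release.set()
    worker.join()
    return count


if __name__ == "__main__":
    print(threads_seen_by_child())
```

The child should report a single thread even though the parent had two at the time of the fork. Recent Python versions may emit a DeprecationWarning about calling fork() in a multi-threaded process.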
msg134471 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-26 15:18
> Well, after fork, all threads have exited, so you'll be running on the
> behalf of the child process' main - and only - thread, so by
> definition you can't access other threads' thread-specific data, no?

A rather good point :)
How about deleting the mapping (pthread_key_delete) and recreating it
from scratch, then?

> As an alternate solution, I was thinking of calling
> PyThread_delete_key_value(autoTLSkey) in the path of thread bootstrap,
> i.e. starting in Modules/_threadmodule.c t_bootstrap.

That would somewhat alleviate the problem, but only for Python-created
threads. Threads created through other means (for example mod_wsgi, or
database wrappers having their own thread pools) would still face the
original issue.
msg134475 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-26 16:20
> How about deleting the mapping (pthread_key_delete) and recreating it
> from scratch, then?

Sounds good.
So the idea would be to retrieve the current thread's tstate, destroy the current autoTLSkey, re-create it, and re-associate the current tstate to this new key. I just did a quick test on RHEL4 and it works.
PyThread_ReInitTLS looks like a good candidate for that, but it's the same problem, autoTLSkey scope is limited to pystate.c (and I'm not sure that the tstate should be exposed to platform thread implementations).
There's also PyEval_ReInitThreads in ceval.c, exposing the autoTLSkey would make more sense (and it already knows about tstate, of course).
Where would you put it?
msg134476 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-26 16:26
> > How about deleting the mapping (pthread_key_delete) and recreating it
> > from scratch, then?
> 
> Sounds good.
> So the idea would be to retrieve the current thread's tstate, destroy
> the current autoTLSkey, re-create it, and re-associate the current
> tstate to this new key. I just did a quick test on RHEL4 and it works.
> PyThread_ReInitTLS looks like a good candidate for that, but it's the
> same problem, autoTLSkey scope is limited to pystate.c (and I'm not
> sure that the tstate should be exposed to platform thread
> implementations).
> There's also PyEval_ReInitThreads in ceval.c, exposing the autoTLSkey
> would make more sense (and it already knows about tstate, of course).
> Where would you put it?

You could add a new _PyGILState_ReInit() function and call it from
PyOS_AfterFork() or PyEval_ReInitThreads().

(perhaps you also need to add a TLS-destroying function to thread.c, I
haven't looked)
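
The reinitialization idea discussed above (save the current tstate, destroy the key, create a new one, re-associate the tstate) can be sketched with a toy model. This is not the code from the eventual patch: `ToyKeyTable` and `reinit_after_fork` are hypothetical names, and the table is a plain Python stand-in for the pthread key machinery.

```python
class ToyKeyTable:
    """Toy stand-in for the process-wide TLS key table (not real pthread code)."""

    def __init__(self):
        self.next_key = 0
        self.slots = {}           # key -> {thread ID -> value}

    def key_create(self):
        key = self.next_key
        self.next_key += 1
        self.slots[key] = {}
        return key

    def key_delete(self, key):
        del self.slots[key]       # drops every thread's value, stale ones too

    def get(self, key, tid):
        return self.slots[key].get(tid)

    def set(self, key, tid, value):
        self.slots[key][tid] = value


def reinit_after_fork(table, old_key, current_tid):
    """Hypothetical analogue of the proposed _PyGILState_ReInit()."""
    tstate = table.get(old_key, current_tid)   # 1. save the current tstate
    table.key_delete(old_key)                  # 2. destroy the old key
    new_key = table.key_create()               # 3. create a fresh key
    table.set(new_key, current_tid, tstate)    # 4. re-associate the tstate
    return new_key


table = ToyKeyTable()
key = table.key_create()
table.set(key, tid=1, value="main tstate")
table.set(key, tid=2, value="worker tstate")   # thread 2 vanishes at fork

key = reinit_after_fork(table, key, current_tid=1)
print(table.get(key, 1))
print(table.get(key, 2))
```

After the reinit, only the surviving thread has a value under the new key, so a later thread that reuses thread ID 2 starts from an empty slot.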
msg134570 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2011-04-27 14:40
Antoine, I wonder if we can restore PyThread_set_key_value to behave like a canonical TLS api function (always setting) but fix the cases that want to "set if it has not already been set" like the cases you mention.
It is very unorthodox to have such "only set if it hasn't been set before" built into your only TLS function.  This wart on python's TLS api has bugged me for years and it would be cool to fix it.

The init functions (that internally call the python TLS apis) could simply do a TLS get explicitly themselves, to make it explicit and clear that they _want_ to use any pre-existing tls value.

Of course, that won't fix _this_ problem (which is that the main thread's tls value gets inherited on fork).  The right way to do that is to explicitly clear the main thread's TLS value after fork...
msg134573 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-27 14:57
> Antoine, I wonder if we can restore PyThread_set_key_value to behave
> like a canonical TLS api function (always setting) but fix the cases
> that want to "set if it has not already been set" like the cases you
> mention.
> It is very unorthodox to have such "only set if it hasn't been set
> before" built into your only TLS function.  This wart on python's TLS
> api has bugged me for years and it would be cool to fix it.

Well, these functions are supposed to be private so, while I agree their
behaviour is a bit unusual, I'm not sure there's any point to "fix" them
if it shifts the burden of reproducing the old behaviour on another part
of our code.

> The init functions (that internally call the python TLS apis) could
> simply do a TLS get explicitly themselves, to make it explicit and
> clear that they _want_ to use any pre-existing tls value.

Granted.

> Of course, that won't fix _this_ problem (which is that the main
> thread's tls value gets inherited on fork).  The right way to do that
> is to explicitly clear the main thread's TLS value after fork...

The main thread is fine, actually, it's the other (disappeared) threads
which cause this problem when the same TID is re-used.
msg134577 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2011-04-27 15:21
Ah, using the fallback implementation of tls?  Surely this isn't a problem with the pthreads tls, I'd be surprised if it retains TLS values after fork.
msg134580 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-27 15:30
> Ah, using the fallback implementation of tls?  Surely this isn't a 
> problem with the pthreads tls, I'd be surprised if it retains TLS values 
> after fork.

It surprised me too when I found that out, but it's really with the pthread TLS, on RHEL 4 and 5 (fixed in RHEL6).
See the attached test_specific.c test script.

> You could add a new _PyGILState_ReInit() function and call it from
> PyOS_AfterFork() or PyEval_ReInitThreads().

See attached tls_reinit.diff patch.
But I really find this redundant with PyThread_ReInitTLS, because what we're really doing is reinitializing the TLS.
Also, this does the work for every thread implementation, while it's only necessary for pthreads (and for other implementations it will redo the work done by PyThread_ReInitTLS).
So I've written another patch which does this in pthread's PyThread_ReInitTLS.

You've got much more experience than me, so it's really your call.
Actually, I kind of feel bad for adding such a hack for a pthread bug affecting only RHEL 4 and 5, and I'm wondering whether it's really worth fixing.
msg134582 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-27 15:35
> > You could add a new _PyGILState_ReInit() function and call it from
> > PyOS_AfterFork() or PyEval_ReInitThreads().
> 
> See attached tls_reinit.diff patch.

Thank you. I like this patch, except that _PyGILState_ReInit() should be
declared in the appropriate .h file, not in signalmodule.c.
msg134585 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-27 15:42
> Thank you. I like this patch, except that _PyGILState_ReInit() should be
> declared in the appropriate .h file, not in signalmodule.c.

I asked myself this question when writing the patch: what's the convention regarding functions? Should they always be declared in a header with PyAPI_FUNC, or should this be reserved for functions exported through the API?
I've seen a couple of external function declarations in several places, so I was wondering (and since this one isn't meant to be exported, I chose the latter option).
msg134587 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-27 15:46
> > Thank you. I like this patch, except that _PyGILState_ReInit() should be
> > declared in the appropriate .h file, not in signalmodule.c.
> 
> I asked myself this question when writing the patch: what's the
> convention  regarding functions ? Should they always be declared in a
> header with PyAPI_FUNC, or should this be reserved to functions
> exported through the API?

IMO they should always be exposed in header files. It makes them easier
to discover and re-use than with some "extern" decls sprinkled in .c
files. As for PyAPI_FUNC, I think we always use it out of convention,
although it's probably not useful for private API functions.
msg134589 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-04-27 16:04
Here's an updated patch, tested on RHEL4U8.
msg134599 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-27 17:22
New changeset f6feed6ec3f9 by Antoine Pitrou in branch '2.7':
Issue #10517: After fork(), reinitialize the TLS used by the PyGILState_*
http://hg.python.org/cpython/rev/f6feed6ec3f9
msg134604 - (view) Author: Roundup Robot (python-dev) Date: 2011-04-27 17:38
New changeset 7b7ad9a88451 by Antoine Pitrou in branch '3.2':
Issue #10517: After fork(), reinitialize the TLS used by the PyGILState_*
http://hg.python.org/cpython/rev/7b7ad9a88451

New changeset c8f283cd3e6e by Antoine Pitrou in branch 'default':
Issue #10517: After fork(), reinitialize the TLS used by the PyGILState_*
http://hg.python.org/cpython/rev/c8f283cd3e6e
msg134605 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-04-27 17:39
It should be fixed now! Thank you.
msg145160 - (view) Author: Graham Dumpleton (grahamd) Date: 2011-10-08 06:20
Did anyone test this fix for case of fork() being called from Python sub interpreter?

Getting a report of fork() failing in sub interpreters under mod_wsgi that may be caused by this change. Still investigating.

Specifically throwing up error:

  Couldn't create autoTLSkey mapping
msg145164 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-10-08 10:52
Hello,

> Did anyone test this fix for case of fork() being called from Python sub interpreter?
>

Not specifically, unless it's part of the test suite.
Anyway, unless this problem is systematic - which I doubt - it
probably wouldn't have helped.

> Getting a report of fork() failing in sub interpreters under mod_wsgi that may be caused by this change. Still investigating.
>
> Specifically throwing up error:
>
>  Couldn't create autoTLSkey mapping
>

Hmmm.
If you can, try strace or instrument the code (perror() should be
enough) to see why it's failing.
pthread_setspecific() can fail with:
- EINVAL, if the TLS key is invalid (which would be strange since we
call pthread_key_delete()/pthread_key_create() just before)
- or ENOMEM, if you run out of memory/address space

The latter seems much more likely (e.g. if many child processes and
subinterpreters are created).
BTW, if this is a bug report from someone else, tell him to post here,
it'll be easier.
And we don't byte :-)
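
When reading strace or perror() output for the two failure modes listed above, Python's standard errno module can translate between the symbolic names and the platform's numeric values. A small sketch; the message text returned by os.strerror() varies by platform.

```python
import errno
import os

# The two pthread_setspecific() failure codes mentioned above.
for code in (errno.EINVAL, errno.ENOMEM):
    # errno.errorcode maps the numeric value back to its symbolic name.
    print(code, errno.errorcode[code], "-", os.strerror(code))
```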
msg145354 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2011-10-11 17:18
I did a quick test (calling fork() from a subinterpreter), and as
expected, I couldn't reproduce the problem.
So I still favor an OOM condition making pthread_setspecific bail out
with ENOMEM, the other option being a nasty libc bug.
If the problem persists, please open a new issue.
History
Date | User | Action | Args
2013-02-04 19:18:25 | jcea | set | nosy: + jcea
2011-10-11 17:18:59 | neologix | set | messages: + msg145354
2011-10-08 10:52:21 | neologix | set | messages: + msg145164
2011-10-08 06:20:20 | grahamd | set | nosy: + grahamd; messages: + msg145160
2011-04-27 17:39:09 | pitrou | set | status: open -> closed; versions: - Python 3.1; messages: + msg134605; resolution: fixed; stage: commit review -> resolved
2011-04-27 17:38:19 | python-dev | set | messages: + msg134604
2011-04-27 17:22:16 | python-dev | set | nosy: + python-dev; messages: + msg134599
2011-04-27 17:16:36 | pitrou | set | superseder: multiprocessing generates a fatal error ->
2011-04-27 17:16:19 | pitrou | set | dependencies: - multiprocessing generates a fatal error; superseder: multiprocessing generates a fatal error; stage: commit review; versions: + Python 3.1, Python 2.7, Python 3.3
2011-04-27 16:04:34 | neologix | set | files: + tls_reinit.diff; messages: + msg134589
2011-04-27 16:03:29 | neologix | set | files: - thread_invalid_key.diff
2011-04-27 16:03:22 | neologix | set | files: - tls_reinit.diff
2011-04-27 16:03:19 | neologix | set | files: - tls_reinit_bis.diff
2011-04-27 15:46:37 | pitrou | set | messages: + msg134587
2011-04-27 15:42:28 | neologix | set | messages: + msg134585
2011-04-27 15:35:09 | pitrou | set | messages: + msg134582
2011-04-27 15:30:45 | neologix | set | files: + tls_reinit_bis.diff
2011-04-27 15:30:24 | neologix | set | files: + tls_reinit.diff; messages: + msg134580
2011-04-27 15:21:07 | kristjan.jonsson | set | messages: + msg134577
2011-04-27 14:57:55 | pitrou | set | messages: + msg134573
2011-04-27 14:40:52 | kristjan.jonsson | set | messages: + msg134570
2011-04-26 16:26:18 | pitrou | set | messages: + msg134476
2011-04-26 16:20:08 | neologix | set | messages: + msg134475
2011-04-26 15:18:06 | pitrou | set | messages: + msg134471
2011-04-26 08:24:21 | neologix | set | messages: + msg134448
2011-04-25 21:39:32 | pitrou | set | messages: + msg134418
2011-04-25 21:24:05 | neologix | set | nosy: + pitrou; messages: + msg134415
2011-04-15 19:23:32 | kristjan.jonsson | set | messages: + msg133866
2011-04-15 19:22:14 | neologix | set | messages: + msg133865
2011-04-15 19:05:02 | neologix | set | files: + thread_invalid_key.diff; keywords: + patch
2011-04-15 19:00:40 | neologix | set | files: + test_specific.c
2011-04-15 19:00:06 | neologix | set | nosy: + neologix; messages: + msg133864
2011-03-29 10:47:58 | kristjan.jonsson | set | messages: + msg132477
2011-03-28 22:59:28 | dmalcolm | set | messages: + msg132444
2011-03-28 21:58:05 | sandro.tosi | set | nosy: + sandro.tosi; messages: + msg132428
2011-02-10 21:59:21 | dmalcolm | set | nosy: bquinlan, kristjan.jonsson, jnoller, dmalcolm, ysj.ray, lukasz.langa; messages: + msg128343
2011-02-10 19:03:55 | dmalcolm | set | nosy: bquinlan, kristjan.jonsson, jnoller, dmalcolm, ysj.ray, lukasz.langa; messages: + msg128330
2011-02-10 19:02:44 | dmalcolm | set | nosy: + kristjan.jonsson
2011-02-10 19:01:08 | dmalcolm | set | nosy: bquinlan, jnoller, dmalcolm, ysj.ray, lukasz.langa; messages: + msg128329
2011-02-09 22:15:24 | dmalcolm | set | nosy: bquinlan, jnoller, dmalcolm, ysj.ray, lukasz.langa; title: test_concurrent_futures crashes with "Fatal Python error: Invalid thread state for this thread" -> test_concurrent_futures crashes with "--with-pydebug" on RHEL5 with "Fatal Python error: Invalid thread state for this thread"
2010-12-08 14:01:52 | ysj.ray | set | nosy: + ysj.ray; messages: + msg123604
2010-12-05 19:00:39 | bquinlan | set | dependencies: + multiprocessing generates a fatal error; messages: + msg123434
2010-11-25 00:13:13 | dmalcolm | set | messages: + msg122320
2010-11-24 18:48:09 | dmalcolm | set | messages: + msg122295
2010-11-24 16:59:22 | lukasz.langa | set | nosy: + jnoller
2010-11-24 16:54:41 | dmalcolm | set | messages: + msg122290
2010-11-24 16:47:27 | dmalcolm | set | messages: + msg122288
2010-11-24 16:28:37 | dmalcolm | set | messages: + msg122287
2010-11-24 15:13:51 | dmalcolm | set | nosy: + dmalcolm
2010-11-24 00:32:35 | lukasz.langa | set | messages: + msg122255
2010-11-24 00:24:17 | lukasz.langa | create |