Message182830
> > - how do you know the crash really happens because of thread 5?
>
> All other threads are blocked on locks or condition variables, it's
> the only runnable thread.
Hm, you are right.
> > Another question: are threads being started or stopped while the thread local object is being deleted?
>
> > From the stack trace, thread 2 is being stopped.
>
> I guess the problem is similar to above: thread 2 is in the middle of
> stopping, its TLS dict is deallocated, which triggers the thread local
> object deallocation, which releases the GIL. Thread 5 becomes running,
> and must somehow access thread 2 tstate.
I've read the code several times and I find it unlikely that it's the
cause of the problem:
- the thread state's thread-local dict (tstate->dict) is deallocated
using Py_CLEAR(), meaning it's unreachable from other threads when
deallocating one of the values releases the GIL
- the thread-local object's deallocator checks that tstate->dict is
non-NULL before using it; the only thing that could go wrong is if
PyDict_GetItem() releases the GIL, which sounds unlikely on tstate->dict
(also, I've checked that threadmodule.c holds the GIL when inserting and
removing thread states from the interpreter's thread states list; it
would be more future-proof for local_dealloc to use pystate.c's
HEAD_LOCK() and HEAD_UNLOCK() APIs, though)
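
The first point depends on the order of operations in Py_CLEAR(): the field is set to NULL before the reference is dropped, so a finalizer that runs during the deallocation can no longer reach the dict through the thread state. Below is a rough Python model of that ordering, not the C code itself. FakeThreadState, Probe, py_clear and demo are hypothetical names made up for this illustration, and it relies on CPython's reference counting to run __del__ immediately.

```python
class FakeThreadState:
    """Hypothetical stand-in for the C thread state; only models tstate->dict."""

    def __init__(self):
        self.dict = {}


class Probe:
    """Value stored in the dict; its finalizer records what tstate.dict is."""

    def __init__(self, tstate, seen):
        self.tstate = tstate
        self.seen = seen

    def __del__(self):
        # Runs while the old dict is being torn down.
        self.seen.append(self.tstate.dict)


def py_clear(tstate):
    """Model of the Py_CLEAR() ordering: detach the field, then drop the ref."""
    tmp = tstate.dict
    tstate.dict = None  # the field is cleared first...
    del tmp             # ...and only then is the last reference released


def demo():
    seen = []
    ts = FakeThreadState()
    ts.dict["value"] = Probe(ts, seen)
    py_clear(ts)
    return seen


if __name__ == "__main__":
    print(demo())
```

On CPython the finalizer should observe None rather than the dict, which is the property the first bullet relies on.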
I'm wondering if there's something else interfering here. My attempts at
writing a stress-test script have failed to produce any crash.
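
For reference, a stress test along these lines could look like the sketch below. This is not the script I actually ran, only a minimal illustration of the scenario discussed above: worker threads start and stop while a shared threading.local object loses its last reference, and each stored value has a finalizer that makes a blocking call so the GIL is released in the middle of deallocation. Payload, worker, stress and the finalized list are names invented for this sketch, and the round and thread counts are arbitrary.

```python
import threading
import time

# list.append is atomic under the GIL, so no extra lock is needed here.
finalized = []


class Payload:
    """Value kept in a thread-local; its finalizer gives up the GIL."""

    def __del__(self):
        time.sleep(0)  # blocking call: lets another thread run mid-dealloc
        finalized.append(1)


def worker(local_obj):
    # Store a value in this thread's slot, then return so the thread exits
    # and its thread state (including the TLS dict) is torn down.
    local_obj.payload = Payload()


def stress(rounds=200, threads_per_round=8):
    """Start and stop threads while the shared local object is being released."""
    created = 0
    for _ in range(rounds):
        local_obj = threading.local()
        threads = [
            threading.Thread(target=worker, args=(local_obj,))
            for _ in range(threads_per_round)
        ]
        for t in threads:
            t.start()
        # Drop the main thread's reference; the last worker to finish ends up
        # deallocating the local object while other threads are still stopping.
        del local_obj
        for t in threads:
            t.join()
        created += threads_per_round
    return created


if __name__ == "__main__":
    n = stress()
    print("payloads created:", n, "finalized:", len(finalized))
```

A run that completes without a segfault does not prove the code is correct, it only means this particular interleaving was not hit.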

Date | User | Action | Args
2013-02-23 22:35:38 | pitrou | set | recipients: + pitrou, r.david.murray, neologix, Albert.Zeyer
2013-02-23 22:35:38 | pitrou | link | issue17263 messages
2013-02-23 22:35:38 | pitrou | create |