classification
Title: _PyThreadState_Init and fork race leads to inconsistent key list
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: Ján Stanček, Thomas Mortensson, cstratak, davin, matrixise, neologix, petr.viktorin, vstinner, zach.ware
Priority: normal Keywords: patch

Created on 2017-02-24 09:03 by Ján Stanček, last changed 2020-04-27 04:19 by zach.ware. This issue is now closed.

Files
File name Uploaded Description Edit
bz1268226_reproducer2.tar.gz Ján Stanček, 2017-02-24 09:03 python patch that makes it easier to reproduce
fork_hold_keymutex.patch petr.viktorin, 2017-03-01 11:14
python.patch cstratak, 2017-03-16 17:13
0001-Protect-key-list-during-fork.patch cstratak, 2017-03-22 15:56
Pull Requests
URL Status Linked Edit
PR 783 closed cstratak, 2017-03-23 13:39
PR 5141 closed petr.viktorin, 2018-01-09 19:44
Messages (18)
msg288510 - (view) Author: Ján Stanček (Ján Stanček) Date: 2017-02-24 09:03
Following crash is sporadically observed in RHEL7 anaconda:

(gdb) f 0
#0  PyThread_ReInitTLS () at /usr/src/debug/Python-2.7.5/Python/thread.c:411
411             if (p->id != id) {
(gdb) list
406         keymutex = PyThread_allocate_lock();
407
408         /* Delete all keys which do not match the current thread id */
409         q = &keyhead;
410         while ((p = *q) != NULL) {
411             if (p->id != id) {
412                 *q = p->next;
413                 free((void *)p);
414                 /* NB This does *not* free p->value! */
415             }
(gdb) p p
$1 = (struct key *) 0x3333333333333333
(gdb) p keyhead
$2 = (struct key *) 0x3333333333333333

key list is protected by keymutex (except for PyThread_ReInitTLS), but there doesn't seem to be any protection against concurrent fork(). What seems to happen is fork() at the moment when key list is not consistent.

For example, if I initialize new key to 0xfe:

static struct key *find_key(int key, void *value)
    // find_key with extra memset()
    ...
    p = (struct key *)malloc(sizeof(struct key));
    memset(p, 0xfe, sizeof(struct key));
    if (p != NULL) {
        p->id = id;
        p->key = key;
        p->value = value;
        p->next = keyhead;
        keyhead = p;
    }
    ...

Looking at disassembly, compiler reordered last 2 writes:

   0x00000000004fcb50 <+272>:   callq  0x413d10 <malloc@plt>
   0x00000000004fcb55 <+277>:   movabs $0xfefefefefefefefe,%rcx
   0x00000000004fcb5f <+287>:   test   %rax,%rax
   0x00000000004fcb62 <+290>:   mov    %rcx,(%rax)
   0x00000000004fcb65 <+293>:   mov    %rcx,0x8(%rax)
   0x00000000004fcb69 <+297>:   mov    %rcx,0x10(%rax)
   0x00000000004fcb6d <+301>:   mov    %rcx,0x18(%rax)
   0x00000000004fcb71 <+305>:   je     0x4fcaff <PyThread_set_key_value+191>
   0x00000000004fcb73 <+307>:   mov    0x2f1e26(%rip),%rdx        # 0x7ee9a0 <keyhead>
   0x00000000004fcb7a <+314>:   mov    0x2f1dff(%rip),%rdi        # 0x7ee980 <keymutex>
   0x00000000004fcb81 <+321>:   xor    %r14d,%r14d
   0x00000000004fcb84 <+324>:   mov    %rbp,0x8(%rax)
   0x00000000004fcb88 <+328>:   mov    %r12d,0x10(%rax)
   0x00000000004fcb8c <+332>:   mov    %r13,0x18(%rax)
   0x00000000004fcb90 <+336>:   mov    %rax,0x2f1e09(%rip)        # 0x7ee9a0 <keyhead>
   0x00000000004fcb97 <+343>:   mov    %rdx,(%rax)

Now consider what happens, when different threads call fork() in between these 2 writes: we updated keyhead, but keyhead->next has not been updated yet.

Now when anaconda hangs, I get:

(gdb) list
407         keymutex = PyThread_allocate_lock();
408
409         /* Delete all keys which do not match the current thread id */
410         q = &keyhead;
411         while ((p = *q) != NULL) {
412             if (p->id != id) {
413                 *q = p->next;
414                 free((void *)p);
415                 /* NB This does *not* free p->value! */
416             }
(gdb) p p
$1 = (struct key *) 0xfefefefefefefefe
(gdb) p keyhead
$2 = (struct key *) 0xfefefefefefefefe


Here's how I think we get into this state:

-------------------------> thread 1
# has GIL
Thread.start
  _start_new_thread(self.__bootstrap, ())
    PyThread_start_new_thread(t_bootstrap)
      # spawns thread 3

-------------------------> thread 2
...
  # waiting for GIL

-------------------------> thread 3
t_bootstrap
  _PyThreadState_Init # does not have GIL yet at this point
    _PyGILState_NoteThreadState
      PyThread_set_key_value(autoTLSkey, (void *)tstate)
        find_key
          # key list is temporarily not consistent
          # due to compiler reordering couple writes in find_key

-------------------------> thread 1
continuing Thread.start
  self.__started.wait()
    Event.wait()
      self.__cond.wait 
        Condition.wait()
          waiter = _allocate_lock()
          waiter.acquire()
            lock_PyThread_acquire_lock
              Py_BEGIN_ALLOW_THREADS
                PyEval_SaveThread
                  PyThread_release_lock(interpreter_lock);

-------------------------> thread 2
...
  # acquired GIL
  os.fork() # forks inconsistent list

-------------------------> child
PyOS_AfterFork()
  PyThread_ReInitTLS()
    SIGSEGV

Attached patch for python makes it easier to reproduce, by adding delays to couple places to make window key list is not consistent larger.
msg288753 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2017-03-01 11:14
Here is a proof of concept patch from Jaroslav Škarvada. It fixes the problem by holding the mutex used for PyThread_create_key while forking.

To make it more than PoC it needs adding _PyThread_AcquireKeyLock and _ReleaseKeyLock (similar to _PyImport_AcquireLock() etc.) and calling those.  Other than that, does this approach look reasonable?
msg289723 - (view) Author: Charalampos Stratakis (cstratak) * Date: 2017-03-16 17:13
In order to reproduce:

Apply the python.patch from bz1268226_reproducer2.tar.gz

Compile python

Run the reproduce4.py from bz1268226_reproducer2.tar.gz

As indicated by the reproducer, the status returned by os.wait() for the child is 139.

I will refine a bit the patch and work on a PR.
msg289993 - (view) Author: Charalampos Stratakis (cstratak) * Date: 2017-03-22 15:56
Patch for protecting the key list while forking.
msg299299 - (view) Author: Thomas Mortensson (Thomas Mortensson) Date: 2017-07-27 09:40
Hey, any status update on this bug? Suffered a similar issue on a Centos 6.5 kernel when spawning multiple processes in a Twisted environment. Is this PR targeted for inclusion into the source tree? 

Thanks,

Tom
msg309700 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2018-01-09 14:58
I'm a bit out of my depth here. Victor, could you chime in?

The problem with Harris' patch is that, once fork() is protected by the thread lock, acquiring that lock (by e.g. calling `PyGILState_GetThisThreadState`) from an `atfork` handler hangs deadlocks.

I thought that can be solved by doing the locking in an atfork handler, but that's not working out -- CPython's pthread_atfork (which would lock _PyThread_AcquireKeyLock for the duration of the fork) would need to be called *after* an extension's pthread_atfork (which needs the thread lock temporarily).
msg309705 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-01-09 15:58
There is a more general issue for any lock and fork(): bpo-6721, "Locks in the standard library should be sanitized on fork".
msg309712 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-01-09 16:21
Python 3 is not affected by this issue because it uses native thread locale storage (TLS):

* pthread: pthread_getspecific() / pthread_setspecific()
* Windows: TlsGetValue() / TlsSetValue()

I'm not sure that it's doable to backport such enhancement, since Python 2.7 supports many thread implementations, not only NT (Windows) and pthread:

* Python/thread_atheos.h
* Python/thread_beos.h
* Python/thread_cthread.h
* Python/thread_lwp.h
* Python/thread_nt.h
* Python/thread_os2.h
* Python/thread_pth.h
* Python/thread_pthread.h
* Python/thread_sgi.h
* Python/thread_solaris.h
* Python/thread_wince.h

Maybe it's doable for a Linux vendor, but it's going to be a large change that has to be maintained downstream :-/
msg309725 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2018-01-09 19:18
Gah! The more I look into locks & forks ... the more I learn, to put it mildly.

Instead of piling on workarounds, I'll try my hand at using native thread-local storage for pthread, and avoid the locking altogether.

Hopefully that can make it in as a bugfix? At least for this bug, it most likely *is* the most appropriate fix -- though I can only fix it for pthread.
msg309731 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2018-01-09 19:45
WIP pull request: https://github.com/python/cpython/pull/5141
(I'll not get back to this for a few days, sadly.)
msg316649 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2018-05-15 12:22
Hi Petr,

Do you continue this patch/issue?
Thank you
msg316655 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2018-05-15 13:31
Not immediately, but it is on my TODO list.
If anyone wants to tackle it in the mean time, I'd be happy to answer any questions
msg317077 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-19 00:20
Oh, it seems like I was wrong in my previous comment. Python 2.7 code base is already designed to support native TLS. It's just that we only implement native TLS on Windows. So yeah, it seems doable to implement native TLS for pthread.

History of Py_HAVE_NATIVE_TLS:

commit 8d98d2cb95ac37147a4de5a119869211e8351324
Author: Mark Hammond <mhammond@skippinet.com.au>
Date:   Sat Apr 19 15:41:53 2003 +0000

    New PyGILState_ API - implements pep 311, from patch 684256.

commit 00f2df495a6fcb40d70243c34f296f26ccc72719
Author: Kristján Valur Jónsson <kristjan@ccpgames.com>
Date:   Fri Jan 9 20:03:27 2009 +0000

    Issue 3582.  Improved thread support and TLS for Windows
msg317184 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2018-05-20 16:28
pthread is not generally compatible with int, so it can't be used with Python 2's API.
This was changed in py3 with PEP 539, which however makes it so that the old TLS API is only available if PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT.
msg317273 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-22 13:02
> the old TLS API is only available if PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT.

PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is defined on most (pthread) platforms, no? I understood that the PEP 539 is mostly designed for Cygwin, a platform which is not officially supported by Python. At least, PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is set to 1 on my Fedora 27 (Linux).

I propose to cast pthread_key_create() result to int, but only define PyThread_create_key() in Python/thread_pthread.h if PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is defined.

It means that the pthread implementation of Python would still have this bug (race condition) if PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is not defined. But backporting the PEP 539 to Python 2.7 doesn't seem worth it.

What do you think?
msg317276 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2018-05-22 13:18
> PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is defined on most (pthread) platforms, no?

> I propose to cast pthread_key_create() result to int, but only define PyThread_create_key() in Python/thread_pthread.h if PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is defined.

I don't think that's a good idea. Changing API, even for platforms that aren't officially supported, sounds very harsh this late in the release cycle.

But! I suppose we could fix the bug only for platforms with PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT. Other platforms would keep the current implementation -- they'd still have the bug, but the API would stay unchanged.
msg317291 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-22 14:02
> But! I suppose we could fix the bug only for platforms with PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT.

Yes, this is my proposal.

>> I propose to cast pthread_key_create() result to int, but only define PyThread_create_key() in Python/thread_pthread.h if PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is defined.

> I don't think that's a good idea. Changing API, even for platforms that aren't officially supported, sounds very harsh this late in the release cycle.

Which API change? I don't propose to modify the existing public C API "int PyThread_create_key(void)". I only propose to change it's implementation to the native pthread API when PTHREAD_KEY_T_IS_COMPATIBLE_WITH_INT is defined.
msg367389 - (view) Author: Zachary Ware (zach.ware) * (Python committer) Date: 2020-04-27 04:19
As 2.7 is now EOL, I'm closing the issue.
History
Date User Action Args
2020-04-27 04:19:17zach.waresetstatus: open -> closed

nosy: + zach.ware
messages: + msg367389

resolution: out of date
stage: patch review -> resolved
2018-05-22 14:02:36vstinnersetmessages: + msg317291
2018-05-22 13:18:02petr.viktorinsetmessages: + msg317276
2018-05-22 13:02:13vstinnersetmessages: + msg317273
2018-05-20 16:28:17petr.viktorinsetmessages: + msg317184
2018-05-19 00:20:50vstinnersetmessages: + msg317077
2018-05-15 13:31:06petr.viktorinsetmessages: + msg316655
2018-05-15 12:22:45matrixisesetnosy: + matrixise
messages: + msg316649
2018-01-09 19:45:42petr.viktorinsetmessages: + msg309731
2018-01-09 19:44:33petr.viktorinsetstage: patch review
pull_requests: + pull_request5000
2018-01-09 19:18:38petr.viktorinsetmessages: + msg309725
2018-01-09 16:21:41vstinnersetmessages: + msg309712
2018-01-09 15:58:32vstinnersetmessages: + msg309705
2018-01-09 14:58:10petr.viktorinsetmessages: + msg309700
2017-07-27 09:40:04Thomas Mortenssonsetnosy: + Thomas Mortensson
messages: + msg299299
2017-07-19 10:03:13vstinnersetnosy: + vstinner
2017-04-05 16:04:29pitrousetnosy: + neologix
2017-04-05 04:54:10rhettingersetnosy: + davin
2017-03-23 13:39:44cstrataksetpull_requests: + pull_request687
2017-03-22 15:56:56cstrataksetfiles: + 0001-Protect-key-list-during-fork.patch

messages: + msg289993
2017-03-16 17:13:56cstrataksetfiles: + python.patch

messages: + msg289723
2017-03-01 11:14:07petr.viktorinsetfiles: + fork_hold_keymutex.patch

nosy: + petr.viktorin
messages: + msg288753

keywords: + patch
2017-02-24 15:56:38cstrataksetnosy: + cstratak
2017-02-24 09:03:58Ján Stančekcreate