This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Fatal Python Error in sqlite3 Python 3.10
Type: crash Stage:
Components: Extension Modules, Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: erlendaasland, hydroflask, pablogsal, vstinner
Priority: normal Keywords:

Created on 2022-03-15 00:18 by hydroflask, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description Edit
sqlite3_fatal_python_error.py hydroflask, 2022-03-15 00:18
Messages (16)
msg415212 - (view) Author: (hydroflask) Date: 2022-03-15 00:18
_destructor in connection.c in Python 3.10+ now calls `PyGILState_Ensure()`, this is a problem because if the destructor is being called while the thread is being torn down it will cause an unbalanced/erroneous call to "PyEval_RestoreThread" in PyGILState_Ensure which will eventually trigger a Fatal Python Error. A perfect repro has been attached, should be run on Linux.

My recommended fix is to call sqlite3_close() within Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS, and manually Py_DECREF all connection-related functions afterward.
msg415216 - (view) Author: (hydroflask) Date: 2022-03-15 01:37
If you connect to the sqlite3 database in the example without using the factory, it will simply segfault.
msg415227 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 08:58
I'm removing 3.11 from the versions field:


    $ python3.11 sqlite3_fatal_python_error.py
    Exception ignored in: <function deleting_conn.__del__ at 0x10af55580>
    Traceback (most recent call last):
      File "/Users/erlendaasland/src/cpython-build/sqlite3_fatal_python_error.py", line 10, in __del__
        self.close()
        ^^^^^^^^^^^^
    sqlite3.ProgrammingError: SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 123145443913728 and this is thread id 4506211840.
msg415230 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 10:08
> A perfect repro has been attached, should be run on Linux.

FTR, the repo works on macOS 12.2.1 as well:

    $ python3.10 ./sqlite3_fatal_python_error.py
    Fatal Python error: PyThreadState_Get: the function must be called with the GIL held, but the GIL is released (the current Python thread state is NULL)
    Python runtime state: initialized

    Thread 0x00000001126e7600 (most recent call first):
      File "/Users/erlendaasland/src/cpython-build/./sqlite3_fatal_python_error.py", line 52 in <module>
    zsh: abort      python3.10 ./sqlite3_fatal_python_error.py
msg415231 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 10:12
> My recommended fix is to call sqlite3_close() within Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS, and manually Py_DECREF all connection-related functions afterward.

3.11 already drops the GIL before sqlite3_close_v2(); backporting this change to 3.10 results in the following assertion error:

    $ ./python.exe sqlite3_fatal_python_error.py  
    Assertion failed: (!_PyMem_IsPtrFreed(tstate->interp)), function is_tstate_valid, file ceval.c, line 155.
msg415277 - (view) Author: (hydroflask) Date: 2022-03-15 19:10
If you connect with check_same_thread=False, I believe the issue may still present itself on 3.11+
msg415278 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 19:12
> If you connect with check_same_thread=False, I believe the issue may still present itself on 3.11+

Nope, works swell on my Mac. Do you see this on Linux with latest 3.11?
msg415279 - (view) Author: (hydroflask) Date: 2022-03-15 19:25
I don't see it immediately but I think it's still possible to happen since all the same offending code is in place. There are two reasosn why it probably doesn't happen in 3.11+:

1) because something is deferring calling the finalizer for the zero-ref object to another thread instead on immediately on the thread that is dying as in <=3.10

2) In 3.11+ it is fixed so that PyGILState_Ensure can be called in a sequence started by PyGILState_Release. @vstinner is this correct?


If the reason it doesn't happen in 3.11+ is because of 1) then I don't think that is specified to happen anywhere and changing the GC in future versions could theoretically trigger the same bug. If it is 2) then it is fixed in a more robust location.

IMO, the most robust fix is to destroy the function callbacks in connection_close() and avoid using the destructor_callback and calling PyGILState_Ensure() altogether.
msg415280 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 19:29
All callbacks triggered from external libraries must protect Python C API calls. You may call it offending, but that is the reality; a callback may trigger at any point in time, so we need to make sure the GIL is held before calling any Py API. That is also the case for the destructor.
msg415282 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 19:32
See also bpo-44304
msg415283 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 19:34
See also Petrs msg400205:

> [...] I recommend that any function passed to SQLite (and only those) should
>  - be named `*_callback`, for clarity
>  - acquire the GIL at the very start
>  - release the GIL at the very end
msg415284 - (view) Author: (hydroflask) Date: 2022-03-15 19:36
> All callbacks triggered from external libraries must protect Python C API calls. You may call it offending, but that is the reality; a callback may trigger at any point in time, so we need to make sure the GIL is held before calling any Py API. That is also the case for the destructor.

Sure but I'm suggesting sidestepping and not using the destructor_callback entirely. You can DECREF the callbacks manually in connection_close() after calling sqlite3_close_v2(). Doing this makes sense to me since the destructor may be called under special conditions.
msg415285 - (view) Author: (hydroflask) Date: 2022-03-15 19:43
If PyGILState_Ensure() has been fixed to become re-entrant during PyGILState_Release() in 3.11+ then I believe that change should be backported to 3.10. @vstinner would know
msg415287 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 19:49
FYI, PyGILState_Ensure may be called by a thread as many times as you want, and it has been so for a very long time.
msg415288 - (view) Author: (hydroflask) Date: 2022-03-15 19:54
So what is causing this bug in 3.10 is the following stack:

thread_entry_point()
  PyGILState_Ensure()
  PyGILState_Release()
    ...
      PyGILState_Ensure()

The core issue is that PyGILState_Ensure is getting called while the tstate itself is in the process of being destroyed. This causes an invalid state and eventually results in a segfault or "Fatal Python Error."

I think the most robust fix is to allow re-entrancy to PyGILState_Release/PyGILState_Ensure. If that is prohibitively complex and/or is not specified to be allowed, I believe the most robust fix is to avoid using sqlite3 destructor callbacks to DECREF Python objects.
msg415294 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-03-15 20:42
Maybe related: bpo-20891
History
Date User Action Args
2022-04-11 14:59:57adminsetgithub: 91175
2022-03-15 20:42:59erlendaaslandsetmessages: + msg415294
2022-03-15 19:54:57hydroflasksetmessages: + msg415288
2022-03-15 19:49:20erlendaaslandsetmessages: + msg415287
2022-03-15 19:43:31hydroflasksetmessages: + msg415285
2022-03-15 19:36:21hydroflasksetmessages: + msg415284
2022-03-15 19:34:36erlendaaslandsetmessages: + msg415283
2022-03-15 19:32:13erlendaaslandsetnosy: + pablogsal
messages: + msg415282
2022-03-15 19:29:14erlendaaslandsetmessages: + msg415280
2022-03-15 19:25:46hydroflasksetnosy: + vstinner
messages: + msg415279
2022-03-15 19:12:24erlendaaslandsetmessages: + msg415278
2022-03-15 19:10:34hydroflasksetmessages: + msg415277
2022-03-15 10:12:30erlendaaslandsetmessages: + msg415231
2022-03-15 10:08:40erlendaaslandsetmessages: + msg415230
2022-03-15 08:58:19erlendaaslandsetmessages: + msg415227
versions: - Python 3.11
2022-03-15 01:37:46hydroflasksetmessages: + msg415216
2022-03-15 00:18:22hydroflaskcreate