classification
Title: Convert sqlite3 to multi-phase initialisation (PEP 489)
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: berker.peksag, corona10, erlendaasland, miss-islington, petr.viktorin, vstinner
Priority: normal Keywords: patch

Created on 2020-10-17 19:37 by erlendaasland, last changed 2021-07-29 19:03 by erlendaasland.

Pull Requests
URL Status Linked Edit
PR 26537 merged erlendaasland, 2021-06-04 21:58
PR 26745 merged erlendaasland, 2021-06-15 19:37
PR 26840 merged erlendaasland, 2021-06-21 22:34
PR 26884 merged erlendaasland, 2021-06-23 20:11
PR 27155 merged erlendaasland, 2021-07-14 23:06
PR 27156 merged erlendaasland, 2021-07-15 00:02
PR 27273 merged erlendaasland, 2021-07-20 22:03
PR 27456 open erlendaasland, 2021-07-29 19:03
Messages (17)
msg378825 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-17 19:37
Porting sqlite3 to multi-phase initialisation is non-trivial and must be broken into smaller bpo's, often with multiple PR's per bpo. As far as I can see, this is the task list:

1) Heap types (PEP 384): bpo-41861
2) Argument Clinic: bpo-40956
3) Establish global module state
4) Pass `self` iso. callback object to trace, auth, and progress handlers (required for item 7)
5) Use sqlite3_create_collation_v2 iso. sqlite3_create_collation (see next list item)
6) For sqlite3_create_* functions, establish a context struct that contains the target user callback and the state. Modify callers to allocate memory, and the destructor to deallocate
7) Module state (PEP 573)
8) Multi-phase initialisation (PEP 489)

The list may be incomplete.

Separate bpo's for 1) and 2) are opened. Pr. 2020-10-17, 1) is almost done, and 2), part 1 of 5 is awaiting review. It may be convenient to open up sub-bpo's for some of the other tasks as well. For instance, using sqlite3_create_collation_v2() iso. sqlite3_create_collation(). (FYI, this imposes no new SQLite version requirements.)

I'd wait until AC is implemented with moving forward with the rest of the PR's, in order to avoid merge mess. The exception might be item 4, preparing the trace/progress/auth callbacks, which is easy to isolate from the rest.

I've prepared branches for most of these things locally, some split in multiple PR's to ease reviewing.

I've tagged Victor Stinner and Dong-hee Na (hope that's ok), since they've been helpful with reviewing and providing helpful hints thus far. Any input from Berker Peksag would be highly appreciated; after all, he is the maintainer of this module :)

Ref. bpo-1635741 (the grand multi-phase issue)
msg378829 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-17 20:30
Instead of using a custom context struct for sqlite3_create_funcion_v2(), we may just use sqlite3_context_db_handle() to get `self`, and then fetch the state using PyType_GetModuleState(Py_TYPE(self)).
msg378832 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-17 20:46
No, unless we can add `self` as a private context to a sqlite3 database handle, we can't use that shortcut. However, using a custom context struct works, and it's a minor code change.

Proof-of-concept implementation diffstat:
 2 files changed, 54 insertions(+), 10 deletions(-)
msg378905 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-18 21:51
FYI, WIP branch for this bpo is here: https://github.com/erlend-aasland/cpython/commits/bpo-42064/all
This currently includes items 1 though 6. I'll update the WIP branch with a draft of items 7 and 8, hopefully tomorrow.
msg379029 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-19 21:50
Work with item 7 is progressing; see WIP branch for details. Current showstopper is how to fetch module state from Connection.__init__ when it has been overloaded (for example in /Lib/sqlite3/test/factory.py). Guessing that Py_TYPE(self) does not return the sqlite3.Connection type.
msg379172 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-20 22:59
Connection.__init__() fixed by iterating though tp_bases in order to find the correct base class to load the state from. There should be a better way to do this than to dig through type member internals.
msg379326 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-22 19:35
Sorted out item 7 (module state). Item 8 is mostly in place, although I get ref. count errors if I add traverse/clear callbacks, so I've temporarily disabled those.

Proof-of-concept of multi-phase init can be found here: https://github.com/erlend-aasland/cpython/commits/bpo-42064/all
I'll try to break items 3 and 7 up in smaller pieces for the reviewer's convenience.
msg379476 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-10-23 21:13
I've narrowed down the traverse problems to Py_VISIT(state->CacheType). Suspecting this has to do with the cyclic references in cache/connection. Preliminary tests with GC tracking `cache->factory` look promising, but this is a part of the C API that's _completely_ unknown to me (yet), so I'm on thin ice here.
msg380299 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2020-11-03 20:19
FYI, rebased https://github.com/erlend-aasland/cpython/tree/bpo-42064/all onto master, added Petr Viktorin's _PyType_GetModuleByDef() for use in item 7 (module state). I still run into problems in item 8, but I haven't devoted much time for this yet:
Modules/gcmodule.c:116: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small
Enable tracemalloc to get the memory block allocation traceback

object address  : 0x7f861140c6b0
object refcount : 2
object type     : 0x109666678
object type name: type
object repr     : <class 'sqlite3.InterfaceError'>

Fatal Python error: _PyObject_AssertFailed: _PyObject_AssertFailed
Python runtime state: finalizing (tstate=0x7f8631409000)
msg395047 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-06-03 20:29
Global module state has been established by f461a7fc3f8740b9e79e8874175115a3474e5930 (bpo-42862, GH-24203). We can safely migrate static variables into that struct as a next step.
msg395876 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-06-15 12:47
New changeset 10a5c806d4dec6c342dcc9888fbe4fa1fa9b7a1f by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Move sqlite3 types to global state (GH-26537)
https://github.com/python/cpython/commit/10a5c806d4dec6c342dcc9888fbe4fa1fa9b7a1f
msg396411 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-06-23 12:07
New changeset 019ad62afd20e80c74f879aa716e339b992a0bb9 by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Remove stale extern declarations in `sqlite3` headers (GH-26840)
https://github.com/python/cpython/commit/019ad62afd20e80c74f879aa716e339b992a0bb9
msg396416 - (view) Author: miss-islington (miss-islington) Date: 2021-06-23 12:57
New changeset a50e28377bcf37121b55c2de70d95a5386c478f8 by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Move `sqlite3` exceptions to global state, part 1 of 2 (GH-26745)
https://github.com/python/cpython/commit/a50e28377bcf37121b55c2de70d95a5386c478f8
msg397477 - (view) Author: miss-islington (miss-islington) Date: 2021-07-14 11:26
New changeset 05162993fe62e7fa3acebdd0062586b9bf63d46a by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Move `sqlite3` exceptions to global state, part 2 of 2 (GH-26884)
https://github.com/python/cpython/commit/05162993fe62e7fa3acebdd0062586b9bf63d46a
msg397866 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-07-20 10:59
New changeset 4c0deb25ac899fbe4da626ce3cb21f204cdd3aa9 by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Finalise establishing sqlite3 global state (GH-27155)
https://github.com/python/cpython/commit/4c0deb25ac899fbe4da626ce3cb21f204cdd3aa9
msg398298 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-07-27 13:54
New changeset 890e22957d427ee994b85d62dfe4d5a7cbd34ec5 by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Migrate to `sqlite3_create_collation_v2` (GH-27156)
https://github.com/python/cpython/commit/890e22957d427ee994b85d62dfe4d5a7cbd34ec5
msg398470 - (view) Author: miss-islington (miss-islington) Date: 2021-07-29 09:21
New changeset d542742128b634264d5b6796297613975211b43b by Erlend Egeberg Aasland in branch 'main':
bpo-42064: Optimise `sqlite3` state access, part 1 (GH-27273)
https://github.com/python/cpython/commit/d542742128b634264d5b6796297613975211b43b
History
Date User Action Args
2021-07-29 19:03:58erlendaaslandsetpull_requests: + pull_request25982
2021-07-29 09:21:59miss-islingtonsetmessages: + msg398470
2021-07-27 13:54:34petr.viktorinsetmessages: + msg398298
2021-07-20 22:03:33erlendaaslandsetpull_requests: + pull_request25818
2021-07-20 10:59:48petr.viktorinsetmessages: + msg397866
2021-07-15 00:02:28erlendaaslandsetpull_requests: + pull_request25693
2021-07-14 23:06:15erlendaaslandsetpull_requests: + pull_request25692
2021-07-14 11:26:51miss-islingtonsetmessages: + msg397477
2021-06-23 20:11:41erlendaaslandsetpull_requests: + pull_request25462
2021-06-23 12:57:00miss-islingtonsetnosy: + miss-islington
messages: + msg396416
2021-06-23 12:07:01corona10setmessages: + msg396411
2021-06-21 22:34:11erlendaaslandsetpull_requests: + pull_request25421
2021-06-15 19:37:12erlendaaslandsetpull_requests: + pull_request25330
2021-06-15 12:47:41petr.viktorinsetnosy: + petr.viktorin
messages: + msg395876
2021-06-04 21:58:59erlendaaslandsetkeywords: + patch
stage: patch review
pull_requests: + pull_request25128
2021-06-03 20:29:27erlendaaslandsetmessages: + msg395047
2020-11-03 20:19:57erlendaaslandsetmessages: + msg380299
2020-10-23 21:13:18erlendaaslandsetmessages: + msg379476
2020-10-22 19:35:03erlendaaslandsetmessages: + msg379326
2020-10-20 22:59:25erlendaaslandsetmessages: + msg379172
2020-10-19 21:50:53erlendaaslandsetmessages: + msg379029
2020-10-18 21:51:01erlendaaslandsetmessages: + msg378905
2020-10-17 20:46:09erlendaaslandsetmessages: + msg378832
2020-10-17 20:30:00erlendaaslandsetmessages: + msg378829
2020-10-17 19:37:05erlendaaslandcreate