classification
Title: Wrap sqlite3_serialize API in sqlite3 module
Type: enhancement Stage: patch review
Components: Library (Lib) Versions:
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Kerrick Staley, erlendaasland
Priority: normal Keywords: patch

Created on 2020-10-04 12:05 by Kerrick Staley, last changed 2021-08-12 20:02 by erlendaasland.

Pull Requests
URL Status Linked Edit
PR 26728 open erlendaasland, 2021-06-14 21:53
Messages (6)
msg377935 - (view) Author: Kerrick Staley (Kerrick Staley) Date: 2020-10-04 12:05
It would be useful to provide a wrapper (in the Python sqlite3 stdlib module) for the sqlite3_serialize API. This API allows you to get a database's content as a byte string, as if you had called open('foo.sqlite3', 'rb').read(). sqlite3_serialize can even be used for :memory: databases, which I think is the most interesting use-case.

Concretely, I'd propose adding a .serialize() method on sqlite3.Connection objects that returns a bytes object representing the serialized database. It would be similar to the .iterdump() method except that it would use the binary format instead of the text format, and it would return all the data at once instead of an iterator (because that's how the underlying sqlite API works).
msg393300 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-09 06:46
What would be the use case for this?
msg393331 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-09 18:19
FYI, this was just checked into the SQLite fossil repo:

$ fossil update trunk
updated-to:   6df3b03e00b1143be8fed3a39a58ce8106302027 2021-05-08 17:18:23 UTC
tags:         trunk
comment:      Enable the sqlite3_serialize() and sqlite3_deserialize() interfaces by default. Omit the SQLITE_ENABLE_DESERIALIZE option and replace it with the SQLITE_OMIT_DESERIALIZE option. (user: drh)


Ref. the SQLite docs: https://www.sqlite.org/compile.html#enable_deserialize
"Future releases of SQLite might enable those interfaces by default and instead offer an SQLITE_OMIT_DESERIALIZE option to leave them out."
msg394181 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-22 09:25
FYI, the pending release log of the upcoming SQLite 3.36.0 now mentions that these API's are enabled by default:

https://sqlite.org/releaselog/3_36_0.html
msg395853 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-06-14 22:05
> What would be the use case for this?

Quoting D. Richard Hipp, from the sqlite-users mailing list, 2018-03-18, FWIW:
  These APIs support the concept of using small databases (small enough 
  to fit in memory) as a container for passing information around. 

  For example:  A client program might communicate with a server using 
  an HTTP request and sending an SQLite database as the POST data.  Or 
  the server might send an SQLite database as the reply.  [...]

  Advantages of using an SQLite database as a container: 

  (1) It is easy to mix text and binary data without having to worry 
  with encodings. 

  (2) Applications can be easily enhanced and extended in 
  backwards-compatible ways by adding new columns and/or tables. 

  (3) Easy to manually view or modify the container content during 
  testing and debugging. (JSON also has this property, but JSON does not 
  work with binary data.) 

  (4) No need to write encoder/decoder logic. 

  (5) There is a high-level query language available to extract the 
  content in an order that might be very different from the order it 
  appears in the file. 

  (6) No need to worry with big-endian vs. little-endian translation, 
  nor UTF8 vs UTF16.  The database handles that automatically.


Using the serialise/deserialise API's to pickle an in-memory database to a data stream is an interesting thought.
msg399480 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-08-12 20:02
I've been fiddling with this between others projects lately; the PR is mostly ready. The only remaining issue is how to include this in the Connection object:
- The serialize API was added as a compile-time option (SQLITE_ENABLE_DESERIALIZE) in SQLite 3.23.0
- The serialize API was included by default from SQLite 3.36.0, but can be disabled using the SQLITE_OMIT_DESERIALIZE compile-time option.

Solution 1: enable the Python sqlite3 serialize API if SQLITE_VERSION_NUMBER >= 3023000 and force people to build their SQLite library _with_ SQLITE_ENABLE_DESERIALIZE defined for SQLite versions 3.23.0 through 3.35.x and _without_ SQLITE_OMIT_DESERIALIZE defined for SQLite versions 3.36.0 and onward.

Solution 2: enable the Python sqlite3 serialize API if SQLITE_VERSION_NUMBER >= 3036000 and force people to build their SQLite library _without_ SQLITE_OMIT_DESERIALIZE defined.

Solution 3: build the Python sqlite3 serialize API as a "sub module" to _sqlite3 (for example _sqlite3._serialize) and add conditionally add it to sqlite3.Connection in Lib/sqlite3/__init__.py or Lib/sqlite3/dbapi2.py.

Solution 4: try to autodetect SQLite compile-time options in setup.py, autogenerate a sqlite-config.h header file, and conditionally enable the API based on that.

I suggest solution 2, as it adds little code (low maintenance), and will require no or minimal changes to the devguide, because (wild guess) in most cases SQLITE_OMIT_DESERIALIZE will not be a widely used compile-time option.

(For the python.org macOS and Windows installers, this will not be a problem, because we control the SQLite compile-time options.)
History
Date User Action Args
2021-08-12 20:02:31erlendaaslandsetmessages: + msg399480
2021-06-14 22:05:01erlendaaslandsetmessages: + msg395853
2021-06-14 21:53:41erlendaaslandsetkeywords: + patch
stage: patch review
pull_requests: + pull_request25318
2021-05-22 09:25:15erlendaaslandsetmessages: + msg394181
2021-05-09 18:19:22erlendaaslandsetmessages: + msg393331
2021-05-09 06:46:40erlendaaslandsetnosy: + erlendaasland
messages: + msg393300
2020-10-04 12:05:11Kerrick Staleycreate