Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap sqlite3_serialize API in sqlite3 module #86096

Closed
KerrickStaley mannequin opened this issue Oct 4, 2020 · 8 comments
Closed

Wrap sqlite3_serialize API in sqlite3 module #86096

KerrickStaley mannequin opened this issue Oct 4, 2020 · 8 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@KerrickStaley
Copy link
Mannequin

KerrickStaley mannequin commented Oct 4, 2020

BPO 41930
Nosy @JelleZijlstra, @erlend-aasland
PRs
  • bpo-41930: Add support for SQLite serialise/deserialise API #26728
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-04-05.14:16:10.053>
    created_at = <Date 2020-10-04.12:05:11.006>
    labels = ['type-feature', 'library']
    title = 'Wrap sqlite3_serialize API in sqlite3 module'
    updated_at = <Date 2022-04-05.14:16:10.052>
    user = 'https://bugs.python.org/KerrickStaley'

    bugs.python.org fields:

    activity = <Date 2022-04-05.14:16:10.052>
    actor = 'JelleZijlstra'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-04-05.14:16:10.053>
    closer = 'JelleZijlstra'
    components = ['Library (Lib)']
    creation = <Date 2020-10-04.12:05:11.006>
    creator = 'Kerrick Staley'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 41930
    keywords = ['patch']
    message_count = 8.0
    messages = ['377935', '393300', '393331', '394181', '395853', '399480', '416780', '416781']
    nosy_count = 3.0
    nosy_names = ['JelleZijlstra', 'erlendaasland', 'Kerrick Staley']
    pr_nums = ['26728']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue41930'
    versions = []

    @KerrickStaley
    Copy link
    Mannequin Author

    KerrickStaley mannequin commented Oct 4, 2020

    It would be useful to provide a wrapper (in the Python sqlite3 stdlib module) for the sqlite3_serialize API. This API allows you to get a database's content as a byte string, as if you had called open('foo.sqlite3', 'rb').read(). sqlite3_serialize can even be used for :memory: databases, which I think is the most interesting use-case.

    Concretely, I'd propose adding a .serialize() method on sqlite3.Connection objects that returns a bytes object representing the serialized database. It would be similar to the .iterdump() method except that it would use the binary format instead of the text format, and it would return all the data at once instead of an iterator (because that's how the underlying sqlite API works).

    @KerrickStaley KerrickStaley mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Oct 4, 2020
    @erlend-aasland
    Copy link
    Contributor

    What would be the use case for this?

    @erlend-aasland
    Copy link
    Contributor

    FYI, this was just checked into the SQLite fossil repo:

    $ fossil update trunk
    updated-to:   6df3b03e00b1143be8fed3a39a58ce8106302027 2021-05-08 17:18:23 UTC
    tags:         trunk
    comment:      Enable the sqlite3_serialize() and sqlite3_deserialize() interfaces by default. Omit the SQLITE_ENABLE_DESERIALIZE option and replace it with the SQLITE_OMIT_DESERIALIZE option. (user: drh)

    Ref. the SQLite docs: https://www.sqlite.org/compile.html#enable_deserialize
    "Future releases of SQLite might enable those interfaces by default and instead offer an SQLITE_OMIT_DESERIALIZE option to leave them out."

    @erlend-aasland
    Copy link
    Contributor

    FYI, the pending release log of the upcoming SQLite 3.36.0 now mentions that these API's are enabled by default:

    https://sqlite.org/releaselog/3_36_0.html

    @erlend-aasland
    Copy link
    Contributor

    What would be the use case for this?

    Quoting D. Richard Hipp, from the sqlite-users mailing list, 2018-03-18, FWIW:
    These APIs support the concept of using small databases (small enough
    to fit in memory) as a container for passing information around.

    For example: A client program might communicate with a server using
    an HTTP request and sending an SQLite database as the POST data. Or
    the server might send an SQLite database as the reply. [...]

    Advantages of using an SQLite database as a container:

    (1) It is easy to mix text and binary data without having to worry
    with encodings.

    (2) Applications can be easily enhanced and extended in
    backwards-compatible ways by adding new columns and/or tables.

    (3) Easy to manually view or modify the container content during
    testing and debugging. (JSON also has this property, but JSON does not
    work with binary data.)

    (4) No need to write encoder/decoder logic.

    (5) There is a high-level query language available to extract the
    content in an order that might be very different from the order it
    appears in the file.

    (6) No need to worry with big-endian vs. little-endian translation,
    nor UTF8 vs UTF16. The database handles that automatically.

    Using the serialise/deserialise API's to pickle an in-memory database to a data stream is an interesting thought.

    @erlend-aasland
    Copy link
    Contributor

    I've been fiddling with this between others projects lately; the PR is mostly ready. The only remaining issue is how to include this in the Connection object:

    • The serialize API was added as a compile-time option (SQLITE_ENABLE_DESERIALIZE) in SQLite 3.23.0
    • The serialize API was included by default from SQLite 3.36.0, but can be disabled using the SQLITE_OMIT_DESERIALIZE compile-time option.

    Solution 1: enable the Python sqlite3 serialize API if SQLITE_VERSION_NUMBER >= 3023000 and force people to build their SQLite library _with_ SQLITE_ENABLE_DESERIALIZE defined for SQLite versions 3.23.0 through 3.35.x and _without_ SQLITE_OMIT_DESERIALIZE defined for SQLite versions 3.36.0 and onward.

    Solution 2: enable the Python sqlite3 serialize API if SQLITE_VERSION_NUMBER >= 3036000 and force people to build their SQLite library _without_ SQLITE_OMIT_DESERIALIZE defined.

    Solution 3: build the Python sqlite3 serialize API as a "sub module" to _sqlite3 (for example _sqlite3._serialize) and add conditionally add it to sqlite3.Connection in Lib/sqlite3/init.py or Lib/sqlite3/dbapi2.py.

    Solution 4: try to autodetect SQLite compile-time options in setup.py, autogenerate a sqlite-config.h header file, and conditionally enable the API based on that.

    I suggest solution 2, as it adds little code (low maintenance), and will require no or minimal changes to the devguide, because (wild guess) in most cases SQLITE_OMIT_DESERIALIZE will not be a widely used compile-time option.

    (For the python.org macOS and Windows installers, this will not be a problem, because we control the SQLite compile-time options.)

    @JelleZijlstra
    Copy link
    Member

    New changeset a755124 by Erlend Egeberg Aasland in branch 'main':
    bpo-41930: Add support for SQLite serialise/deserialise API (GH-26728)
    a755124

    @JelleZijlstra
    Copy link
    Member

    Thanks, this will be in Python 3.11.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants