Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[sqlite3] sqlite3_prepare_v2 micro optimisation: pass string size #88331

Closed
erlend-aasland opened this issue May 18, 2021 · 9 comments
Closed
Labels
3.11 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage topic-sqlite3

Comments

@erlend-aasland
Copy link
Contributor

BPO 44165
Nosy @berkerpeksag, @serhiy-storchaka, @pablogsal, @erlend-aasland
PRs
  • bpo-44165: optimise sqlite3 statement preparation by passing string size #26206
  • bpo-44165: pysqlite_statement_create() returns a PyObject *, not an int #26484
  • bpo-44165: Add tests for sqlite3 SQL string length #26485
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-05-18.07:23:06.916>
    labels = ['extension-modules', '3.11', 'performance']
    title = '[sqlite3] sqlite3_prepare_v2 micro optimisation: pass string size'
    updated_at = <Date 2021-06-02.13:26:30.592>
    user = 'https://github.com/erlend-aasland'

    bugs.python.org fields:

    activity = <Date 2021-06-02.13:26:30.592>
    actor = 'erlendaasland'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2021-05-18.07:23:06.916>
    creator = 'erlendaasland'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44165
    keywords = ['patch']
    message_count = 8.0
    messages = ['393856', '393857', '393978', '393980', '394900', '394901', '394904', '394905']
    nosy_count = 4.0
    nosy_names = ['berker.peksag', 'serhiy.storchaka', 'pablogsal', 'erlendaasland']
    pr_nums = ['26206', '26484', '26485']
    priority = 'low'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue44165'
    versions = ['Python 3.11']

    @erlend-aasland
    Copy link
    Contributor Author

    The signature of sqlite3_prepare_v2 is as follows:
    int sqlite3_prepare_v2(
      sqlite3 *db,            /* Database handle */
      const char *zSql,       /* SQL statement, UTF-8 encoded */
      int nByte,              /* Maximum length of zSql in bytes. */
      sqlite3_stmt **ppStmt,  /* OUT: Statement handle */
      const char **pzTail     /* OUT: Pointer to unused portion of zSql */
    );

    Quoting from the SQLite docs[1]:
    "If the caller knows that the supplied string is nul-terminated, then there is a small performance advantage to passing an nByte parameter that is the number of bytes in the input string including the nul-terminator."

    sqlite3_prepare_v2 is used five places in the sqlite3 module. We can easily provide the string size in those places.

    [1] https://sqlite.org/c3ref/prepare.html

    @erlend-aasland erlend-aasland added 3.11 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage labels May 18, 2021
    @erlend-aasland
    Copy link
    Contributor Author

    Note, PR 26206 does not include statement creation in _pysqlite_connection_begin (Modules/_sqlite/connection.c). That needs further refactoring, so I'll add that in a separate PR if PR 26206 is accepted.

    @erlend-aasland
    Copy link
    Contributor Author

    Regarding the maximum length of an SQL string, quoting from https://sqlite.org/limits.html:
    "The current implementation will only support a string or BLOB length up to 2^31-1 or 2147483647. And some built-in functions such as hex() might fail well before that point. In security-sensitive applications it is best not to try to increase the maximum string and blob length. In fact, you might do well to lower the maximum string and blob length to something more in the range of a few million if that is possible."

    The size returned from functions such as PyUnicode_AsUTF8AndSize is Py_ssize_t. I suggest checking the passed SQL string size and raising OverflowError if the SQL string is larger than SQLITE_MAX_LENGTH.

    @erlend-aasland
    Copy link
    Contributor Author

    SQLITE_TOOBIG is currently mapped to sqlite3.DataError. In order to keep the current behaviour, DataError must be raised.

    @pablogsal
    Copy link
    Member

    New changeset a384b6c by Erlend Egeberg Aasland in branch 'main':
    bpo-44165: Optimise sqlite3 statement preparation by passing string size (GH-26206)
    a384b6c

    @pablogsal
    Copy link
    Member

    Erlend, check out #26206 (comment). After that we can close this issue

    @pablogsal
    Copy link
    Member

    New changeset fbf25b8 by Erlend Egeberg Aasland in branch 'main':
    bpo-44165: pysqlite_statement_create now returns a Py object, not an int (GH-26484)
    fbf25b8

    @erlend-aasland
    Copy link
    Contributor Author

    Erlend, check out #26206 (comment). After that we can close this issue

    See msg393857: I'll add a PR for _pysqlite_connection_begin as well. After that, we can close this.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @erlend-aasland
    Copy link
    Contributor Author

    IMO, the added complexity is not worth it to change the remaining usages. Closing this issue.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes extension-modules C modules in the Modules dir performance Performance or resource usage topic-sqlite3
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants