Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate sqlite3 module to _v2 API to enhance performance #53549

Closed
feuermurmel mannequin opened this issue Jul 19, 2010 · 20 comments
Closed

Migrate sqlite3 module to _v2 API to enhance performance #53549

feuermurmel mannequin opened this issue Jul 19, 2010 · 20 comments
Labels
3.7 (EOL) end of life extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@feuermurmel
Copy link
Mannequin

feuermurmel mannequin commented Jul 19, 2010

BPO 9303
Nosy @pitrou, @vstinner, @larryhastings, @akheron, @berkerpeksag, @dimaqq, @JimJJewett, @serhiy-storchaka, @palaviv
PRs
  • bpo-9303: Migrate sqlite3 module to _v2 API to enhance performance #359
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Files
  • sqlite.patch
  • 9303.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-03-03.10:59:13.004>
    created_at = <Date 2010-07-19.12:08:04.906>
    labels = ['extension-modules', 'type-feature', '3.7']
    title = 'Migrate sqlite3 module to _v2 API to enhance performance'
    updated_at = <Date 2017-04-09.09:12:02.016>
    user = 'https://bugs.python.org/feuermurmel'

    bugs.python.org fields:

    activity = <Date 2017-04-09.09:12:02.016>
    actor = 'berker.peksag'
    assignee = 'ghaering'
    closed = True
    closed_date = <Date 2017-03-03.10:59:13.004>
    closer = 'berker.peksag'
    components = ['Extension Modules']
    creation = <Date 2010-07-19.12:08:04.906>
    creator = 'feuermurmel'
    dependencies = []
    files = ['25212', '42663']
    hgrepos = []
    issue_num = 9303
    keywords = ['patch']
    message_count = 20.0
    messages = ['110739', '110741', '158169', '158228', '158261', '161183', '180026', '233463', '233468', '248821', '264548', '271826', '288726', '288754', '288769', '288772', '288776', '288877', '290329', '291360']
    nosy_count = 13.0
    nosy_names = ['ghaering', 'pitrou', 'vstinner', 'larry', 'Arfrever', 'feuermurmel', 'petri.lehtinen', 'berker.peksag', 'Dima.Tisnek', 'Jim.Jewett', 'serhiy.storchaka', 'Robin.Schreiber', 'palaviv']
    pr_nums = ['359', '552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue9303'
    versions = ['Python 3.7']

    @feuermurmel
    Copy link
    Mannequin Author

    feuermurmel mannequin commented Jul 19, 2010

    The Python sqlite module currently uses some deprecated API 0 of SQLite. These are functions that have a counterpart with _v2 appended to their name.

    The SQLite query planner will not use certain optimizations when using the old API. For example, as documented in 1, using a statement with a GLOB or LIKE operator where the pattern is parametrized will only use an appropriate index if sqlite3_prepare_v2() is used instead of sqlite3_prepare().

    Following is an example of such a query. When executed, table 'test' is scanned row by row, which requires all data in the table to be loaded from disk.

    cursor.execute('create table test(a text)')
    cursor.execute('create index test_index on test(a)')
    # insert some data here
    cursor.execute('select * from test where a glob ?', ['prefix*'])

    When the same query is executed in the sqlite3 command line utility, the index 'test_index' is used instead.

    sqlite> create table test(a text);
    sqlite> create index test_index on test(a);
    sqlite> explain query plan select * from test where a glob 'prefix*';
    order from detail
    0 0 TABLE test WITH INDEX test_index

    The query in this example is not parametrized as parameters can't be filled in when using the sqlite3 command line utility. This is just to show that the schema and the query allow the index to be used with certain pattern strings.

    Michael

    @feuermurmel feuermurmel mannequin added performance Performance or resource usage extension-modules C modules in the Modules dir labels Jul 19, 2010
    @ghaering
    Copy link
    Mannequin

    ghaering mannequin commented Jul 19, 2010

    Yes, the sqlite module uses the old API, and is written to work with older SQLite releases and their respective bugs as well.

    Using the new API will mean requiring newer SQLite releases.

    If we do this, then this is the chance to remove all the obscure backwards compatibility code with older SQLite releases.

    @RobinSchreiber
    Copy link
    Mannequin

    RobinSchreiber mannequin commented Apr 12, 2012

    Apparently this issue has not been dealt with for quite some time now. As a prospective GSoC student, I still need to submit a patch to pass final screening and I thought, that the needed patch here would be quite suitable for a beginner. I plan to submit a patch, which simply replaces the deprecated method calls with the new ones.
    Maybe we can also remove some parts of the module code, because of the new semantics of prepare_v2(), however I would first like to hear Gerhards opinion on that :-)

    @jimjjewett
    Copy link
    Mannequin

    jimjjewett mannequin commented Apr 13, 2012

    I can't speak for GSoC or Gerhard, but it strikes me as a reasonable first step. An alternatives woube be writing it with fallbacks (so older sqlite can still be used, though less efficiently). It would also be nice to clean up at least one call with compatibility cruft. That said, don't hesitate to submit a patch that does *something*, and then replace it with more comprehensive patches later.

    @RobinSchreiber
    Copy link
    Mannequin

    RobinSchreiber mannequin commented Apr 14, 2012

    I have now submitted a patch, which swapped out all the necessary calls. Tests are all running as expected. I will now try to remove some backwards compatibility code.

    @akheron
    Copy link
    Member

    akheron commented May 20, 2012

    What Sqlite version are you targeting? Some systems use quite old Sqlite versions, see bpo-14572 for example.

    @pitrou
    Copy link
    Member

    pitrou commented Jan 15, 2013

    This would be nice to have in 3.4. When were the _v2 APIs introduced?

    See also bpo-13773 which uses sqlite3_open_v2() to pass some flags.

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Jan 5, 2015

    Is there any hope?
    Surely sqlite backwards compatibility is not an issue any longer...

    @ghaering
    Copy link
    Mannequin

    ghaering mannequin commented Jan 5, 2015

    ok, i will have to look into this

    @ghaering
    Copy link
    Mannequin

    ghaering mannequin commented Aug 19, 2015

    The externally maintained version of the sqlite3 module has now been switched to the v2 statement API. pysqlite is for Python 2.7 only. I'd suggest to revisit this for Python 3.6 and then try to port most fixes from pysqlite to the sqlite3 module.

    @berkerpeksag berkerpeksag added type-feature A feature request or enhancement and removed performance Performance or resource usage labels Mar 27, 2016
    @palaviv
    Copy link
    Mannequin

    palaviv mannequin commented Apr 30, 2016

    I made a new patch to fix this issue. I left a fallback to the old API as Jim suggested. In addition to the changes in Robin`s patch I changed sqlite3_close to sqlite3_close_v2 if available.

    This solves bpo-26187 as well.

    @dimaqq
    Copy link
    Mannequin

    dimaqq mannequin commented Aug 2, 2016

    Can someone review Aviv's patch?

    @palaviv
    Copy link
    Mannequin

    palaviv mannequin commented Feb 28, 2017

    I opened a PR. This actually is a bugfix in addition to an enhancement as it solves bpo-26187 as well.

    @serhiy-storchaka
    Copy link
    Member

    Is detection in the configure script needed? What are SQLite versions in which _v2 APIs were added? What is minimal supported SQLite version?

    Don't forget about Windows where autotools are not used.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Mar 1, 2017
    @palaviv
    Copy link
    Mannequin

    palaviv mannequin commented Mar 1, 2017

    from https://www.sqlite.org/changes.html:
    sqlite3_open_v2 - 2007-09-04 (3.5.0) alpha
    sqlite3_prepare_v2 - 2007-01-04 (3.3.9)
    sqlite3_close_v2 - 2012-09-03 (3.7.14)

    In bpo-29355 Ma Lin says that RHEL6 comes with SQLite 3.6.x. I think that removing that removing the check for sqlite3_prepare_v2 and sqlite3_open_v2 is fine but sqlite3_close_v2 need to stay.

    Don't forget about Windows where autotools are not used.

    What should I do for this to work on windows?

    @serhiy-storchaka
    Copy link
    Member

    sqlite3_open_v2 - 2007-09-04 (3.5.0) alpha
    sqlite3_prepare_v2 - 2007-01-04 (3.3.9)
    sqlite3_close_v2 - 2012-09-03 (3.7.14)

    There are compile-time checks for supporting SQLite older than 3.2.2.

    I think it is better to use the preprocessor rather of autotools for
    conformance with the rest of code.

    What should I do for this to work on windows?

    On Windows manually written PC/pyconfig.h is used rather of generated by
    configure pyconfig.h. If you change pyconfig.h, you should update PC/pyconfig.h
    too. But I think this is not needed in this case. Just use
    SQLITE_VERSION_NUMBER.

    @palaviv
    Copy link
    Mannequin

    palaviv mannequin commented Mar 1, 2017

    I changed the patch to use SQLITE_VERSION_NUMBER and it looks way better. Thanks Serhiy

    @berkerpeksag
    Copy link
    Member

    I will add commit information manually: 86a6705

    Thanks!

    @berkerpeksag
    Copy link
    Member

    New changeset 86a6705 by Berker Peksag (Aviv Palivoda) in branch 'master':
    bpo-9303: Migrate sqlite3 module to _v2 API to enhance performance (#359)
    86a6705

    @berkerpeksag
    Copy link
    Member

    New changeset 0e6cb2e by Berker Peksag (Aviv Palivoda) in branch 'master':
    bpo-26187: Test that set_trace_callback() is not called multiple times (GH-461)
    0e6cb2e

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life extension-modules C modules in the Modules dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants