Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference-counting problem in sqlite #81528

Closed
gescheit mannequin opened this issue Jun 20, 2019 · 9 comments
Closed

Reference-counting problem in sqlite #81528

gescheit mannequin opened this issue Jun 20, 2019 · 9 comments
Labels
3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@gescheit
Copy link
Mannequin

gescheit mannequin commented Jun 20, 2019

BPO 37347
Nosy @benjaminp, @berkerpeksag, @gescheit, @miss-islington
PRs
  • bpo-37347: fix refcount problem in sqlite3 #14268
  • [3.8] closes bpo-37347: Fix refcount problem in sqlite3. (GH-14268) #14725
  • bpo-37347: Require SQLite 3.7.3+ #17413
  • Files
  • bug.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-07-13.03:16:12.363>
    created_at = <Date 2019-06-20.11:14:59.579>
    labels = ['extension-modules', '3.7', '3.8', '3.9', 'type-crash']
    title = 'Reference-counting problem in sqlite'
    updated_at = <Date 2019-11-28.15:42:19.356>
    user = 'https://github.com/gescheit'

    bugs.python.org fields:

    activity = <Date 2019-11-28.15:42:19.356>
    actor = 'sir-sigurd'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-07-13.03:16:12.363>
    closer = 'benjamin.peterson'
    components = ['Extension Modules']
    creation = <Date 2019-06-20.11:14:59.579>
    creator = 'gescheit'
    dependencies = []
    files = ['48430']
    hgrepos = []
    issue_num = 37347
    keywords = ['patch']
    message_count = 9.0
    messages = ['346114', '346180', '346196', '346285', '346374', '346433', '346555', '347769', '347770']
    nosy_count = 5.0
    nosy_names = ['ghaering', 'benjamin.peterson', 'berker.peksag', 'gescheit', 'miss-islington']
    pr_nums = ['14268', '14725', '17413']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue37347'
    versions = ['Python 2.7', 'Python 3.5', 'Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9']

    @gescheit
    Copy link
    Mannequin Author

    gescheit mannequin commented Jun 20, 2019

    There are a couple of bugs in sqlite bindings have been found related to reference-counting.
    Short info:
    sqlite connection class uses a dict to keep references to callbacks. This mechanism is not suitable for objects which equals but not the same.
    con = sqlite3.connect()
    con.set_trace_callback(logger.debug)
    con.set_trace_callback(logger.debug) # logger.debug == logger.debug is True, but logger.debug is logger.debug is False because logger.debug bound method is creating in every call
    leads to segmentation fault during calling of trace_callback.
    My patch fixes this behavior by using a dedicated variable for keeping references to each callback and using dict indexed by function name in case of named callbacks(e.g. create_function()).
    Also, due to keeping objects in a variable or in a dict value, it is possible to use unhashable objects as callbacks. e.g. bpo-7478

    Long version:
    Sqlite under the hood use dict(called function_pinboard) to keep references to callbacks like progress_handler.
    It needs to decref callbacks objects after closing sqlite connection.
    This mechanism works tolerably(see bug with leaks) with functions but if you try to use bounded methods it causes a segmentation fault.
    Let see how it works.

    static PyObject *
    Custom_set_callback(CustomObject *self, PyObject* args)
    {
    	PyObject* display_str;
    	display_str = PyUnicode_FromFormat("set_callback called with cb=%R id=%i ob_refcnt=%i\n", args, args, args->ob_refcnt);
    	PyObject_Print(display_str, stdout, Py_PRINT_RAW);
    	if (PyDict_SetItem(self->function_pinboard, args, Py_None) == -1) return NULL;
    	//sqlite3_trace(self->db, _trace_callback, trace_callback);
    	self->callback_handler = args;
    	display_str = PyUnicode_FromFormat("set_callback done for cb=%R id=%i ob_refcnt=%i\n", args, args, args->ob_refcnt);
    	PyObject_Print(display_str, stdout, Py_PRINT_RAW);
    	Py_RETURN_NONE;
    }
    static PyObject *
    Custom_call_callback(CustomObject *self)
    {
    	PyObject* display_str;
    	display_str = PyUnicode_FromFormat("call with id=%i ob_refcnt=%i\n", self->callback_handler ,
        self->callback_handler->ob_refcnt);
    	PyObject_Print(display_str, stdout, Py_PRINT_RAW);
    	Py_RETURN_NONE;
    }
    Python code:
    >>>> class TEST:
            def log(self, msg=""):
                pass
    >>>> t = TEST()
    >>>> conn = Custom()
    >>>> conn.set_trace_callback(t.log)
    set_callback called with cb=<bound method TEST.log of <__main__.TEST object at 0x10bc60128>> id=196094408 ob_refcnt=1
    set_callback done for cb=<bound method TEST.log of <__main__.TEST object at 0x10bc60128>> id=196094408 ob_refcnt=2
    >>>> conn.set_trace_callback(t.log)
    set_callback called with cb=<bound method TEST.log of <__main__.TEST object at 0x10bc60128>> id=196095112 ob_refcnt=1
    set_callback done for cb=<bound method TEST.log of <__main__.TEST object at 0x10bc60128>> id=196095112 ob_refcnt=1
    conn.call()
    call with id=196095112 ob_refcnt=0

    After second conn.set_trace_callback(t.log) call, object t.log reference-count is not increased because 't.log in self->function_pinboard' returns True thus self->function_pinboard[t.log] is not replaced and t.log is not increfed, but it replaces old object in self->callback_handler.
    In the end, self->callback_handler keeps a pointer to t.log with ob_refcnt = 0.

    Also, there is no cleaning of self->function_pinboard. This leads to leaks every object passed as callback(see test_leak() in bug.py).

    @gescheit gescheit mannequin added 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump labels Jun 20, 2019
    @benjaminp
    Copy link
    Contributor

    Why not use sqlite3_create_function_v2 to register a destructor for the function that does Py_DECREF and dispense with the dictionaries entirely?

    @gescheit
    Copy link
    Mannequin Author

    gescheit mannequin commented Jun 21, 2019

    Because destructor can be registered only for particular functions. For example sqlite3_progress_handler() can't register destructor.

    @benjaminp
    Copy link
    Contributor

    But there can be only one progress handler per connection, so you can just keep an single reference in the struct yourself.

    @gescheit
    Copy link
    Mannequin Author

    gescheit mannequin commented Jun 24, 2019

    At my point of view, dererencing from sqlite destructor has next flaws:

    • relying on external library which can has bugs
    • using different routine for reference-counting(complicates code)
    • loosing support for old sqlite version

    Of cause, if it's necessary to use sqlite3_create_function_v2, I'm ready to make appropriate changes.

    @benjaminp
    Copy link
    Contributor

    The _sqlite3 module clearly is not free of bugs in this area. Binding the Python function to the exact lifetime of the sqlite function seems less likely to be buggy than replicating sqlite's function reference management independently. sqlite3_create_function_v2 was added in 2010; I think we can depend on it. Also, won't the callback pretty much be Py_DECREF(something)?

    @gescheit
    Copy link
    Mannequin Author

    gescheit mannequin commented Jun 25, 2019

    I've pushed changes bce7fdc.
    Could I ask for clarification on callback and Py_DECREF(something)?

    @benjaminp
    Copy link
    Contributor

    New changeset b9a0376 by Benjamin Peterson (gescheit) in branch 'master':
    closes bpo-37347: Fix refcount problem in sqlite3. (GH-14268)
    b9a0376

    @miss-islington
    Copy link
    Contributor

    New changeset 36101c2 by Miss Islington (bot) in branch '3.8':
    closes bpo-37347: Fix refcount problem in sqlite3. (GH-14268)
    36101c2

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes extension-modules C modules in the Modules dir type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants