This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: [pysqlite] Duplicate rows can be returned after rolling back a transaction
Type: behavior Stage: resolved
Components: Extension Modules Versions: Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Maxime Belanger, benjamin.peterson, berker.peksag, cary, erlendaasland, malin, palaviv, serhiy.storchaka, xtreak
Priority: normal Keywords: patch

Created on 2018-04-28 00:15 by cary, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10250 open cary, 2018-10-30 21:20
Messages (9)
msg315862 - (view) Author: cary (cary) * Date: 2018-04-28 00:15
# Description
Rolling back a transaction causes all statements associated with that transaction to be reset, which allows the statements to be used again from the pysqlite statement cache. This can interact with various methods on `Cursor` objects to cause these statements to be reset again, possibly when they are in use by a different cursor.

This appears to be very similar to issue10513 and issue23129.

# Impact
Duplicate rows will be returned. Exceptions can be raised.

# Repro steps
  - Cursor *A* executes query *X*
  - Rollback occurs
  - Cursor *B* executes query *X*
  - Any of the following (and there might be other cases too):
    - Cursor *A* is closed
    - Cursor *A* is deallocated
    - Cursor *A* executes any other query.
  - Result: Cursor *B* returns duplicate rows.
  - Furthermore: Executing query *X* again afterwards raises `sqlite3.InterfaceError`

# Possible solutions
  - Similar to the solution for issue10513 and issue23129, we could remove `pysqlite_do_all_statements(self, ACTION_RESET, 1)` from `pysqlite_connection_rollback`. This fixes the given issue, but I'm not sure what the implications are for the rest of the system.

  - Do not reset `self->statement` in `Cursor` if `self->reset`. This is the fix we've adopted for now (through a local patch to our Python), but it's worth noting that this is rather brittle, and only works because `pysqlite_do_all_statements` is always called with `reset_cursors = 1`, and `self->reset` is not modified in too many places.

# Example
```
import sqlite3 as sqlite

if __name__ == '__main__':
    conn = sqlite.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t(c);
        INSERT INTO t VALUES(0);
        INSERT INTO t VALUES(1);
        INSERT INTO t VALUES(2);
    """)

    curs = conn.cursor()
    curs.execute("BEGIN TRANSACTION")
    curs.execute("SELECT c FROM t WHERE ?", (1,))
    conn.rollback()

    # Reusing the same statement from the statement cache, which has been
    # reset by the rollback above.
    gen = conn.execute("SELECT c FROM t WHERE ?", (1,))

    # Any of the following will cause a spurious reset of the statement.
    curs.close()
    # curs.execute("SELECT 1")
    # del curs

    # Expected output: [(0,), (1,), (2,)]
    # Observed output: [(0,), (0,), (1,), (2,)]
    print(list(gen))

    # Raises `sqlite3.InterfaceError: Error binding parameter 0 - probably unsupported type.`
    conn.execute("SELECT c FROM t WHERE ?", (1,))
```
msg337584 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-03-10 00:25
New changeset 738c19f4c5475da186de03e966bd6648e5ced4c4 by Benjamin Peterson in branch 'master':
closes bpo-33376: Update to Unicode 12.0.0. (GH-12256)
https://github.com/python/cpython/commit/738c19f4c5475da186de03e966bd6648e5ced4c4
msg337585 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2019-03-10 00:26
Sorry, wrong bug.
msg393402 - (view) Author: Ma Lin (malin) * Date: 2021-05-10 14:40
Erlend, please take a look at this bug.
msg393435 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-05-10 21:35
I believe the former proposed solution is the correct solution. I'm digging though the pysqlite git history (both in the original repo and in CPython), the SQLite changelogs, and different test suites to prove me wrong :) I'm using bpo-44092 to track my findings.
msg407185 - (view) Author: Ma Lin (malin) * Date: 2021-11-28 04:26
Since 243b6c3b8fd3144450c477d99f01e31e7c3ebc0f (21-08-19), this bug can't be reproduced.

In `pysqlite_do_all_statements()`, 243b6c3 resets statements like this:

    sqlite3_stmt *stmt = NULL;
    while ((stmt = sqlite3_next_stmt(self->db, stmt))) {
        if (sqlite3_stmt_busy(stmt)) {
            (void)sqlite3_reset(stmt);
        }
    }
    
But the `pysqlite_Statement.in_use` flag is not reset.
In `_pysqlite_query_execute()` function, if `pysqlite_Statement.in_use` flag is 1, it creates a new `pysqlite_Statement` instance. So this line will use a new statement:

    gen = conn.execute("SELECT c FROM t WHERE ?", (1,))

The duplicate row is from `pysqlite_Cursor.next_row` before 3df0fc89bc2714f5ef03e36a926bc795dcd5e05a (21-08-25).

A digressive suggestion is whether it can be changed like this, and add a check for resetting statement. So that statements are not allowed to be reset by other Cursors, which may improve code robust:

    typedef struct
    {
        ...
-       int in_use;
+       pysqlite_Cursor *in_use; // points to the attached cursor
        ...
    } pysqlite_Statement;
msg407187 - (view) Author: Ma Lin (malin) * Date: 2021-11-28 04:56
This issue is not resolved, but was covered by a problematic behavior.
Maybe this issue will be solved in issue44092, I'll study that issue later.
msg407236 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-11-28 22:12
> Maybe this issue will be solved in issue44092

Yes, this will be resolved in bpo-44092. Hopefully I'll be able to land it sooner than later.
msg409614 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2022-01-03 20:03
GH-26026 / bpo-44092 has removed the old workaround where pending statements were reset before a rollback. Since version 3.7.11, SQLite no longer has any issues with pending statements and rollbacks, so we remove the workaround, since we require SQLite 3.7.15 or newer in CPython.

As a side effect, pysqlite_do_all_statements() has now been removed, as it is unused code.

Following GH-26026, this issue is no longer an issue; I close it as fixed.


Ma Lin: if you have problems with the LRU cache, please open a new issue.
History
Date User Action Args
2022-04-11 14:58:59adminsetgithub: 77557
2022-01-03 20:03:46erlendaaslandsetpull_requests: - pull_request12242
2022-01-03 20:03:08erlendaaslandsetstatus: open -> closed
resolution: fixed
messages: + msg409614

stage: resolved
2021-11-28 22:12:11erlendaaslandsetmessages: + msg407236
2021-11-28 04:56:54malinsetmessages: + msg407187
2021-11-28 04:48:56rhettingersetstage: resolved -> (no value)
2021-11-28 04:26:09malinsetmessages: + msg407185
2021-05-10 21:35:59erlendaaslandsetmessages: + msg393435
2021-05-10 14:40:38malinsetnosy: benjamin.peterson, berker.peksag, serhiy.storchaka, malin, Maxime Belanger, palaviv, cary, xtreak, erlendaasland
messages: + msg393402
2021-02-23 08:04:08malinsetnosy: + erlendaasland
2019-09-04 01:48:12Maxime Belangersetversions: + Python 3.9, - Python 2.7, Python 3.7
2019-03-10 00:26:41benjamin.petersonsetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg337585
2019-03-10 00:25:58benjamin.petersonsetstatus: open -> closed

nosy: + benjamin.peterson
messages: + msg337584

resolution: fixed
stage: patch review -> resolved
2019-03-09 23:51:50benjamin.petersonsetpull_requests: + pull_request12242
2019-02-15 13:00:21malinsetnosy: + malin
2019-02-13 15:22:53cheryl.sabellasetversions: - Python 3.4, Python 3.5, Python 3.6
2018-10-30 23:50:52Maxime Belangersetnosy: + Maxime Belanger
2018-10-30 23:23:12xtreaksetnosy: + xtreak
2018-10-30 21:20:44carysetkeywords: + patch
stage: patch review
pull_requests: + pull_request9564
2018-05-18 12:54:06palavivsetnosy: + palaviv
2018-04-28 06:39:50serhiy.storchakasetnosy: + berker.peksag, serhiy.storchaka
2018-04-28 00:15:11carycreate