classification
Title: Add an autocommit property to sqlite3.Connection with a PEP 249 compliant manual commit mode and migrate
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: erlendaasland, ghaering, james.oldfield, lemburg, maggyero, r.david.murray, zzzeek
Priority: normal Keywords:

Created on 2020-01-26 19:17 by maggyero, last changed 2021-01-06 13:21 by zzzeek.

Messages (12)
msg360732 - (view) Author: Géry (maggyero) * Date: 2020-01-26 19:17
In non-autocommit mode (manual commit mode), the sqlite3 database driver implicitly issues a BEGIN statement before each DML statement (INSERT, UPDATE, DELETE, REPLACE) not already in a database transaction, BUT NOT before DDL statements (CREATE, DROP) nor before DQL statements (SELECT) (cf. https://github.com/python/cpython/blob/master/Modules/_sqlite/cursor.c#L480):

```
    /* We start a transaction implicitly before a DML statement.
       SELECT is the only exception. See #9924. */
    if (self->connection->begin_statement && self->statement->is_dml) {
        if (sqlite3_get_autocommit(self->connection->db)) {
            result = _pysqlite_connection_begin(self->connection);
            if (!result) {
                goto error;
            }
            Py_DECREF(result);
        }
    }
```

Like Mike Bayer explained in issue #9924, this is not what other database drivers do, and this is not PEP 249 compliant (Python Database API Specification v2.0), as its author Marc-André Lemburg explained (cf. https://mail.python.org/pipermail/db-sig/2010-September/005645.html):

> Randall Nortman wrote:
> # PEP 249 says that transactions end on commit() or rollback(), but it
> # doesn't explicitly state when transactions should begin, and there is
> # no begin() method. 
>
> Transactions start implicitly after you connect and after you call
.commit() or .rollback(). They are not started for each statement.
>
> # I think the implication is that transactions begin
> # on the first execute(), but that's not explicitly stated.  At least
> # one driver, pysqlite2/sqlite3, does not start a transaction for a
> # SELECT statement.  It waits for a DML statement (INSERT, UPDATE,
> # DELETE) before opening a transaction.  Other drivers open transactions
> # on any statement, including SELECT.
> #
> # My question for the DB-SIG is: Can I call it a bug in pysqlite2 that
> # it does not open transactions on SELECT?  Should the spec be amended
> # to make this explicit?  Or are both behaviors acceptable, in which
> # case perhaps a begin() method needs to be added for when the user
> # wants control over opening transactions?
>
> I should probably add a note to PEP 249 about this.

Aymeric Augustin said in issue #10740:

> While you're there, it would be cool to provide "connection.autocommit = True" as an API to enable autocommit, because "connection.isolation_level = None" isn't a good API at all -- it's very obscure and has nothing to do with isolation level whatsoever.

So I suggest that we introduce a new autocommit property and use it to enable a truly PEP 249 compliant manual commit mode (that is to say with transactions starting implicitly after connect(), commit() and rollback() calls, allowing transactional DDL and DQL):

```
autocommit = True  # enable the autocommit mode
autocommit = False  # disable the autocommit mode (enable the new PEP 249 manual commit mode)
autocommit = None  # fallback to the commit mode set by isolation_level
```

I also suggest that we use this new PEP 249 manual commit mode (with transactional DDL and DQL) by default and drop the old manual commit mode (without transactional DDL and DQL). We could use the following migration strategy:

1. During the deprecation period:

- Add the new autocommit property with the value None by default, so that the old manual commit mode is still the default.
- Add a deprecation warning for the value None of the autocommit property, in favor of the other values True and False. It will prompt users who enabled the autocommit mode with isolation_level = None to use autocommit = True instead, and users who disabled the autocommit mode (that is to say users who enabled the old manual commit mode) with isolation_level = DEFERRED/IMMEDIATE/EXCLUSIVE to use autocommit = False instead AND add to their code the potentially missing commit() calls required by the new PEP 249 manual commit mode.

2. After the deprecation period:

- Set the value of the autocommit property to False by default, so that the new PEP 249 manual commit mode becomes the new default.
- Remove the value None of the autocommit property and its deprecation warning.
- Remove the value None of the isolation_level property, so that the old manual commit mode disappears.
msg360734 - (view) Author: Géry (maggyero) * Date: 2020-01-26 19:33
> - Remove the value None of the isolation_level property, so that the old manual commit mode disappears.

Correction:

- Remove the value None of the isolation_level property.
msg360735 - (view) Author: Géry (maggyero) * Date: 2020-01-26 22:07
> - Remove the value None of the autocommit property and its deprecation warning.

Correction:

- Remove the value None of the autocommit property and its deprecation warning, so that the old manual commit mode disappears.
msg384383 - (view) Author: James Oldfield (james.oldfield) Date: 2021-01-05 11:27
If this ever gets implemented, "autocommit" would be a terrible name for it. That word has a very specific meaning in SQLite, which is essentially the same as "not in a transaction started with BEGIN ...". At the moment, if you want to explicitly control when transactions start (a good idea considering how confusing the current behaviour is) then you would set isolation_mode to None and manually start a transaction with `execute("BEGIN")` - at which point you are NOT in autocommit mode, until you commit or rollback. According to this proposal, if I want manual control over transactions, I would set `conn.autocommit = True`, even though I *don't* want autocommit mode (according to SQLite's definition)!
msg384397 - (view) Author: Géry (maggyero) * Date: 2021-01-05 14:00
> If this ever gets implemented, "autocommit" would be a terrible name for it. That word has a very specific meaning in SQLite, which is essentially the same as "not in a transaction started with BEGIN ...".

Yes if you are talking about SQLite, the database ENGINE: the SQL statements inside BEGIN and COMMIT are said to be in manual commit mode, while SQL statements outside are said to be in autocommit mode. So the autocommit mode is the default mode for database ENGINES.

But here I am talking about SQLite3, the Python database DRIVER. You do not issue BEGIN statements with database DRIVERS, they are issued implicitly, so that the manual mode is the default mode for database DRIVERS.

Cf. this Stack Overflow answer for more details: https://stackoverflow.com/a/48391535/2326961

> At the moment, if you want to explicitly control when transactions start (a good idea considering how confusing the current behaviour is)

No, you do not want that at the database DRIVER level. Because like Mike Bayer explained in issue #9924, this is not what other database DRIVERS do, and this is not PEP 249 compliant (Python Database API Specification v2.0), as its author Marc-André Lemburg explained (cf. https://mail.python.org/pipermail/db-sig/2010-September/005645.html):

> Randall Nortman wrote:
> # PEP 249 says that transactions end on commit() or rollback(), but it
> # doesn't explicitly state when transactions should begin, and there is
> # no begin() method. 
>
> Transactions start implicitly after you connect and after you call .commit() or .rollback(). They are not started for each statement.
msg384414 - (view) Author: James Oldfield (james.oldfield) Date: 2021-01-05 16:40
> Yes if you are talking about SQLite, the database ENGINE

I sure was! In this comment I will stick to saying either "SQLite engine" or "sqlite3 driver" as appropriate, hopefully that will be clearer.

> But here I am talking about SQLite3, the Python database DRIVER

Yep, I was aware of that. I was trying to say, please don't use the word "autocommit" in the sqlite3 driver when that word has a related but different meaning in the SQLite engine.

> You do not issue BEGIN statements with database DRIVERS, they are issued implicitly, so that the manual mode is the default mode for database DRIVERS.

This sentence isn't literally true for several reasons (you say "you do not" but I certainly do, you use of "with database drivers" is dubious, and you seem to have causality in the wrong direction). I think there might be a bit of a language barrier here, so I hope you don't mind if I leave this to one side.

> Cf. this Stack Overflow answer for more details: https://stackoverflow.com/a/48391535/2326961

I am fully, and painfully, aware of when the sqlite3 driver code will automatically issue BEGIN statements to the engine. I have no need to read StackOverflow answers about it, I have read the C source code to sqlite3 (and pysqlite) directly. I spent more time than I care to admit recently doing that! In fact that happened as a result of reading several confusing StackOverflow answers about transactions (maybe I'll write my own and add to the confusion...)

What that answer doesn't mention is that, even with even with isolation_mode=None, it's perfectly possible to start a transaction, which takes the SQLite engine out of autocommit mode. This is fully and intentionally supported by the sqlite3 driver, and the original author has said so and even recommended. For example, let's look at this code:

    conn = sqlite3.connect(path, isolation_mode=None)
    conn.execute("INSERT INTO test (i) VALUES (?)", (1,))  # stmt 1
    foo = conn.execute("SELECT * FROM test").fetchall()    # stmt 2
    conn.execute("BEGIN")                                  # stmt 3
    conn.execute("INSERT INTO test (i) VALUES (?)", (4,))  # stmt 4
    bar = conn.execute("SELECT * FROM test").fetchall()    # stmt 5
    conn.execute("COMMIT")                                 # stmt 6

Statement 1 and statement 2 execute using the SQLite engine's autocommit mode. Statements 3 through to 5 execute in a single transaction and do *not* use the SQLite engine's autocommit mode. (Technically statement 6 actually does use autocommit because COMMIT uses the autocommit mechanism under the hood ... but let's forget about that!)

Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition). What's more, I could insert this line of code between statements 3 and 6:

    print("Autocommit mode?", conn.autocommit)

And it would print True even though autocommit mode is off!

Now, maybe your reaction is that "autocommit mode *in the driver*" can have a different meaning from "autocommit mode *in the engine*". Yes, it can, but that doesn't mean it should! Please, just pick a different name! For example, say "manual mode" (instead of autocommit=True) or "auto-start-transaction mode" (instead of autocommit=False).


> No, you do not want that at the database DRIVER level. Because like Mike Bayer explained in issue #9924, this is not what other database DRIVERS do, and this is not PEP 249 compliant 

The "that" you are referring to here was when I said that I prefer to set isolation_level = None, like the above code snippet. Do not tell me that it is not what I want; it certainly IS what I want! I do not want the sqlite3 driver getting in the way between me and the SQLite engine. Many future users of the sqlite3 driver are likely to feel the same way, and the API should allow that to happen clearly.
msg384417 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2021-01-05 17:25
I think there's a bit of a misunderstanding here. When relying on
a DB-API driver's transaction API, you are not allowed to issue
separate transaction commands to the DB backend via the .execute()
methods. You have to use conn.commit() and conn.rollback().

The DB-API wants drivers to have connections default to transactional
behavior and implicitly start transactions when opening the connection
as well as when ending one (via .commit() or .rollback()) - provided the
backend does support transactions.

It also suggests that a method may be used to set the transactional
behavior.

Now, historically, many drivers have not always used methods for
this, but instead (or in addition) provide a property
connection.autocommit to allow setting or querying the current
state. Over time, this has become more or less a standard.

Aside: This is a bit unfortunate, since users would not expect
exceptions from such properties (e.g. network errors), but this where
things have moved and it's hard to change.

I guess the SQLite driver does not start a new transaction for
SELECTs, since these are usually read-only, but don't know whether
this still holds today (e.g. think of UDFs running INSERTs or UPDATEs).

For the same reason, removing the SELECT "optimization" may cause
a backwards incompatible change, which can be tricky to identify
and cause corruption of data (in this case, data not written to
the database, where it previously was written).
msg384421 - (view) Author: Géry (maggyero) * Date: 2021-01-05 18:04
@james.oldfield

> What that answer doesn't mention is that, even with even with isolation_mode=None, it's perfectly possible to start a transaction, which takes the SQLite engine out of autocommit mode.

Exactly, so since autocommit=True is equivalent to isolation_mode=None I do not see why you the name ‘autocommit’ would be a problem. As you said, when you issue BEGIN, you leave autocommit mode.

> Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition).

What is the difference with isolation_mode=None which also means autocommit mode?

> What's more, I could insert this line of code between statements 3 and 6:
>    print("Autocommit mode?", conn.autocommit)
> And it would print True even though autocommit mode is off!

No, because the autocommit property would be automatically updated to False at conn.execute("BEGIN"), which is the standard behaviour as @lemburg explained.

@lemburg

> I guess the SQLite driver does not start a new transaction for SELECTs, since these are usually read-only

Nor for DDL statements (CREATE, DROP).

> For the same reason, removing the SELECT "optimization" may cause
> a backwards incompatible change, which can be tricky to identify
> and cause corruption of data (in this case, data not written to
> the database, where it previously was written).

Since DQL statements (SELECT) are read-only, maybe we could keep the optimization and start transactions implicitly only for DDL statements (CREATE, DROP)?
msg384425 - (view) Author: mike bayer (zzzeek) * Date: 2021-01-05 18:58
> Under your proposal, the first line would be changed to say "autocommit=True", even though not all the code below is in autocommit mode (according to the SQLite engine's definition). What's more, I could insert this line of code between statements 3 and 6:

  >  print("Autocommit mode?", conn.autocommit)

As Marc-Andre indicated, this is in fact how "autocommit" behaves on other drivers, including:

    psycopg2 - uses either connection.autocommit, or extensions.ISOLATION_LEVEL_AUTOCOMMIT
    pymysql - uses conn.autocommit(True|False)
    mysqldb - uses conn.autocommit(True|False)
    cx_Oracle - uses conn.autocommit
    pyodbc - uses conn.autocommit

With all of the above drivers, one can emit "BEGIN" and "COMMIT" using
connection.execute(), and within the scope of that BEGIN/COMMIT pair, the 
database "engine" itself is in a transaction.  The "driver" however remains in 
"autocommit" mode.   This mode specifically means the driver is not getting
involved in starting and stopping transactions. 

As Marc mentions, we're not supposed to be emitting "BEGIN" and "COMMIT" on 
the driver, but none of them get in the way of us doing so, and additionally
most databases support additional options for the "BEGIN/START TRANSACTION" phase
that aren't available in the DBAPIs so sometimes we don't have much choice at least for the "BEGIN" command.

Here's an example using psycopg2, where the timestamp now() will freeze
when we're in a transaction started using manual "BEGIN"/ "COMMIT", while .autocommit stays True, and otherwise match statement execution time if we're not:

    >>> import psycopg2
    >>> conn = psycopg2.connect(user="scott", password="tiger", host="localhost", database="test")
    >>> conn.autocommit = True
    >>> cursor = conn.cursor()
    >>> cursor.execute("SELECT 1")
    >>> cursor.execute("select now() = statement_timestamp()")
    >>> cursor.fetchall()
    [(True,)]
    >>> cursor.execute("BEGIN")
    >>> cursor.execute("select now() = statement_timestamp();")
    >>> cursor.fetchall()
    [(False,)]  # we're in a transaction
    >>> conn.autocommit   # still in driver-level autocommit
    True
    >>> cursor.execute("COMMIT")
    >>> cursor.execute("select now() = statement_timestamp();")
    >>> cursor.fetchall()
    [(True,)]

For SQLAlchemy we already support pysqlite's "isolation_level=None" to implement "autocommit" so this issue does not affect us much, but the meaning of the term "autocommit" at the driver level shouldn't be controversial at this point as there's a lot of precedent.   "connection.autocommit" does not refer to the current transactional state of the database, just the current preference set upon the driver itself.
msg384485 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2021-01-06 10:46
On 05.01.2021 19:04, Géry wrote:
> 
> @lemburg
> 
>> I guess the SQLite driver does not start a new transaction for SELECTs, since these are usually read-only
> 
> Nor for DDL statements (CREATE, DROP).

Those are definitely changing the database and AFAIK SQLite
does support DDLs in transactions (including rolling them back
if needed).

Looking at the _sqlite code, the module does indeed only start
a transaction for INSERT, UPDATE, DELETE and REPLACE, with
"starting a transaction" meaning that it inserts a "BEGIN"
(or one of the txn isolation alternatives) before the statement:

https://github.com/python/cpython/blob/3.9/Modules/_sqlite/cursor.c#L489

This is also documented:

https://docs.python.org/3/library/sqlite3.html#controlling-transactions

I wonder why the module does not implement this properly, but I also
believe it's too late to change.

I guess what could be done is to add a connection.autocommit,
defaulting to None, meaning "use the pre-3.10 behavior".

If this is set to False, the module could then implement the
correct way of handling transactions, which means:

a) start a new transaction when the connection is opened
b) start a new transaction after .commit() and .rollback()
c) don't start new transactions anywhere else
d) run an implicit .rollback() when the connection closes

The code could even check for "BEGIN", "ROLLBACK" and "COMMIT"
text in the .execute() and issues a warning when connection.autocommit
is set to True or False.

When set to True, the module would set the SQLite autocommit
flag and also issues warnings for the txn statements. .rollback()
would issue an exception and .commit() pass silently.

>> For the same reason, removing the SELECT "optimization" may cause
>> a backwards incompatible change, which can be tricky to identify
>> and cause corruption of data (in this case, data not written to
>> the database, where it previously was written).
> 
> Since DQL statements (SELECT) are read-only, maybe we could keep the optimization and start transactions implicitly only for DDL statements (CREATE, DROP)?

See https://sqlite.org/c3ref/stmt_readonly.html.

SELECT are usually read-only, but not always. Since SQLite does
support UDFs (user defined functions), it is possible that a call
to such a function does change the database.
msg384499 - (view) Author: James Oldfield (james.oldfield) Date: 2021-01-06 13:05
There's some confusion here over what autocommit=True would do. I believe the last three comments give three different interpretations! Géry said conn.autocommit would change to False when I start a transaction with execute("BEGIN"), Mike said it wouldn't (because it represents the driver's state, not the engine's, by analogy with other DB API drivers), and Marc-Andre says execute("BEGIN") wouldn't be allowed in the first place (or at least it would issue a warning).

To reiterate, the ability to control transactions manually is already supported in the sqlite3 driver, in the form of isolation_mode=None. My first request is simply that **this ability continues to exist**. This functionality was implemented deliberately - the original author of pysqlite recommended this usage, and care has been taken over the years not to break it. Please do not point out that this is not DB API compliant; I know that, and I just don't care! So long as DB API compliant usage is _also_ supported, even the default, that doesn't prevent this other mode from existing too. Many others are using the mode, even if they are not commenters here, so I don't believe it is feasible to break or remove this functionality, even if you're not a fan.

My second request was: feel free to rename this option from "isolation_mode=None" to something else if you wish, but please don't call it "autocommit=True" because that's just too confusing. I feel like the confusion in the comments above justifies this point of view.

As I see it, that leaves two options:

Option 1: Suck it up and use autocommit=True as the option name. It's confusing, but there's so much precedent that it has to be so. This is Mike Bayer's suggestion (except he didn't say it was confusing, that's just my commentary). I think that this option is only feasible if conn.autocommit only refer's the driver's state, not the underlying engine's state, confusing though that is i.e. once set to true it would *always* be true, even if a transaction is started.

Option 2: Reserve autocommit=True for the underlying SQLite engine autocommit mode. That means detecting when there's an attempt to use execute("BEGIN") or similar, and then issuing a warning or error. It also means supplying some other, third, option for what I'm asking (like today's isolation_mode=None).

Although option 2 is closer to what I originally requested, I do worry it means that the non-DBAPI mode will appear unsupported and fall into neglect. If the API for accessing it is to set autocommit=None, to mean legacy behaviour, and then also isolation_mode=None to mean the type of legacy behaviour, then it doesn't look like the most recommended API ever. And yet, for those that don't care about DB API (which I imagine is most users of the sqlite3 driver), this is probably the best API to use.

So I reluctantly agree that option 1, using autocommit=True, is actually best overall. I would ask that there is at least a note in the documentation so that it's clear this is allowed to work. Something like this:

    If autocommit=True then the sqlite3 module will never automatically start transactions. The underlying SQLite database engine operates in autocommit mode whenever no transactions are active, so the net effect of this is to use SQLite's autocommit mode [1].

    Note that, when autocommit=True, the sqlite3 module will not intercept and stop a statement to explicitly start a transaction, such as with execute("BEGIN"). In that case, a transaction is started and the underlying SQLite engine is no longer in autocommit mode. (The sqlite3 Connection object will still report autocommit=True; this does not indicate that the SQLite engine is autocommit mode, just that the sqlite3 module is not going to implicitly start any transactions.)

    The connection commit() and rollback() methods may be used for transactions started explictly when autocommit=True, and the connection may be used as a context manager, just as it can be when autocommit=False. If no transaction is currently active then those methods silent pass with no effect.

[1] https://sqlite.org/lang_transaction.html#implicit_versus_explicit_transactions

Side note: When I started down this rabbit hole several weeks ago, I repeatedly came across the extremely confusing phrase "SQLite operates in autocommit mode by default". It took me a while to realise that autocommit is not a flag that it is possible to turn off on a connection *when you open it*. The text I used above, "The underlying SQLite database engine operates in autocommit mode whenever no transactions are active" was carefully chosen and I consider it to be much clearer, regardless of whatever else ends up happening.
msg384501 - (view) Author: mike bayer (zzzeek) * Date: 2021-01-06 13:21
I think this issue just discusses the naming of an attribute called ".autocommit".  for the discussion for SQLite's unusual starting of transactions, that's all in two other issues:

https://bugs.python.org/issue9924 

https://bugs.python.org/issue10740

so I would encourage folks to read those discussions.  at issue is the limitation of SQLite that it locks the whole file for transactions, which is the main rationale for why SQLite is hesitant to begin a transaction.  however, without configurability, this means it's not compatible with SAVEPOINT or serializable isolation levels.   when users want to use those two features we have them set isolation_level=None and emit "BEGIN" on the connection directly.    the connection.commit() and connection.rollback() methods continue to be functional
History
Date User Action Args
2021-01-06 13:21:56zzzeeksetmessages: + msg384501
2021-01-06 13:05:28james.oldfieldsetmessages: + msg384499
2021-01-06 10:46:25lemburgsetmessages: + msg384485
2021-01-05 18:58:08zzzeeksetmessages: + msg384425
2021-01-05 18:04:58maggyerosetmessages: + msg384421
2021-01-05 17:25:16lemburgsetmessages: + msg384417
2021-01-05 16:40:59james.oldfieldsetmessages: + msg384414
2021-01-05 14:00:17maggyerosetmessages: + msg384397
2021-01-05 11:27:54james.oldfieldsetnosy: + james.oldfield
messages: + msg384383
2020-06-03 21:05:34erlendaaslandsetnosy: + erlendaasland
2020-01-26 22:07:24maggyerosetmessages: + msg360735
2020-01-26 19:41:20SilentGhostsetversions: + Python 3.9, - Python 3.8
2020-01-26 19:33:18maggyerosetmessages: + msg360734
2020-01-26 19:24:54maggyerosettitle: Add an autocommit property to sqlite3.Connection with truly PEP 249 compliant manual commit mode and migrate -> Add an autocommit property to sqlite3.Connection with a PEP 249 compliant manual commit mode and migrate
2020-01-26 19:17:09maggyerocreate