classification
Title: The sqlite3 context manager does not work with isolation_level=None
Type: behavior Stage: needs patch
Components: Extension Modules, Library (Lib) Versions: Python 3.2, Python 3.3, Python 3.4, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: ghaering Nosy List: aymeric.augustin, ghaering, loewis, nagylzs, r.david.murray
Priority: normal Keywords: easy

Created on 2013-01-14 00:46 by r.david.murray, last changed 2015-08-19 09:34 by ghaering.

Messages (5)
msg179909 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-01-14 00:46
Its operation is also not particularly intuitive if isolation_level is not None, so its documentation needs some clarification.

Currently the transaction manager does nothing on enter, and does a commit or rollback on exit, depending on whether or not there was an exception inside the with block.  With isolation_level set to None, the sqlite3 library is in autocommit mode, so changes will get committed immediately inside the with, which is simply broken.

If isolation_level is not None, then the behavior of the transaction manager depends heavily on what happens inside the with block.  If the with block contains only the defined DQL statements (insert, update, delete, replace) and select statements, then things will work as expected.  However, if another statement (such as a CREATE TABLE or a PRAGMA) is included in the with block, an intermediate commit will be done and a new transaction started.

I propose to do two things to fix this issue: explain the above in the transactions manager docs, and have the context manager check to see if we are in isolation_level None, and if so, issue a begin (and then document that as well).

One question is, can the fix be backported?  It will change the behavior of code that doesn't throw an error, but most such code won't be doing what the author expected (run the with block inside a transaction...in pure autocommit mode the transaction manager is a no-op).  One place code could break is if someone figured out this issue and worked around it by explicitly starting a transaction before (or after) entering the with block.  In this case they would now get an error that a transaction cannot be started inside another.  I would think this is unlikely...the more obvious workaround would be to write a custom transaction manager, so I suspect that that is what is actually in the field.  But that's a (hopeful :) guess.

A fix for this problem would be to use 'savepoint' instead of 'begin' if the sqlite3 version supports it (it is apparently supported as of 3.6.8).

So, I'd like to see the fix, conditionally using SAVEPOINT, (once it written and tested) applied to all active python versions, but am open to the argument that it shouldn't be.
msg179912 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2013-01-14 01:04
"changes will get committed immediately inside the with, which is simply broken"

What do you mean by that?
A. Changes ought to be committed immediately, but are not; it is broken, and changes must be committed immediately.
- or -
B. What actually happens is that changes are committed immediately, and sqlite is incorrect in doing so.

Your discussion suggests B; in this case, I disagree that there is a bug. In auto-commit mode, it should really auto-commit, regardless of context managers. The context manager documentation doesn't claim otherwise.
msg179913 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-01-14 01:15
B, yes.

So you would view the connection context manager acting as an actual transaction manager as a new feature?  Would you be OK with adding that feature to the existing context manager in 3.4 (since currently the context manager is a noop in autocommit mode), or do you think we need to create a new context manager for this?  Or do we do as the issue that sparked this (issue 8145) suggested, and just document how to create your own?
msg219866 - (view) Author: Aymeric Augustin (aymeric.augustin) * Date: 2014-06-06 09:45
* Thesis *

I belive that using the connection as a context manager is an inadequate API for controlling transactions because it's very likely to result in subtly broken code.

As a consequence, my recommendation would be to deprecate this API.


* Argumentation *

If you nest a naive transaction context manager (BEGIN / COMMIT / ROLLBACK), you'll get very lousy transaction semantics. Look at this example:

with connection:   # outer transaction

    with connection:   # inner transaction
        do_X_in_db()

    do_Y_in_db()

    # once in a while, you get an exception there...

With this code, when you get an exception, X will be presevred in the database, but not Y. Most likely this breaks the expectations of the "outer transaction". Now, imagine the inner transaction in deep inside a third-party library, and you understand that this API is a Gun Pointed At Feet.

Of course, you could say "don't nest", but:

- this clashes with the expected semantics of Python context managers,
- it's unreasonable to expect Python users to audit all their lower level libraries for this issue!


Now, let's look at how popular database-oriented libraires handle this.

SQLAlchemy provides an explicit begin() method: http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Connection.begin

It also provides variants for nested transactions and two-phase commits.

Django provide an all-purpose atomic() context manager: 
https://docs.djangoproject.com/en/stable/topics/db/transactions/#django.db.transaction.atomic

That function takes a keyword argument, `savepoint`, to control whether a savepoint is emitted for nested transactions.


So it's possible to implement a safe, nestable context manager with savepoints. However:

- you need to provide more control, and as a consequence you cannot simply use the connection as a context manager anymore;
- it takes a lot of rather complex code. See Django's implementation for an example:
https://github.com/django/django/blob/stable/1.6.x/django/db/transaction.py#L199-L372

If you ignore the cross-database compatibility stuff, you're probably still looking at over a hundred lines of very stateful code...


That's why I believe it's better to leave this up to user code, and to stop providing an API that looks good for trivial use cases but that's likely to introduce subtle transactional integrity bugs.
msg248822 - (view) Author: Gerhard Häring (ghaering) * (Python committer) Date: 2015-08-19 09:34
I'm +1 on deprecating the connection manager
History
Date User Action Args
2015-08-19 09:34:17ghaeringsetmessages: + msg248822
2015-01-11 02:00:29ghaeringsetassignee: ghaering
2014-06-06 09:45:27aymeric.augustinsetnosy: + aymeric.augustin
messages: + msg219866
2013-01-14 01:15:56r.david.murraysetmessages: + msg179913
2013-01-14 01:04:20loewissetnosy: + loewis
messages: + msg179912
2013-01-14 00:47:59r.david.murraysetnosy: + nagylzs
2013-01-14 00:46:44r.david.murraycreate