classification
Title: sqlite3 connect parameter "check_same_thread" not documented
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Dave Sawyer, berker.peksag, docs@python, orsenthil, python-dev, takluyver
Priority: normal Keywords: patch

Created on 2016-05-24 21:36 by Dave Sawyer, last changed 2016-06-03 19:24 by takluyver. This issue is now closed.

Files
File name Uploaded Description Edit
doc_parm.patch Dave Sawyer, 2016-05-24 21:36 review
issue27113-rev1.patch orsenthil, 2016-05-30 02:22 review
issue27113-rev2.patch orsenthil, 2016-05-30 03:51
issue27113-rev2.patch orsenthil, 2016-05-31 03:06
issue27113-rev3.patch Dave Sawyer, 2016-06-02 21:09 review
Messages (17)
msg266281 - (view) Author: Dave Sawyer (Dave Sawyer) * Date: 2016-05-24 21:36
The sqlite3.connect method has 6 parameters. 5 of them are documented.  See below and search for "check_same_thread".  Patch adds documentation for this parameter.

sqlite3.connect(database[, timeout, detect_types, isolation_level, check_same_thread, factory, cached_statements])
Opens a connection to the SQLite database file database. You can use ":memory:" to open a database connection to a database that resides in RAM instead of on disk.

When a database is accessed by multiple connections, and one of the processes modifies the database, the SQLite database is locked until that transaction is committed. The timeout parameter specifies how long the connection should wait for the lock to go away until raising an exception. The default for the timeout parameter is 5.0 (five seconds).

For the isolation_level parameter, please see the Connection.isolation_level property of Connection objects.

SQLite natively supports only the types TEXT, INTEGER, REAL, BLOB and NULL. If you want to use other types you must add support for them yourself. The detect_types parameter and the using custom converters registered with the module-level register_converter() function allow you to easily do that.

detect_types defaults to 0 (i. e. off, no type detection), you can set it to any combination of PARSE_DECLTYPES and PARSE_COLNAMES to turn type detection on.

By default, the sqlite3 module uses its Connection class for the connect call. You can, however, subclass the Connection class and make connect() use your class instead by providing your class for the factory parameter.

Consult the section SQLite and Python types of this manual for details.

The sqlite3 module internally uses a statement cache to avoid SQL parsing overhead. If you want to explicitly set the number of statements that are cached for the connection, you can set the cached_statements parameter. The currently implemented default is to cache 100 statements.
msg266352 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-05-25 13:39
This is a duplicate of issue 16509, but I'm going to keep this open since there is a patch attached here.
msg266664 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-05-30 02:22
Thank you for the patch, Dave.

Following the language style of rest of the docs, check_same_thread, should be explained in a passive voice.

Also, mentioning the version requirement at this place, does not seem suitable to me. When we provide this feature, it assumed that the underlying sqlite linked has the support for it.

Also, (3.3.1 alpha) of sqlite was released in 2006-01-16 
Database connections can now be used by multiple threads, not just the thread in which they were created.

Here is a modified patch which captures the main idea from your patch. Please review this and share your suggestions.
msg266672 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-05-30 03:51
We should be able to simplify this further based on the comments in the review for the first patch.

Please find my patch issue27113-rev2.patch attached.
msg266679 - (view) Author: Dave Sawyer (Dave Sawyer) * Date: 2016-05-30 06:10
The revised patch says "connections" plural for true and "connection" singular for false. How about "the connection" since the method returns a connection. I'm wondering though about the lack of explanation or WHY for this parameter. As written it seems like the default value was poorly chosen and everyone should just set it to true. Likewise, if everyone should set it to true, why is it even provided as an option?
msg266731 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-05-31 03:06
Thanks for the review. Made it both singular in the updated patch.

* As for why the default value was chosen to be to True, I think, the reason is pysqlite, from sqlite3 module was derived, did not support this option when it was introduced. I still do not find that in pysqlite external library.

That being said, if we want to change the default to looking into multiple threads, we could attempt that change and introduce it. This will be a separate feature request ticket.
msg266952 - (view) Author: Dave Sawyer (Dave Sawyer) * Date: 2016-06-02 21:09
The user probably has a recent enough version. This is guaranteed on Windows since Python bundles 3.6 or later. On mac or Linux it will use the version installed on the machine. I'll make a separate patch to check the version in sqlite3.py so it will give an error if attempted on too early a version. That's better than asking the user to verify ahead of time. The serialized mode is default on both Mac and Windows so we can probably skip validating that. I did like mentioning the user needs to serialize the writes. They could use one thread for writing only or use locking. So, I just said to serialize.
msg266954 - (view) Author: Dave Sawyer (Dave Sawyer) * Date: 2016-06-02 21:12
Changed doc to note that not only must it be used on 1 thread if true, but that thread must be the thread that created it.
msg266955 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-06-02 21:14
Dave, your patch looks good to me.

Did you see the comment from Trevor on your first patch?

>> According to the sqlite docs (http://www.sqlite.org/threadsafe.html), the
default mode is 'serialized', which does the necessary locking for threads to
share a connection. Unless Python is overriding this to use one of the less
protected modes, it shouldn't be necessary to use extra locks around it.

This had prompted me to remove extra line on asking users to take care of synchronization.
msg266960 - (view) Author: Thomas Kluyver (takluyver) * Date: 2016-06-02 21:39
Having watched the video of Dave's PyCon talk (thanks Dave), I think he's talking about using locking to control transactions, which presumably the sqlite bindings don't handle by themselves. But I don't *think* you need manual locking just to maintain database integrity. If that's the case, it's probably helpful to clarify it.

I'd probably suggest that if you want to write from multiple threads and control transactions (i.e. more than just individual INSERTs fired from different threads), it's probably easier to use separate connections and let sqlite handle the synchronisation than to synchronise them yourself.
msg266961 - (view) Author: Thomas Kluyver (takluyver) * Date: 2016-06-02 21:40
That comment contained far too much 'probably'. Time for me to sleep...
msg267010 - (view) Author: Dave Sawyer (Dave Sawyer) * Date: 2016-06-03 00:43
Hi Thomas and Senthil, for the serialized setting I mentioned earlier "The serialized mode is default on both Mac and Windows so we can probably skip validating that. I did like mentioning the user needs to serialize the writes. They could use one thread for writing only or use locking. So, I just said to serialize."

to go into more detail, when serialize is set you don't need to worry that using multiple threads will WRECK your database. But, you still need to make sure your own writes get serialized or you will mix transactions together. Each connection only has one transaction in progress so if you share that connection between threads... watch out for your writes. Now if only one thread in your application writes and the other threads read you have no serialization issues to worry about. If two threads make updates without a threading.Lock than you can get 1.5 changes committed (and perhaps 0.5 later or if that had an error you might rollback that 0.5 of a transaction). You could also get the first thread to commit both changes and when the later thread calls commit get an error that is "no transaction in progress."

I thought "serialize" neatly encapsulated the two most common strategies here for shared connections - either a dedicated writer thread or using a python lock around your transactions. As Thomas points out you can also opt to give each thread its own connection... but then you don't need the check_same_thread parameter set to False as you are not sharing the connection.
msg267065 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-06-03 07:00
Thanks for the explanation, Dave. Completely agree with those points. Let's go ahead with the latest patch.
msg267066 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-06-03 07:04
New changeset af1752e51a89 by Senthil Kumaran in branch '3.5':
issue27113 - Document "check_same_thread" parameter in sqlite3.connect api.
https://hg.python.org/cpython/rev/af1752e51a89

New changeset 17c8da643065 by Senthil Kumaran in branch 'default':
[merge from 3.5] - issue27113 - Document "check_same_thread" parameter in sqlite3.connect api.
https://hg.python.org/cpython/rev/17c8da643065
msg267067 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2016-06-03 07:05
Thanks for your contribution. It's now documented.
msg267143 - (view) Author: Dave Sawyer (Dave Sawyer) * Date: 2016-06-03 19:17
hurray! My first commit
msg267146 - (view) Author: Thomas Kluyver (takluyver) * Date: 2016-06-03 19:24
Congrats! :-)
History
Date User Action Args
2016-06-03 19:24:50takluyversetmessages: + msg267146
2016-06-03 19:17:55Dave Sawyersetmessages: + msg267143
2016-06-03 07:05:05orsenthilsetstatus: open -> closed
resolution: fixed
messages: + msg267067

stage: patch review -> resolved
2016-06-03 07:04:36python-devsetnosy: + python-dev
messages: + msg267066
2016-06-03 07:00:02orsenthilsetmessages: + msg267065
2016-06-03 00:43:05Dave Sawyersetmessages: + msg267010
2016-06-02 21:40:13takluyversetmessages: + msg266961
2016-06-02 21:39:21takluyversetmessages: + msg266960
2016-06-02 21:14:22orsenthilsetmessages: + msg266955
2016-06-02 21:12:53Dave Sawyersetmessages: + msg266954
2016-06-02 21:09:49Dave Sawyersetfiles: + issue27113-rev3.patch

messages: + msg266952
2016-05-31 03:06:14orsenthilsetfiles: + issue27113-rev2.patch

messages: + msg266731
2016-05-30 06:10:27Dave Sawyersetmessages: + msg266679
2016-05-30 03:51:10orsenthilsetfiles: + issue27113-rev2.patch

messages: + msg266672
2016-05-30 02:22:02orsenthilsetfiles: + issue27113-rev1.patch
nosy: + orsenthil
messages: + msg266664

2016-05-25 14:10:18takluyversetnosy: + takluyver
2016-05-25 13:41:51berker.peksagsetstage: needs patch -> patch review
2016-05-25 13:41:01berker.peksaglinkissue16509 superseder
2016-05-25 13:39:14berker.peksagsetversions: + Python 3.5
nosy: + berker.peksag

messages: + msg266352

stage: needs patch
2016-05-24 21:38:24Dave Sawyersettitle: sqlite3 open parameter "check_same_thread" not documented -> sqlite3 connect parameter "check_same_thread" not documented
2016-05-24 21:36:29Dave Sawyercreate