classification
Title: [sqlite3] cleanup and harden Connection and Cursor __init__
Type: Stage: patch review
Components: Extension Modules Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: DiddiLeija, corona10, erlendaasland, miss-islington, petr.viktorin
Priority: normal Keywords: patch

Created on 2021-09-07 11:46 by erlendaasland, last changed 2021-11-23 13:11 by petr.viktorin.

Pull Requests
URL Status Linked Edit
PR 28227 merged erlendaasland, 2021-09-07 21:21
PR 28231 merged erlendaasland, 2021-09-08 10:20
PR 28234 closed erlendaasland, 2021-09-08 10:38
PR 28298 merged erlendaasland, 2021-09-12 15:54
PR 28302 merged miss-islington, 2021-09-13 03:16
PR 29160 closed erlendaasland, 2021-10-22 13:07
Messages (18)
msg401248 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-07 11:46
Quoting Petr Viktorin in PR 27940, https://github.com/python/cpython/pull/27940#discussion_r703424148:

- db is not set to NULL if init fails.
- This segfaults for me:

  import sqlite3

  conn = sqlite3.connect(':memory:')
  conn.execute('CREATE TABLE foo (bar)')

  try:
      conn.__init__('/bad-file/')
  except sqlite3.OperationalError:
      pass

  conn.execute('INSERT INTO foo (bar) VALUES (1), (2), (3), (4)')


Other issues:
- reinitialisation is not handled gracefully
- __init__ is messy; members are initialised here and there


Suggested to reorder connection __init__ in logical groups to more easily handle errors:
  1. handle reinit
  2. open and configure database
  3. create statement LRU cache and weak ref cursor list
  4. initialise members in the order they're defined in connection.h
  5. set isolation level, since it's a weird case anyway
msg401352 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-09-08 07:45
This is a minefield. If anyone has a use case for it, I'd *love* to hear it, but to me it seems that proper reinit support will be a lot of work (now and in future maintenance) for no gain. You can always create a new connection object.

Consider instead deprecating reinitialization, documenting it as unpredictable & removing it in Python 3.13.


Here's another thing to test:
```python
import sqlite3

conn = sqlite3.connect(':memory:')
cursor = conn.execute('CREATE TABLE foo (bar)')

try:
    conn.__init__('/bad-file/')
except sqlite3.OperationalError:
    pass

cursor.execute('INSERT INTO foo (bar) VALUES (1), (2), (3), (4)')
```

And here's the kind of behavior that should be specified (and tested), and might even change to make more sense:
```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.text_factory=bytes
conn.row_factory = sqlite3.Row
cursor = conn.execute('CREATE TABLE foo (bar)')
cursor.execute('INSERT INTO foo (bar) VALUES ("1"), ("2"), ("3"), ("4")')
cursor.execute('SELECT bar FROM foo')

for row in cursor:
    print(row, list(row))
    break

conn.__init__(':memory:')
conn.execute('CREATE TABLE foo (bar)')
conn.execute('INSERT INTO foo (bar) VALUES ("a"), ("b"), ("c"), ("d")')

# Currently this uses the old database, old row_factory, but new text_factory"
for row in cursor:
    print(row, list(row))
```

We should also do a review of all places `connection` is used from Cursor, Statement, etc. to ensure everything continue to work with a different `db`. This can be done (and maybe you already did), but I'm also worried how well future maintainers can keep reinitialization in mind.

And all this will also need to be done for Cursor.
msg401356 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 08:29
I modified your second example slightly:

```
import sqlite3

conn = sqlite3.connect(":memory:")
conn.text_factory=bytes
conn.row_factory = sqlite3.Row
cursor = conn.execute("CREATE TABLE foo (bar)")
numbers = range(4)
cursor.executemany("INSERT INTO foo (bar) VALUES (?)", ((str(v),) for v in numbers))
cursor.execute("SELECT bar FROM foo")

print("first fetch")
for row in cursor.fetchmany(2):
    print(type(row[0]))

conn.__init__(":memory:")
conn.execute("CREATE TABLE foo (bar)")
letters = "a", "b", "c", "d"
conn.executemany("INSERT INTO foo (bar) VALUES (?)", ((v,) for v in letters))

# Currently this uses the old database, old row_factory, but new text_factory"
print("second fetch")
for row in cursor.fetchall():
    print(type(row[0]))
```

Here's the output:
first fetch
<class 'bytes'>
<class 'bytes'>
second fetch
<class 'str'>
<class 'str'>
msg401357 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 08:33
Note: I ran that with PR 28227.


OTOH: I do agree that there's a lot of pitfalls here, especially in the future. In the long run, it is probably best to deprecate reinit, and disable it in Python 3.13.
msg401358 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 08:36
Modifying the loops to also print the values:

first fetch
b'0' <class 'bytes'>
b'1' <class 'bytes'>
second fetch
2 <class 'str'>
3 <class 'str'>
msg401359 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 09:06
FYI, I've expanded the reinit tests and added a deprecation warning to the PR.
msg401361 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-09-08 09:26
I think a deprecation should be discussed a bit more widely than on bpo, so I opened a thread here: https://discuss.python.org/t/deprecating-sqlite-object-reinitialization/10503
msg401362 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 09:26
Great, thanks.
msg401363 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-09-08 09:31
As for the effort to fix this:

If we deprecate this, there should be no new users of it in 3.11+.

If we deprecate and also fix this, and we happen to introduce bugs or behavior changes, then people that use it now will need to:
1) adapt to the new behavior
2) find a different way to do things (before 3.13)

If we deprecate but keep the buggy behavior it as it is, (1) is not needed. Less work for both us and the users.
msg401366 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 10:00
> If we deprecate but keep the buggy behavior it as it is, (1) is not needed. Less work for both us and the users.

Indeed.

There's still a ref leak I'd like to take care of, though: if the first audit fails, database_obj leaks.
msg401371 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-08 10:42
I'll save the cleanup till Python 3.13 dev is started. I've opened a PR for fixing the ref leak (should be backported), and a draft PR for deprecating Connection and Cursor reinitialisation.
msg401561 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-09-10 08:08
Updated title to include sqlite3.Cursor as well, since cursors and connections are very much entwined.
msg401666 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-09-12 12:28
New changeset c78d5ca3806d02e26f9f3fa92ff567f0805eac4c by Erlend Egeberg Aasland in branch 'main':
bpo-45126: Fix ref. leak in `sqlite3.Connection.__init__` (GH-28231)
https://github.com/python/cpython/commit/c78d5ca3806d02e26f9f3fa92ff567f0805eac4c
msg401682 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2021-09-13 03:16
New changeset aa6dd54d43dffbdf883c083e361f6ccf8642d66e by Erlend Egeberg Aasland in branch '3.10':
[3.10] bpo-45126: Fix ref. leak in `sqlite3.Connection.__init__` (GH-28231). (GH-28298)
https://github.com/python/cpython/commit/aa6dd54d43dffbdf883c083e361f6ccf8642d66e
msg401684 - (view) Author: miss-islington (miss-islington) Date: 2021-09-13 04:07
New changeset 5d28bb699a305135a220a97ac52e90d9344a3004 by Miss Islington (bot) in branch '3.9':
[3.10] bpo-45126: Fix ref. leak in `sqlite3.Connection.__init__` (GH-28231). (GH-28298)
https://github.com/python/cpython/commit/5d28bb699a305135a220a97ac52e90d9344a3004
msg406406 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-11-16 14:53
New changeset 9d6215a54c177a5e359c37ecd1c50b594b194f41 by Erlend Egeberg Aasland in branch 'main':
bpo-45126: Harden `sqlite3` connection initialisation (GH-28227)
https://github.com/python/cpython/commit/9d6215a54c177a5e359c37ecd1c50b594b194f41
msg406419 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-11-16 17:30
The remaining PR (GH 28234) deprecates reinitialisation. I'm not sure it is a good idea to pursue this, so I'm leaning towards closing this as fixed, and also closing GH 28234.
msg406838 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2021-11-23 13:11
I think it's a good idea, but without the "will be disallowed in Python 3.13" part -- we should tell people that it's discouraged, but there's not much point in removing it.

But there's no consensus whether that's a good way to handle things, in general. So I'll leave it up to you.
History
Date User Action Args
2021-11-23 13:11:11petr.viktorinsetstatus: pending -> open

messages: + msg406838
2021-11-17 08:54:50erlendaaslandsetstatus: open -> pending
2021-11-16 17:30:06erlendaaslandsetmessages: + msg406419
2021-11-16 14:53:43petr.viktorinsetmessages: + msg406406
2021-10-22 13:07:30erlendaaslandsetpull_requests: + pull_request27435
2021-10-13 19:14:15DiddiLeijasetnosy: + DiddiLeija
2021-09-13 04:07:43miss-islingtonsetmessages: + msg401684
2021-09-13 03:16:17corona10setmessages: + msg401682
2021-09-13 03:16:12miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request26716
2021-09-12 15:54:56erlendaaslandsetpull_requests: + pull_request26712
2021-09-12 12:28:00corona10setnosy: + corona10
messages: + msg401666
2021-09-10 08:08:42erlendaaslandsetmessages: + msg401561
title: [sqlite3] cleanup and harden connection init -> [sqlite3] cleanup and harden Connection and Cursor __init__
2021-09-08 10:42:49erlendaaslandsetmessages: + msg401371
2021-09-08 10:38:46erlendaaslandsetpull_requests: + pull_request26654
2021-09-08 10:20:06erlendaaslandsetpull_requests: + pull_request26651
2021-09-08 10:00:53erlendaaslandsetmessages: + msg401366
2021-09-08 09:31:56petr.viktorinsetmessages: + msg401363
2021-09-08 09:26:52erlendaaslandsetmessages: + msg401362
2021-09-08 09:26:17petr.viktorinsetmessages: + msg401361
2021-09-08 09:06:15erlendaaslandsetmessages: + msg401359
2021-09-08 08:36:07erlendaaslandsetmessages: + msg401358
2021-09-08 08:33:48erlendaaslandsetmessages: + msg401357
2021-09-08 08:29:45erlendaaslandsetmessages: + msg401356
2021-09-08 07:45:28petr.viktorinsetmessages: + msg401352
2021-09-07 21:21:45erlendaaslandsetkeywords: + patch
stage: patch review
pull_requests: + pull_request26648
2021-09-07 11:46:02erlendaaslandcreate