Issue 43290: [sqlite3] remove legacy code from pysqlite_step

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/87456

classification

Title:	[sqlite3] remove legacy code from pysqlite_step
Type:	enhancement	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.10

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	berker.peksag, christian.heimes, corona10, erlendaasland, serhiy.storchaka
Priority:	normal	Keywords:	patch

Created on 2021-02-21 23:16 by erlendaasland, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 24638	merged	erlendaasland, 2021-02-24 11:17

Messages (18)
msg387476 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-21 23:16
pysqlite_step() contains a NULL check needed for SQLite 3.5 and earlier. This can be removed.
msg387619 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-24 11:08
From the SQLite 3.5.3 changelog: - sqlite3_step() returns SQLITE_MISUSE instead of crashing when called with a NULL parameter.
msg387623 - (view)	Author: Dong-hee Na (corona10) *	Date: 2021-02-24 15:02
Hmm by the way the current implementation returns SQLITE_OK if the statement is NULL, but it looks like return SQLITE_MISUSE if we apply this patch. Does it not cause any behavior regression? if so we should add news also.
msg387627 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-24 16:02
I’ll check all uses and see if we’ve got everything covered by the test suite. This is one of the core functions of the sqlite3 module (all queries call step at least once, but often multiple times), so I’d expect coverage is pretty good.
msg387639 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-24 21:09
There are six users of pysqlite_step(): $ grep -nrE "\<pysqlite_step\>" Modules/_sqlite Modules/_sqlite/util.c:27:int pysqlite_step(sqlite3_stmt* statement, pysqlite_Connection* connection) Modules/_sqlite/connection.c:393: rc = pysqlite_step(statement, self); Modules/_sqlite/connection.c:442: rc = pysqlite_step(statement, self); Modules/_sqlite/connection.c:493: rc = pysqlite_step(statement, self); Modules/_sqlite/util.h:32:int pysqlite_step(sqlite3_stmt* statement, pysqlite_Connection* connection); Modules/_sqlite/cursor.c:519: rc = pysqlite_step(self->statement->st, self->connection); Modules/_sqlite/cursor.c:715: rc = pysqlite_step(statement, self->connection); Modules/_sqlite/cursor.c:787: rc = pysqlite_step(self->statement->st, self->connection); The three users in Modules/_sqlite/connection.c — _pysqlite_connection_begin(), pysqlite_connection_commit_impl(), and pysqlite_connection_rollback_impl() – are all ok, following this pattern: 1) prepare the statement 2) verify that prepare was successful, bail if not 3) call step pysqlite_cursor_executescript() (line 715 in Modules/_sqlite/cursor.c) is also ok: 1) prepare the statement 2) verify that prepare was successful, bail if not 3) call step until there are no more rows I need a little bit more time to verify _pysqlite_query_execute() and pysqlite_cursor_iternext(). Note to self: pysqlite_cursor_executescript() calls sqlite3_finalize() three times. It would have been better to break out of the loop when rc != SQLITE_ROW, immediately call sqlite3_finalize() (error code is preserved if sqlite3_step() failed), and then check rc and PyErr_Occurred(). It will make the code easier to follow, IMO.
msg387641 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-24 22:52
pysqlite_cursor_iternext() has four users: - sqlite3.Cursor.fetchone() - sqlite3.Cursor.fetchall() - sqlite3.Cursor.fetchmany() - sqlite3.Cursor.__next__() All of these methods pass self to pysqlite_cursor_iternext(). pysqlite_cursor_iternext() starts by checking the state of the cursor (is it initialised, it the database open, etc.). If there is a Statement object, pysqlite_step() is called with the sqlite3_stmt pointer from the Statement object (Cursor->statement->st). The statement pointer of the Cursor object is set in _pysqlite_query_execute() – the last pysqlite_step() user – either from the LRU cache (line 470), or by creating a new Statement object (line 479). The latter only leaves a valid Cursor->statement->st pointer (sqlite3_stmt pointer) if the Statement object was successfully created, and the sqlite3_stmt successfully prepared. (I assume only valid Statement objects are added to the cache.) Before the main loop of _pysqlite_query_execute() starts, the statement is reset. In the loop, the next parameter set is fetched, the statement is (re)bound, and step is called. If Cursor.execute() called _pysqlite_query_execute(), the parameter list is initialised to a single-item list, and the loop is only run once. From what I can read, this function is also safe. (But it is very messy; for instance, if there's an active Statement, it is reset twice before the loop starts.) I tried forcing an error by using an uninitialised cursor: >>> cx = sqlite3.connect(":memory:") >>> cu = sqlite3.Cursor.__new__(sqlite3.Cursor) >>> sqlite3.Cursor.fetchone(cu) Traceback (most recent call last): File "<stdin>", line 1, in <module> sqlite3.ProgrammingError: Base Cursor.__init__ not called. >>> next(cu) Traceback (most recent call last): File "<stdin>", line 1, in <module> sqlite3.ProgrammingError: Base Cursor.__init__ not called. Ditto for fetchmany() and fetchall(). This is consistent with current behaviour. Calling fetch*() without first executing a statement: >>> cu = cx.cursor() >>> cu.fetchone() >>> cu.fetchmany() [] >>> cu.fetchall() [] >>> next(cu) Traceback (most recent call last): File "<stdin>", line 1, in <module> StopIteration This is consistent with current behaviour. I might have missed something, but from what I can see, there are no paths that lead to pysqlite_step() being called with a NULL pointer. Berker, Serhiy, please correct me if I'm wrong.
msg387643 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-24 23:23
Regarding test coverage: _pysqlite_query_execute() has one coverage gap: lines 478-488 (if (self->statement->in_use)) are never executed. Except from that, the only missing are the hard-to-trigger goto error's (for example PyList_New and PyList_Append failures). I'm not sure if it's possible to trigger this, but I cannot see that it is relevant for our case here. Coverage for the other users of pysqlite_step is as complete as it can be; only the standard corner error cases are missing.
msg387648 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 08:08
> _pysqlite_query_execute() has one coverage gap: lines 478-488 (if (self->statement->in_use)) are never executed. I wonder if it is possible at all to reach this branch. If it is not, then I'm pretty sure Cursor.in_use is redundant, and all code relating to it can be removed.
msg387657 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 11:12
> I wonder if it is possible at all to reach this branch. If it is not, then I'm pretty sure Cursor.in_use is redundant Typo: should be Statement.in_use Corner error cases may cause the _pysqlite_query_execute() loop to exit without pysqlite_statement_reset() being called, thus leaving Statement.in_use == 1, but when _pysqlite_query_execute() is called again and if there's an active statement, pysqlite_statement_reset() will reset in_use to zero, before we ever reach the if (self->statement->in_use) { ... } statement. I can open a separate issue for considering removing Statement.in_use. > (I assume only valid Statement objects are added to the cache.) AFAICS, this is true. We can wait for Berker and/or Serhiy to confirm/correct my assumptions and findings.
msg387662 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2021-02-25 12:07
Look at issue in which the workaround was added. Does it contain some examples?
msg387663 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 12:08
> Look at issue in which the workaround was added. Does it contain some examples? I'll check. Thanks.
msg387664 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 12:28
Introduced by Christian Heimes 2007-12-14 in commit 380532117c2547bb0dedf6f85efa66d18a9abb88, which is a merge from SVN (?) checkin r59471 by Gerhard Häring (https://mail.python.org/pipermail/python-checkins/2007-December/063968.html) This corresponds to https://github.com/ghaering/pysqlite/commit/61b3cd9381750bdd96f8f8fc2c1b19f1dc4acef7, with a simple test added two commits later, https://github.com/ghaering/pysqlite/commit/7b0faed4ababbf1053caa2f5b2efc15929f66c4f That test was added to CPython in 2008-03-29 by Gerhard in commit e7ea7451a84636655927da4b9731d2eb37d1cf34, and it's still here.
msg387665 - (view)	Author: Christian Heimes (christian.heimes) *	Date: 2021-02-25 12:32
Back in the day I was of several core devs that took care of syncing code between Python 2 and 3 branches with a tool called "svnmerge". Commit 380532117c2547bb0dedf6f85efa66d18a9abb88 is a svnmerge commit. The tool synced changesets in batches, which makes it harder to bisect changes.
msg387668 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 12:58
Ah, at last I found the source of the confusion: The SQLite changelog. Quoting from msg387619 and https://sqlite.org/changes.html: > From the SQLite 3.5.3 changelog: > - sqlite3_step() returns SQLITE_MISUSE instead of crashing when called with a NULL parameter. I assumed this was correct without even trying it. This short snippet shows something else: int rc = sqlite3_reset(NULL); printf("reset with NULL: %d %s\n", rc, sqlite3_errstr(rc)); $ ./a.out reset with NULL: 0 not an error Gerhard's comment was right and the workaround was right. I'll adjust the comment. Dong-hee Na: > Hmm by the way the current implementation returns SQLITE_OK if the statement is NULL, but it looks like return SQLITE_MISUSE if we apply this patch. > Does it not cause any behavior regression? if so we should add news also. Behaviour stays the same; no regressions introduced. I learned a lot about the sqlite3 module, and I relearned I should not trust changelogs/documentation without trying stuff myself first. I'll adjust the erroneous comment and re-request a review, Dong-hee.
msg387669 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 13:09
> int rc = sqlite3_reset(NULL); > printf("reset with NULL: %d %s\n", rc, sqlite3_errstr(rc)); Sorry, wrong test: int rc = sqlite3_step(NULL); printf("step with NULL: %d %s\n", rc, sqlite3_errstr(rc)); $ ./a.out step with NULL: 21 bad parameter or other API misuse
msg387670 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 13:22
msg387668 was a little bit hasty. I'll try again: Dong-hee Na: > Hmm by the way the current implementation returns SQLITE_OK if the statement is NULL, but it looks like return SQLITE_MISUSE if we apply this patch. > Does it not cause any behavior regression? if so we should add news also. No, behaviour still stays the same; no regressions introduced. The bug is triggered by executing an empty statement. This will pass an empty statement to pysqlite_step() in line 519 of Modules/_sqlite/cursor.c. Earlier, this returned SQLITE_OK. sqlite3_column_count(NULL) returns 0, so we slide through the rest of the loop without much happening. Now, pysqlite_step() returns SQLITE_MISUSE, which only results in the statement being reset in line 529, and the error cleared in line 530. Then we bail out of the loop. So, the current comment is correct, the SQLite changelog was correct, the workaround and old comment was wrong. I'm done :)
msg387688 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-02-25 21:43
Addendum to msg387641: > The latter only leaves a valid Cursor->statement->st pointer (sqlite3_stmt pointer) if the Statement object was successfully created, and the sqlite3_stmt successfully prepared. sqlite3_prepare_v2() actually returns SQLITE_OK but sets the statement pointer to NULL, if given an empty string or a comment. Only the sqlite3_prepare_v2() return code is checked in order to determine if pysqlite_statement_create() was successful or not.
msg387694 - (view)	Author: Berker Peksag (berker.peksag) *	Date: 2021-02-25 23:39
New changeset 91ea37c84af2dd5ea92802a4c2adad47861ac067 by Erlend Egeberg Aasland in branch 'master': bpo-43290: Remove workaround from pysqlite_step() (GH-24638) https://github.com/python/cpython/commit/91ea37c84af2dd5ea92802a4c2adad47861ac067

History
Date	User	Action	Args
2022-04-11 14:59:41	admin	set	github: 87456
2021-02-25 23:40:16	berker.peksag	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2021-02-25 23:39:48	berker.peksag	set	messages: + msg387694
2021-02-25 21:43:57	erlendaasland	set	messages: + msg387688
2021-02-25 13:22:09	erlendaasland	set	messages: + msg387670
2021-02-25 13:09:19	erlendaasland	set	messages: + msg387669
2021-02-25 12:58:16	erlendaasland	set	messages: + msg387668
2021-02-25 12:32:39	christian.heimes	set	nosy: + christian.heimes messages: + msg387665
2021-02-25 12:28:23	erlendaasland	set	messages: + msg387664
2021-02-25 12:08:37	erlendaasland	set	messages: + msg387663
2021-02-25 12:07:53	serhiy.storchaka	set	messages: + msg387662
2021-02-25 11:12:54	erlendaasland	set	messages: + msg387657
2021-02-25 08:08:45	erlendaasland	set	messages: + msg387648
2021-02-24 23:23:25	erlendaasland	set	messages: + msg387643
2021-02-24 22:52:56	erlendaasland	set	messages: + msg387641
2021-02-24 21:09:34	erlendaasland	set	messages: + msg387639
2021-02-24 16:02:23	erlendaasland	set	messages: + msg387627
2021-02-24 15:02:47	corona10	set	nosy: + corona10 messages: + msg387623
2021-02-24 11:17:42	erlendaasland	set	keywords: + patch stage: patch review pull_requests: + pull_request23423
2021-02-24 11:08:07	erlendaasland	set	messages: + msg387619
2021-02-21 23:16:09	erlendaasland	create