Issue 39170: Sqlite3 row_factory for attribute access: NamedRow

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/83351

classification

Title:	Sqlite3 row_factory for attribute access: NamedRow
Type:	enhancement	Stage:	patch review
Components:	Library (Lib)	Versions:	Python 3.9

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	erlendaasland, jidn
Priority:	normal	Keywords:	patch

Created on 2019-12-31 06:57 by jidn, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL	Status	Linked	Edit
PR 17768	open	jidn, 2019-12-31 07:17

Messages (2)
msg359104 - (view)	Author: Clinton James (jidn) *	Date: 2019-12-31 06:57
Currently, sqlite3 returns rows by tuple or sqlite3.Row for dict-style, index access. I constantly find myself wanting attribute access like namedtuple for rows. I find attribute access cleaner without the brackets and quoting field names. However, unlike previous discussions (https://bugs.python.org/issue13299), I don't want to use the namedtuple object. I appreciate the simple API and minimal memory consumption of sqlite3.Row and used it as my guide in creating sqlite3.NamedRow to allow access by index and attribute. A pull request is ready Why a new object instead of adding attribute access to the existing sqlite3.Row? There is an existing member method `keys` and any table with the field "keys" would cause a hard to debug, easily avoidable, collision. Features: + Optimized in C, so it will be faster than any python implementation. + Access columns by attribute for all valid names and by index for all names. + Iterate over fields by name/value pairs. + Works with standard functions `len` and `contains`. + Identical memory consumption to sqlite3.Row with two references: the data tuple and the cursor description. + Identical speed to sqlite3.Row if not faster. Timing usually has it slightly faster for index by name or attribute, but it is almost identical. Examples: >>> import sqlite3 >>> c = sqlite3.Connection(":memory:").cursor() >>> c.row_factory = sqlite3.NamedRow >>> named_row = c.execute("SELECT 'A' AS letter, '.-' AS morse, 65 AS ord").fetchone() >>> len(named_row) 3 >>> 'letter' in named_row true >>> named_row == named_row true >>> hash(named_row) 5512444875192833987 Index by number and range. >>> named_row[0] 'A' >>> named_row[1:] ('.-', 65) Index by column name. >>> named_row["ord"] 65 Access by attribute. >>> named_row.morse '.-' Iterate row for name/value pairs. >>> dict(named_row) {'letter': 'A', 'morse': '.-', 'ord': 65} >>> tuple(named_row) (('letter', 'A'), ('morse', '.-'), ('ord', 65)) How sqlite3.NamedRow differs from sqlite3.Row ---------------------------------------------- The class only has class dunder methods to allow any valid field name. When the field name would be an invalid attribute name, you have two options: either use the SQL `AS` in the select statement or index by name. To get the field names use the iterator `[x[0] for x in row]` or do the same from the `cursor.description`. ```python titles = [x[0] for x in row] titles = [x[0] for x in cursor.description] titles = dict(row).keys() ``` Attribute and dict access are no longer case-insensitive. There are three reasons for this. 1. Case-insensitive comparison only works well for ASCII characters. In a Unicode world, case-insensitive edge cases create unnecessary errors. Looking at a several existing codebases, this feature of Row is almost never used and I believe is not needed in NamedRow. 2. Case-insensitivity is not allowed for attribute access. This "feature" would treat attribute access differently from the rest of Python and "special cases aren't special enough to break the rules". Where `row.name`, `row.Name`, and `row.NAME` are all the same it gives off the faint code smell of something wrong. When case-insensitively is needed and the query SELECT can not be modified, sqlite3.Row is still there. 3. Code is simpler and easier to maintain. 4. It is faster. Timing Results -------------- NamedRow is faster than sqlite3.Row for index-by-name access. I have published a graph and the methodology of my testing. In the worst-case scenario, it is just as fast as sqlite3.Row without any extra memory. In most cases, it is faster. For more information, see the post at http://jidn.com/2019/10/namedrow-better-python-sqlite3-row-factory/
msg394911 - (view)	Author: Erlend E. Aasland (erlendaasland) *	Date: 2021-06-02 14:17
See also bpo-13299

History
Date	User	Action	Args
2022-04-11 14:59:24	admin	set	github: 83351
2021-06-02 14:17:40	erlendaasland	set	messages: + msg394911
2020-05-24 21:58:28	erlendaasland	set	nosy: + erlendaasland
2019-12-31 07:17:04	jidn	set	keywords: + patch stage: patch review pull_requests: + pull_request17204
2019-12-31 06:57:54	jidn	create