classification
Title: Sqlite3 row_factory for attribute access: NamedRow
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: erlendaasland, jidn
Priority: normal Keywords: patch

Created on 2019-12-31 06:57 by jidn, last changed 2021-06-02 14:17 by erlendaasland.

Pull Requests
URL Status Linked Edit
PR 17768 open jidn, 2019-12-31 07:17
Messages (2)
msg359104 - (view) Author: Clinton James (jidn) * Date: 2019-12-31 06:57
Currently, sqlite3 returns rows by tuple or sqlite3.Row for dict-style, index access.  I constantly find myself wanting attribute access like namedtuple for rows.  I find attribute access cleaner
without the brackets and quoting field names.  However, unlike previous discussions (https://bugs.python.org/issue13299), I don't want to use the namedtuple object.  I appreciate the simple API and minimal memory consumption of sqlite3.Row and used it as my guide in creating sqlite3.NamedRow to allow access by index and attribute.

A pull request is ready

Why a new object instead of adding attribute access to the existing sqlite3.Row?
There is an existing member method `keys` and any table with the field "keys" would cause a hard to debug, easily avoidable, collision.

Features:

+ Optimized in C, so it will be faster than any python implementation.
+ Access columns by attribute for all valid names and by index for all names.
+ Iterate over fields by name/value pairs.
+ Works with standard functions `len` and `contains`.
+ Identical memory consumption to sqlite3.Row with two references: the data tuple and the cursor description.
+ Identical speed to sqlite3.Row if not faster.  Timing usually has it slightly faster for index by name or attribute, but it is almost identical.

Examples:

    >>> import sqlite3
    >>> c = sqlite3.Connection(":memory:").cursor()
    >>> c.row_factory = sqlite3.NamedRow
    >>> named_row = c.execute("SELECT 'A' AS letter, '.-' AS morse, 65 AS ord").fetchone()

    >>> len(named_row)
    3
    >>> 'letter' in named_row
    true
    >>> named_row == named_row
    true
    >>> hash(named_row)
    5512444875192833987

Index by number and range.
    >>> named_row[0]
    'A'
    >>> named_row[1:]
    ('.-', 65)

Index by column name.
    >>> named_row["ord"]
    65

Access by attribute.
    >>> named_row.morse
    '.-'

Iterate row for name/value pairs.
    >>> dict(named_row)
    {'letter': 'A', 'morse': '.-', 'ord': 65}
    >>> tuple(named_row)
    (('letter', 'A'), ('morse', '.-'), ('ord', 65))


How sqlite3.NamedRow differs from sqlite3.Row
----------------------------------------------

The class only has class dunder methods to allow any valid field name. When the field name would be an invalid attribute name, you have two options: either use the SQL `AS` in the select statement or index by name.
To get the field names use the iterator `[x[0] for x in row]` or do the same from the
`cursor.description`.

```python
    titles = [x[0] for x in row]
    titles = [x[0] for x in cursor.description]
    titles = dict(row).keys()
```

Attribute and dict access are no longer case-insensitive.  There are three reasons for this.
    1. Case-insensitive comparison only works well for ASCII characters.  In a Unicode world, case-insensitive edge cases create unnecessary errors. Looking at a several existing codebases,
       this feature of Row is almost never used and I believe is not needed in NamedRow.
    2. Case-insensitivity is not allowed for attribute access.  This "feature" would treat attribute access differently from the rest of Python and "special cases aren't special enough to break the rules". Where `row.name`, `row.Name`, and `row.NAME` are all the same it gives off the faint code smell of something wrong.  When case-insensitively is needed and the query SELECT can not be modified, sqlite3.Row is still there.
    3. Code is simpler and easier to maintain.
    4. It is faster.

Timing Results
--------------

NamedRow is faster than sqlite3.Row for index-by-name access.
I have published a graph and the methodology of my testing.  In the worst-case scenario, it is just as fast as sqlite3.Row without any extra memory.  In most cases, it is faster.
For more information, see the post at http://jidn.com/2019/10/namedrow-better-python-sqlite3-row-factory/
msg394911 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-06-02 14:17
See also bpo-13299
History
Date User Action Args
2021-06-02 14:17:40erlendaaslandsetmessages: + msg394911
2020-05-24 21:58:28erlendaaslandsetnosy: + erlendaasland
2019-12-31 07:17:04jidnsetkeywords: + patch
stage: patch review
pull_requests: + pull_request17204
2019-12-31 06:57:54jidncreate