This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Title: [sqlite3] proposal: add sqlite3.Cursor.insert() method
Type: enhancement Stage: resolved
Components: Extension Modules Versions: Python 3.9
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: AR, erlendaasland
Priority: normal Keywords:

Created on 2021-08-16 18:01 by AR, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (4)
msg399663 - (view) Author: AR (AR) Date: 2021-08-16 18:00
I propose to add an insert method to a SQLite cursor object, believe this could greatly improve readability of python code when you have to do a lot of insert operations to a different tables. 

Currently we have to create a dictionary with SQL's for each table:
insert_sql = {'form_a': """INSERT INTO table_a (field_a1, field_a2) VALUES(:f1, :f2)""",
              'form_b': """INSERT INTO table_b (field_b1, field_b2, field_b3) VALUES(:f1, :f2, :f3),
               ....other SQL statements }"""
or we may use version with unnamed parameters:
insert_sql = {'form_a': """INSERT INTO table_a (field_a1, field_a2) VALUES(?, ?)""",
              'form_b': """INSERT INTO table_b (field_b1, field_b2, field_b3) VALUES(?, ?, ?),
               ....other SQL statements }"""
The first one is conveniently compatible with cx_Oracle bind variable syntax, rows that are inserted are essentially dictionary. As dictionary is a mutable type, some extra freedom during construction of the row is allowed.

The second syntax(qmark style) is specific to SQLite, rows that are inserted should have tuple type (namedtuple for us to be able to extract field names at later stage).
entries_dict = [{'field_a1': 100, 'field_a2': 'sample1'},
                {'field_a1': 500, 'field_a2': 'sample2'}]
DataRow = namedtuple('DataRow', ['field_a1', 'field_a2'])
entries_namedtuple = [DataRow(101, 'sample3'), DataRow(505, 'sample4')]

In order to do an insert, you have to use either execute, or executemany:
cursor.executemany(insert_sql['form_a'], entries_dict)
cursor.execute(insert_sql['form_a'], entries_dict[0])

Now let's move towards future insert method of cursor. As a first step, lets create SQL statement on the fly:
table_name = 'table_a'

#in case of a list of dictionaries:
sql = """INSERT INTO {} ({}) VALUES({})""".format(table_name, ', '.join([str(key) for key in entries_dict[0]]),
                                    ', '.join([':' + str(key) for key in entries_dict[0]]))
#currently, to do an insert operation, we have to:
cursor.executemany(sql, entries_dict)

#in case of a list of namedtuples:
sql = """INSERT INTO {} ({}) VALUES({})""".format(table_name, ', '.join([str(field) for field in entries_namedtuple[0]._fields]),
                                    ', '.join(['?' for field in entries_namedtuple[0]._fields]))
#currently, to do an insert operation, we have to:
cursor.executemany(sql, entries_namedtuple)

Now back to the proposal of insert method with unified syntax (one/many and dict/namedtuple). Let's do a second step and add an Insert method to a Cursor. 
The idea is to provide this method with table name, extract column names from supplied dict/namedtuple and use SQL generators from above. Than we could replace old cursor.executemany syntax with:

cursor.insert(table_name, entries_dict)
cursor.insert(table_name, entries_dict[0])
cursor.insert(table_name, entries_tuple)

Since we may insert all, or any row of two types, this looks even more pythonic than pymongo(MongoDB) approach:
Actually, the fact that pymongo insert code is so much cleaner and concise drew my attention. Other aspects of that lib are totally different story.

I do not propose to generalize, or to move towards ORM or pymongo way of doing things. The scope is limited - lets do a convenient insert.

Simplified implementation could be like this:

    def insert(self, table_name, entries):
        if(type(entries) == list):             # several records(rows) need to be inserted
            do_insert = self.executemany
            if(hasattr(entries[0], '_fields')): #NamedTuple
                sql = "INSERT INTO {} ({}) VALUES({})".format(table_name, ', '.join([str(field) for field in entries[0]._fields]),
                                        ', '.join(['?' for field in entries[0]._fields]))
            elif(type(entries[0] == dict)):     #dict
                sql = "INSERT INTO {} ({}) VALUES({})".format(table_name, ', '.join([str(key) for key in entries[0]]),
                                        ', '.join([':' + str(key) for key in entries[0]]))
        else:                                   #just one record(row)
            do_insert = self.execute
            if(hasattr(entries, '_fields')):    #NamedTuple
                sql = "INSERT INTO {} ({}) VALUES({})".format(table_name, ', '.join([str(field) for field in entries._fields]),
                                        ', '.join(['?' for field in entries._fields]))
            elif(type(entries == dict)):
                sql = "INSERT INTO {} ({}) VALUES({})".format(table_name, ', '.join([str(key) for key in entries]),
                                        ', '.join([':' + str(key) for key in entries]))
        do_insert(sql, entries)    

If proposal is not feasible/doesn’t fit to a broad concept, I suggest to mention in documentation
- list comprehension one-line SQL-generators (see above)
- remind users who create list of dictionaries for bulk insert that a copy of the dict should be used. Otherwise all dicts inside a list would be the same
entries_dict.append(entry_dict.copy()). Definitely, as namedtuple is immutable, no need for extra steps for a list of namedtuples.
msg399670 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-08-16 19:04
The INSERT statement comes in many varieties. What about INSERT OR [ABORT,FAIL,IGNORE,REPLACE,ROLLBACK], INSERT WITH RECURSIVE, INSERT INTO table SELECT, etc.? Why only add such an API for insert; why not replace() as well?

I'm not convinced this is a good idea. It sounds to me like a project specific need that would need a project specific implementation. And as you showed, the implementation is pretty straight-forward.
msg399673 - (view) Author: AR (AR) Date: 2021-08-16 19:36
There is no good solution to this. Either a move to all-in-one solution and ORM at the end, or limited scope hack that allow for a convenient "just insert". I do not insist due to this controversy

I just didn't realize the difference until I've noticed that one line of my pymongo code corresponds to many lines of sqlite code. I'm perfectly fine using direct SQL statements, but would happily use a shortcut.

It's up to you to decide as you see a broader picture. Thank you in any case!
msg404293 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-10-19 12:31
I'm closing this for now. As I see it, there is little added value, and it would be hard to make a good API for all invariants. If this idea surfaces again, we can always reopen this issue.

Thanks for you interest in improving CPython :)
Date User Action Args
2022-04-11 14:59:48adminsetgithub: 89090
2021-10-19 12:31:11erlendaaslandsetstatus: open -> closed
resolution: rejected
messages: + msg404293

stage: resolved
2021-08-16 19:36:23ARsetmessages: + msg399673
2021-08-16 19:30:34erlendaaslandsettitle: [sqlite3] insert -> [sqlite3] proposal: add sqlite3.Cursor.insert() method
2021-08-16 19:04:54erlendaaslandsetmessages: + msg399670
2021-08-16 18:15:32erlendaaslandsetnosy: + erlendaasland
2021-08-16 18:01:00ARcreate