This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author Ingo Ruhnke
Recipients Ingo Ruhnke
Date 2016-12-20.09:24:50
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1482225891.27.0.11020777474.issue29021@psf.upfronthosting.co.za>
In-reply-to
Content
When a sqlite database contains invalid UTF-8 code in a TEXT column, Python can query that data normally when .text_factory is set appropriately. However when a custom function is created with .create_function() and applied to that column the custom function will receive 'None' as argument instead of the value of the column.

The following example demonstrate the issue:

Example:
--------

import sqlite3
import sys
import os

con = sqlite3.connect(":memory:")
con.text_factory = os.fsdecode

con.create_function("py_identity", 1, lambda x: x)

cur = con.cursor()
cur.execute("CREATE TABLE foo(bar TEXT)")

# insert some invalid UTF-8 into the database
cur.execute("INSERT INTO foo(bar) VALUES(cast(? AS TEXT))", [b"\xff"])

# try to call a custom function on the invalid UTF-8
cur.execute("SELECT "
            "  typeof(bar), "
            "  bar, " # this works
            "  py_identity(bar), " # this returns None instead of the content of 'bar'
            "  cast(py_identity(cast(bar as BLOB)) AS TEXT) " # this works around the issue
            "FROM foo")

for row in cur:
    print(row)

Output:
-------

('text', '\udcff', None, '\udcff')


Expected:
---------

('text', '\udcff', '\udcff', '\udcff')
History
Date User Action Args
2016-12-20 09:24:51Ingo Ruhnkesetrecipients: + Ingo Ruhnke
2016-12-20 09:24:51Ingo Ruhnkesetmessageid: <1482225891.27.0.11020777474.issue29021@psf.upfronthosting.co.za>
2016-12-20 09:24:51Ingo Ruhnkelinkissue29021 messages
2016-12-20 09:24:50Ingo Ruhnkecreate