
Author petri.lehtinen
Recipients petri.lehtinen, pitrou
Date 2012-02-01.20:57:36
Message-id <1328129858.32.0.164852943165.issue13921@psf.upfronthosting.co.za>
Content
Connection.text_factory can be used to control what objects are
returned for the TEXT data type. An excerpt from the docs:

    For efficiency reasons, there’s also a way to return str
    objects only for non-ASCII data, and bytes otherwise. To
    activate it, set this attribute to sqlite3.OptimizedUnicode.

However, on Python 3 it always returns str (Unicode) objects,
even for ASCII-only data. There is even a test for this feature,
and the test is plainly wrong:

    def CheckOptimizedUnicode(self):
        self.con.text_factory = sqlite.OptimizedUnicode
        austria = "Österreich"
        germany = "Deutchland"
        a_row = self.con.execute("select ?", (austria,)).fetchone()
        d_row = self.con.execute("select ?", (germany,)).fetchone()
        self.assertTrue(type(a_row[0]) == str, "type of non-ASCII row must be str")
        self.assertTrue(type(d_row[0]) == str, "type of ASCII-only row must be str")

It checks for str in both cases even though it should test for
bytes in the latter case.
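
A minimal sketch for observing the behaviour outside the test suite,
assuming an in-memory database. The helper name text_types is
hypothetical, and the fallback to str is an assumption made only so
the snippet still runs on Python versions that do not provide
sqlite3.OptimizedUnicode:

```python
import sqlite3
import warnings


def text_types(factory):
    """Return the types produced for a non-ASCII and an ASCII-only TEXT value."""
    con = sqlite3.connect(":memory:")
    try:
        con.text_factory = factory
        non_ascii = con.execute("select ?", ("Österreich",)).fetchone()[0]
        ascii_only = con.execute("select ?", ("Deutschland",)).fetchone()[0]
        return type(non_ascii), type(ascii_only)
    finally:
        con.close()


# Assumption: fall back to plain str where OptimizedUnicode is unavailable.
# Some versions emit a DeprecationWarning on access, so silence it here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    factory = getattr(sqlite3, "OptimizedUnicode", str)

non_ascii_type, ascii_type = text_types(factory)
print(non_ascii_type, ascii_type)
```

If the documented behaviour were implemented, the second type would
be bytes. On Python 3 both values should come back as str.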

---

The user can get bytes by asking for them explicitly. Having the
library mix bytes and str on its own makes things harder for the
user. Furthermore, I don't really buy the "efficiency" argument
here, so I'd vote for removing OptimizedUnicode entirely. It has
never worked on Py3k, so removing it would be safe.
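
For illustration, asking for bytes explicitly could look like the
sketch below. It assumes an in-memory database and sets
Connection.text_factory to bytes, so every TEXT value is returned
as bytes regardless of its content:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Explicit opt-in: all TEXT values are returned as bytes.
con.text_factory = bytes

austria = con.execute("select ?", ("Österreich",)).fetchone()[0]
germany = con.execute("select ?", ("Deutschland",)).fetchone()[0]
con.close()

print(type(austria), type(germany))
# The caller decides when and how to decode.
print(austria.decode("utf-8"))
```

Here the return type does not depend on whether the data happens
to be ASCII, which is what makes the explicit form easier to
reason about than a mixed bytes/str result.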