classification
Title: sqlite3: OptimizedUnicode obsolete in Py3k
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eric.araujo, ghaering, petri.lehtinen, pitrou, python-dev
Priority: normal Keywords: patch

Created on 2012-02-01 20:57 by petri.lehtinen, last changed 2012-02-09 19:13 by petri.lehtinen. This issue is now closed.

Files
File name         Uploaded
issue13921.patch  petri.lehtinen, 2012-02-02 16:12
Messages (10)
msg152441 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-02-01 20:57
Connection.text_factory can be used to control what objects are
returned for the TEXT data type. An excerpt from the docs:

    For efficiency reasons, there’s also a way to return str
    objects only for non-ASCII data, and bytes otherwise. To
    activate it, set this attribute to sqlite3.OptimizedUnicode.

However, it always returns Unicode strings now. There's even a
test for this feature which is obviously wrong:

    def CheckOptimizedUnicode(self):
        self.con.text_factory = sqlite.OptimizedUnicode
        austria = "Österreich"
        germany = "Deutchland"
        a_row = self.con.execute("select ?", (austria,)).fetchone()
        d_row = self.con.execute("select ?", (germany,)).fetchone()
        self.assertTrue(type(a_row[0]) == str, "type of non-ASCII row must be str")
        self.assertTrue(type(d_row[0]) == str, "type of ASCII-only row must be str")

It checks for str in both cases even though it should test for
bytes in the latter case.
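
A minimal sketch of the behavior described above, for anyone who wants to reproduce it outside the test suite. It assumes an in-memory database; the `getattr` fallback to `str` is an assumption added here because `sqlite3.OptimizedUnicode` is not available in every Python 3 release, and the warning filter only keeps the lookup quiet on versions that deprecate the name.

```python
import sqlite3
import warnings

# Look up OptimizedUnicode if this Python still has it; otherwise fall
# back to str (assumption: the name may be missing in newer releases).
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    text_factory = getattr(sqlite3, "OptimizedUnicode", str)

con = sqlite3.connect(":memory:")
con.text_factory = text_factory

# One non-ASCII and one ASCII-only value, as in the test quoted above.
a_row = con.execute("select ?", ("Österreich",)).fetchone()
d_row = con.execute("select ?", ("Deutschland",)).fetchone()

# On Python 3 both values come back as str, never bytes.
print(type(a_row[0]), type(d_row[0]))
con.close()
```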

---

The user can get bytes if they want to by saying so explicitly.
Having the library mix bytes and unicode by itself makes things
harder for the user. Furthermore, I don't really buy
the "efficiency" reason here, so I'd vote for removing the whole
OptimizedUnicode thing. It has never worked in Py3k, so removing
it would be safe.
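
For reference, asking for bytes explicitly could look like the sketch below. It assumes an in-memory database and uses only `Connection.text_factory`; the variable names are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Default text_factory is str: TEXT values come back as str.
default_value = con.execute("select ?", ("Österreich",)).fetchone()[0]

# Explicit opt-in: every TEXT value now comes back as UTF-8 encoded bytes,
# regardless of whether it is ASCII-only.
con.text_factory = bytes
bytes_value = con.execute("select ?", ("Österreich",)).fetchone()[0]
ascii_value = con.execute("select ?", ("Deutschland",)).fetchone()[0]

print(type(default_value), type(bytes_value), type(ascii_value))
con.close()
```

With this the caller decides once which type they get, instead of the library switching types based on the content of each value.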
msg152442 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-02-01 21:02
+1 for removing, it makes no sense under Python 3.
msg152464 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-02-02 16:11
Attached a patch. It changes OptimizedUnicode to be an alias for PyUnicode_Type and adds a note to the documentation for porters from 2.x that it has no effect on py3k.

The patch removes or refactors all obsolete code related to OptimizedUnicode and allow_8bit_chars that was left over from the py3k transition. These removals and refactorings have no operational effect, so the module still works the same way it has always worked in Py3k.

Should OptimizedUnicode be deprecated, too? In this case, it cannot be aliased to str, and _pysqlite_fetch_one_row() needs to raise a DeprecationWarning if OptimizedUnicode is used.
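
To illustrate why a plain alias rules out a warning: once the name is bound to `str` itself, nothing can tell the two apart, so a warning needs a distinct object to check for. The Python sketch below is purely illustrative; the `OptimizedUnicode` subclass and `resolve_text_factory()` are hypothetical names, and the real check would live in C inside `_pysqlite_fetch_one_row()`.

```python
import warnings


class OptimizedUnicode(str):
    """Hypothetical distinct marker type; not how sqlite3 defines the name."""


def resolve_text_factory(factory):
    """Return the factory to use, warning if the legacy marker was passed."""
    if factory is OptimizedUnicode:
        warnings.warn(
            "OptimizedUnicode has no effect on Python 3; use str instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return str
    return factory


# With a plain alias (alias = str) the identity check above could never
# fire, because the alias and str are the same object.
alias = str
print(alias is str, OptimizedUnicode is str)
```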
msg152465 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-02-02 16:23
> Should OptimizedUnicode be deprecated, too?

I'd say just undocument it.
msg152466 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-02-02 16:43
> > Should OptimizedUnicode be deprecated, too?
> 
> I'd say just undocument it.

Even remove the note from the patch?
msg152468 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-02-02 16:51
On Thursday, 2 February 2012 at 16:43 +0000, Petri Lehtinen wrote:
> Petri Lehtinen <petri@digip.org> added the comment:
> 
> > > Should OptimizedUnicode be deprecated, too?
> > 
> > I'd say just undocument it.
> 
> Even remove the note from the patch?

Well, I guess keeping the note is fine.
msg152602 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-02-04 08:19
I’m not sure the doc note is useful, but I didn’t do a code search to confirm that.

Also, 3.2 may be out of bounds for this cleanup (I don’t know the rules for what can be committed in what branches these days).
msg152688 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-02-05 13:53
Éric Araujo wrote:
> I’m not sure the doc note is useful, but didn’t code search to
> confirm it.

Yeah. Perhaps it would be better as a comment in the code.

> Also, 3.2 may be out of bounds for this cleanup (I don’t know the
> rules for what can be committed in what branches these days).

My intention was to apply it to 3.3 only.
msg152975 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-02-09 19:11
New changeset 0fc10a33eb4c by Petri Lehtinen in branch 'default':
Undocument and clean up sqlite3.OptimizedUnicode
http://hg.python.org/cpython/rev/0fc10a33eb4c
msg152976 - Author: Petri Lehtinen (petri.lehtinen) * (Python committer) Date: 2012-02-09 19:13
Committed the patch after moving the documentation note into a source code comment instead. Thanks for the reviews.
History
Date                 User            Action  Args
2012-02-09 19:13:15  petri.lehtinen  set     keywords: - needs review; messages: + msg152976
2012-02-09 19:11:08  python-dev      set     status: open -> closed; nosy: + python-dev; messages: + msg152975; resolution: fixed; stage: patch review -> resolved
2012-02-05 13:53:24  petri.lehtinen  set     messages: + msg152688
2012-02-04 08:19:24  eric.araujo     set     nosy: + eric.araujo; messages: + msg152602; title: sqlite3: OptimizedUnicode doesn't work in Py3k -> sqlite3: OptimizedUnicode obsolete in Py3k
2012-02-02 16:51:47  pitrou          set     messages: + msg152468
2012-02-02 16:43:38  petri.lehtinen  set     messages: + msg152466
2012-02-02 16:23:01  pitrou          set     messages: + msg152465
2012-02-02 16:12:47  petri.lehtinen  set     keywords: + needs review; stage: patch review
2012-02-02 16:12:16  petri.lehtinen  set     files: + issue13921.patch; keywords: + patch
2012-02-02 16:11:56  petri.lehtinen  set     messages: + msg152464
2012-02-01 21:02:36  pitrou          set     nosy: + ghaering; messages: + msg152442
2012-02-01 20:57:37  petri.lehtinen  create