classification
Title: sqlite3 docs should mention utf8 requirement
Type: enhancement Stage:
Components: Documentation Versions: Python 2.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ghaering Nosy List: akuchling, amaury.forgeotdarc, asmodai, georg.brandl, ghaering, resolve1, vdupras
Priority: low Keywords: easy, patch

Created on 2008-02-16 14:08 by resolve1, last changed 2009-04-25 15:01 by georg.brandl. This issue is now closed.

Files
File name Uploaded Description
sqlite_connect_test.diff vdupras, 2008-02-28 15:56
sqlite_connect2.diff amaury.forgeotdarc, 2008-03-04 09:21 encode connection string with utf-8
Messages (15)
msg62456 - Author: Damien Elmes (resolve1) Date: 2008-02-16 14:08
The docs on http://docs.python.org/lib/sqlite3-Module-Contents.html
should mention that the connection string should always be UTF-8,
regardless of the encoding system of the underlying filesystem. See the
'note to windows users' on 

http://www.sqlite.org/c3ref/open.html
msg62558 - Author: Virgil Dupras (vdupras) (Python triager) Date: 2008-02-19 11:53
+1 I've been pulling my hair out over this one too. Try this on win32:

Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.mkdir(u'foo\xe9')
>>> import sqlite3
>>> con = sqlite3.connect(u'foo\xe9\\my.db')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3:
ordinal not in range(128)
>>> import sys
>>> sys.getfilesystemencoding()
'mbcs'
>>> con = sqlite3.connect(u'foo\xe9\\my.db'.encode(sys.getfilesystemencoding()))

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
sqlite3.OperationalError: unable to open database file
>>> con = sqlite3.connect(u'foo\xe9\\my.db'.encode('utf-8'))
>>>
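The first traceback in the session above comes from the implicit ASCII encoding of the unicode filename. A minimal sketch of that encoding step in isolation, written for Python 3 rather than the 2.5 session shown (the variable names are illustrative, not part of any patch in this issue):

```python
# Illustrative only: reproduce the encoding step outside of sqlite3.
path = 'foo\xe9\\my.db'

# An ASCII encode of this name fails, as in the first traceback above.
try:
    path.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding to UTF-8 succeeds for this name; UTF-8 is what the SQLite
# C API expects for filenames.
utf8_path = path.encode('utf-8')
```

Here `ascii_ok` ends up `False`, and `utf8_path` holds the two-byte UTF-8 sequence for the accented character.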
msg62611 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-02-21 09:16
IMO, connect() should accept unicode strings, and encode them to utf-8
when calling the C function.
Patch attached.
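The attached patch makes this change in C. As a rough Python-level sketch of the same idea (not the patch itself), a wrapper could encode text paths to UTF-8 before handing them on. The helper name `connect_utf8` is hypothetical, and the sketch assumes a Python 3 `sqlite3.connect()` that accepts a bytes path:

```python
import sqlite3


def connect_utf8(database, **kwargs):
    """Hypothetical helper: encode a text path to UTF-8, then connect."""
    if isinstance(database, str):
        # SQLite expects UTF-8 filenames regardless of the filesystem encoding.
        database = database.encode('utf-8')
    return sqlite3.connect(database, **kwargs)
```

A call such as `connect_utf8('foo\xe9\\my.db')` would then pass UTF-8 bytes to the underlying open call, which is what the explicit `.encode('utf-8')` workaround earlier in this issue does by hand.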
msg62639 - Author: Virgil Dupras (vdupras) (Python triager) Date: 2008-02-21 18:45
Shouldn't we apply this patch directly on pysqlite? Any change made to 
the sqlite3 module will be overwritten in the next "refresh", right? 
Anyway, I'm not 100% sure, but it might already be fixed:

http://www.initd.org/tracker/pysqlite/changeset/452

So, maybe create a ticket to use the latest version of pysqlite?
msg62663 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-02-21 21:42
> So, maybe create a ticket to use the latest version of pysqlite?
I don't think so. While the change you are pointing to does deal with
the same problem, the base versions seem to have substantial differences.
I still prefer my patch.
msg63078 - Author: Gerhard Häring (ghaering) * (Python committer) Date: 2008-02-27 20:10
I'll assign this one to me. The "sqlite3" module cannot be just
"refreshed" with the externally maintained pysqlite, I'll have to do
merging anyway. Don't worry here ;-)
msg63095 - Author: Virgil Dupras (vdupras) (Python triager) Date: 2008-02-28 15:56
Ok then, we need a test for this. Patch attached.

However, I don't know if I applied Amaury's patch wrong or if I missed a 
./configure option or something, but even with the patch, the test fails.

Another thing: Why isn't the sqlite3 test suite a part of python's 
regression test suite? Or is it and I didn't notice?
msg63111 - Author: Gerhard Häring (ghaering) * (Python committer) Date: 2008-02-28 22:55
I didn't try the patch out yet. But I'd try to just open
u":memory:" instead.

Also, in Lib/test/test_sqlite.py the sqlite tests are started. They are
of course run as part of the Python test suite.
msg63127 - Author: Virgil Dupras (vdupras) (Python triager) Date: 2008-02-29 07:18
u':memory:'? That already worked before the patch because the implicit 
encoding with 'ascii' does not bump into any non-ascii character. Nope, 
one has to call connect with a filename containing non-ascii characters.
msg63249 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2008-03-04 09:21
Sorry, my patch corrected only sqlite3.Connection().
I attach a new version, which also changes sqlite3.connect().

By the way, the test should test that the file is created with the
correct name. A call to os.stat() could be enough.
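A check along those lines might look like the following. This is only a sketch for a modern Python 3, not the attached sqlite_connect_test.diff; the function name and the table `t` are made up for illustration:

```python
import os
import sqlite3
import tempfile


def check_non_ascii_database_name():
    """Create a database under a non-ASCII directory and stat the file."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = os.path.join(tmp, 'foo\xe9')
        os.mkdir(directory)
        path = os.path.join(directory, 'my.db')
        con = sqlite3.connect(path)
        try:
            # Write something so the database file is non-empty on disk.
            con.execute('CREATE TABLE t (x)')
            con.commit()
        finally:
            con.close()
        # os.stat() raises OSError if no file exists at this path, for
        # example because it was created under a mis-encoded name.
        return os.stat(path).st_size
```

The function returns the size of the created file, so a caller can assert that it is greater than zero.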
msg66209 - Author: Gerhard Häring (ghaering) * (Python committer) Date: 2008-05-04 13:54
The implementation in SVN should by now behave as you expect.
Look for database_utf8 = PyUnicode_AsUTF8String(database); in
connection.c to see the implementation.
msg71930 - Author: Gerhard Häring (ghaering) * (Python committer) Date: 2008-08-25 14:45
Can we close this now? Did you try out a Python 2.6 or Python 3.0 beta?
msg73952 - Author: A.M. Kuchling (akuchling) * (Python committer) Date: 2008-09-28 00:26
vdupras's test case now passes with Python 2.6; we should apply 
the patch to the test suite, though.  We could ask Barry if he wants to
apply it to 2.6rc, or adding the test can wait until 2.7.
msg86485 - Author: Jeroen Ruigrok van der Werven (asmodai) * (Python committer) Date: 2009-04-25 12:07
What do we want to do with this one? It now seems out of
scope for documentation, given the changes Gerhard implemented.
msg86523 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2009-04-25 15:01
Looks like that's the case.
History
Date User Action Args
2009-04-25 15:01:49 georg.brandl set status: open -> closed
resolution: fixed
messages: + msg86523
2009-04-25 12:07:52 asmodai set nosy: + asmodai
messages: + msg86485
2008-09-28 00:26:30 akuchling set nosy: + akuchling
messages: + msg73952
2008-08-25 14:45:06 ghaering set messages: + msg71930
2008-05-04 13:54:29 ghaering set messages: + msg66209
2008-03-04 09:22:02 amaury.forgeotdarc set files: - sqlite_connect.diff
2008-03-04 09:21:52 amaury.forgeotdarc set files: + sqlite_connect2.diff
messages: + msg63249
2008-02-29 07:18:32 vdupras set messages: + msg63127
2008-02-28 22:55:28 ghaering set messages: + msg63111
2008-02-28 15:57:00 vdupras set files: + sqlite_connect_test.diff
keywords: + patch
messages: + msg63095
2008-02-27 20:10:02 ghaering set severity: normal -> minor
messages: + msg63078
priority: low
assignee: ghaering
keywords: + easy
type: enhancement
2008-02-27 20:05:20 ghaering set nosy: + ghaering
2008-02-21 21:42:37 amaury.forgeotdarc set messages: + msg62663
2008-02-21 18:45:12 vdupras set messages: + msg62639
2008-02-21 09:16:17 amaury.forgeotdarc set files: + sqlite_connect.diff
nosy: + amaury.forgeotdarc
messages: + msg62611
2008-02-21 02:10:32 benjamin.peterson set nosy: + georg.brandl
2008-02-19 11:53:45 vdupras set nosy: + vdupras
messages: + msg62558
2008-02-16 14:08:47 resolve1 create