msg284205 - (view) |
Author: Bozo Kopic (bozo.kopic) * |
Date: 2016-12-28 23:04 |
Current convert_timestamp function raises exception when parsing timestamps that contain timezone information and have microsecond set to zero.
Patch for this behavior is included.
Example script that demonstrates this problem:
import sys
import datetime
import sqlite3
import traceback
import contextlib
def main():
db = sqlite3.connect('test.db', detect_types=sqlite3.PARSE_DECLTYPES)
db.executescript("""
CREATE TABLE IF NOT EXISTS test (
timestamp TIMESTAMP)
""")
t = datetime.datetime.now(tz=datetime.timezone.utc)
t = t.replace(microsecond=0)
db.executemany("INSERT INTO test VALUES (:timestamp)", [{'timestamp': t}])
db.commit()
with contextlib.closing(db.cursor()) as c:
try:
c.execute('SELECT * FROM test')
c.fetchall()
print('original implementation success')
except Exception as e:
print('original implementation failed')
traceback.print_exc()
c.close()
sqlite3.register_converter("timestamp", _sqlite_convert_timestamp)
with contextlib.closing(db.cursor()) as c:
try:
c.execute('SELECT * FROM test')
c.fetchall()
print('patched implementation success')
except Exception as e:
print('patched implementation failed')
traceback.print_exc()
c.close()
def _sqlite_convert_timestamp(val):
datepart, timepart = val.split(b" ")
# this is the patch
timepart = timepart.split(b'+', 1)[0].split(b'-', 1)[0]
year, month, day = map(int, datepart.split(b"-"))
timepart_full = timepart.split(b".")
hours, minutes, seconds = map(int, timepart_full[0].split(b":"))
if len(timepart_full) == 2:
microseconds = int('{:0<6.6}'.format(timepart_full[1].decode()))
else:
microseconds = 0
val = datetime.datetime(year, month, day, hours, minutes, seconds,
microseconds)
return val
if __name__ == '__main__':
sys.exit(main())
|
msg284231 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2016-12-29 07:07 |
Thanks for your report Bozo.
I think this is due to timestamp converter doesn't handle timezone now. Besides this ValueError, if I understand correctly, even if you pass a datetime object with tzinfo, with microsecond set, the returned datetime object is a wrong one since the tzinfo is discarded.
|
msg284240 - (view) |
Author: Bozo Kopic (bozo.kopic) * |
Date: 2016-12-29 09:56 |
Yes, that is correct. Timezone information is discarded. Submitted patch only resolves occurrence of ValueError.
Is additional patch that enables parsing of timezone required?
|
msg284241 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2016-12-29 10:06 |
I would prefer add the support for timezone since the current behaviour seems not correct to me. If you'd like to work on it, don't forget to add a test case and sign the CLA.
But I am not sure this should be treated as a bug or new feature.
|
msg284243 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-12-29 10:21 |
Could you provide a unittest?
|
msg284245 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2016-12-29 10:38 |
import sqlite3, datetime
c = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES)
cur = c.cursor()
cur.execute('create table test(t timestamp)')
t = datetime.datetime.now(tz=datetime.timezone.utc)
cur.execute("insert into test(t) values (?)", (t,))
cur.execute('select t from test')
l = cur.fetchone()[0]
t == l # the result not equal to the original one
|
msg284252 - (view) |
Author: Bozo Kopic (bozo.kopic) * |
Date: 2016-12-29 12:31 |
I'm providing new patch that adds support for parsing timezone information.
Tests are included.
|
msg284291 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2016-12-29 19:04 |
In general the patch LGTM. Added comments on Rietveld.
|
msg284318 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2016-12-30 06:37 |
LGTM generally. :-)
|
msg285211 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-11 10:38 |
LGTM except that arguments of assertEqual() should be swapped.
|
msg285227 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-11 14:52 |
> LGTM except that arguments of assertEqual() should be swapped.
Thanks Serhiy! I'll take care of that. One problem remained is what to do with 2.7. There is no timezone object in 2.7. My preference is just pointing out the limitation that only naive datetime object is supported in the doc.
|
msg285229 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-11 15:29 |
Agreed. And maybe raise more informative error for datetimes with a timezone.
I'm also not sure about 3.5 and 3.6. Datetimes with timezones were never supported. Currently all returned ditetime objects are naive. Getting an aware datetime can confuse applications.
|
msg285230 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2017-01-11 15:50 |
I don't think this can be considered a bug fix, since it changes behavior for applications that read data from SQLite databases which were not created by Python.
Those application may now see datetime values with tz infos and will likely not be prepared to handle all the problems associated with this.
Is it possible to make the support optional for 3.5 and 3.6 and only enable it for 3.7 (with the possibility of disabling it again) ?
|
msg285231 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-11 16:04 |
I asked whether this should be treated as an enhancement or bug fix. Now with your concerns I think it fits as enhancement more.
> Is it possible to make the support optional for 3.5 and 3.6 and only enable it for 3.7 (with the possibility of disabling it again) ?
How about just make it just into 3.7 and just document only naive datetime objects are supported in 3.5, 3.6 and 2.7.
I am not sure it's worth to make it even optional in 3.7. Discarding tzinfo sounds weird, and worse, the behaviour is not deterministic, it could also fail with an exception.
|
msg285232 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-11 16:08 |
The naive datetime converter is registered under the name "timestamp". The aware datetime converter or the universal datetime converter (if it is needed) can be registered under different names.
|
msg285243 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2017-01-11 17:01 |
On 11.01.2017 17:08, Serhiy Storchaka wrote:
>
> The naive datetime converter is registered under the name "timestamp". The aware datetime converter or the universal datetime converter (if it is needed) can be registered under different names.
This sounds like a good idea.
Perhaps use "timestamptz" or something similar.
|
msg285245 - (view) |
Author: Marc-Andre Lemburg (lemburg) * |
Date: 2017-01-11 17:16 |
On 11.01.2017 17:04, Xiang Zhang wrote:
> I am not sure it's worth to make it even optional in 3.7. Discarding tzinfo sounds weird, and worse, the behaviour is not deterministic, it could also fail with an exception.
Best practice is to store all date/time values using UTC (or some
other fixed timezone) in databases and to manage the timezone/locale
information elsewhere.
The TZ info then becomes redundant and only results in more
database space being used as well as slower queries (due to the extra
TZ calculations being done; many databases internally store the
values as UTC to avoid having to do TZ calculations at query time).
Of course, there are use cases, where you'd still want to work
with TZ values in the database, so an extra converter for this
sounds like a good plan.
|
msg285287 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-12 05:45 |
timestamptz.patch implements a new converter that's able to convert aware datetime objects.
|
msg285288 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-12 06:06 |
I think the timestamptz converter should either interpret strings without timezone as UTC (and perhaps understand the "Z" suffix as sqlite3 date() function) or raises an error. It should never return naive datetime.
The timestamp converter needs better error reporting when get an input with a timezone.
|
msg285289 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-12 06:24 |
> I think the timestamptz converter should either interpret strings without timezone as UTC (and perhaps understand the "Z" suffix as sqlite3 date() function) or raises an error. It should never return naive datetime.
Currently timestamptz just convert back what the user passed in, no matter naive or aware objects. What to do with them is left to the app. If we raise an error, could users use naive and aware objects together? And interpreting strings without timezone as UTC seems will mistranslate the object. For example, pass in datetime.datetime.now() and translate it back as UTC may not be right.
> The timestamp converter needs better error reporting when get an input with a timezone.
I thought about it but our intention to introduce a new converter is not to break old code. Doesn't add error reporting violate the intention? Users' code may not catch the error now.
|
msg285291 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-12 07:28 |
Naive and aware objects should not be mixed. Their actually are different types. The converter doesn't return str or bytes when can't parse an input as a datetime, and it shouldn't return a naive datetime if can't find a timezone.
SQLite date and time functions imply UTC if timezone is omitted. This is a reason for returning aware UTC datetime objects. On other side, Python is more powerful programming language, it distinguish naive and aware datetime objects, and it is unlikely that these two types are mixed in one database column. It is better to raise an error that silently return possible wrong result. It exceptional case user can write special converter or just call SQLite datetime() for unifying data format.
I think the old code unlikely will be broken if preserve an exception type and don't change conditions for raising an error. New error message can contain full input string and suggest to use the timestamptz converter if it looks as a datetime with timezone.
|
msg285371 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-13 09:19 |
> I think the old code unlikely will be broken if preserve an exception type and don't change conditions for raising an error.
Currently there could be no error when the object gets a timezone. The timezone is simply discarded and only when microseconds comes to 0 there is a ValueError. I don't think the user code would prepare to catch the ValueError. I don't oppose make timestamp more strict but just not sure if it's suitable.
|
msg285372 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-13 09:23 |
timestamptz-2.patch make timestamp and timestamptz specilized for their own objects.
|
msg285375 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-13 10:21 |
I think ProgrammingError is not correct type of an exception. It is used for programming errors such as using uninitialized or closed objects or supplying incorrect number of bindings. But this error can be caused by invalid data in a database created not by Python. Currently converters raise ValueError or TypeError for invalid data. ValueError is raised for datetime without microseconds but with timezone offset. I think ValueError is more appropriate type for such errors.
Perhaps even TypeError should be converted to ValueError.
try:
# parsing date or datetime
except (ValueError, TypeError):
raise ValueError(...)
|
msg285377 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-13 10:28 |
> Perhaps even TypeError should be converted to ValueError.
But this is different issue of course.
|
msg285381 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-13 10:50 |
I am okay with ValueError(actually I use it in the patch originally) but I am not in favour of catching the errors of the parser. The parser raises errors due to invalid data, not timezone. I think propagate it to users could make the reason more obvious.
|
msg285385 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-13 11:23 |
timestamptz-3.patch uses ValueError instead of ProgrammingError and treat suffix Z as UTC timezone.
|
msg285985 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-22 04:04 |
Ping for review for timestamptz-3.patch.
|
msg285989 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-22 06:18 |
Added few minor comments on Rietveld. Technically the patch LGTM. But the behavior change of the timestamp converter for input without timezone offset perhaps needs wider discussion. We have several options:
0. Current behavior. Silently drop the timezone if there are microseconds, and raise an error when there are no microseconds. This is bad for many reasons, but some user code can depend on this.
1. Just drop the timezone, don't raise an error when there are no microseconds. The converter will become returning incorrect result instead of just failing on unexpected input. This will fix the user code that is aware of this, but currently fails when input doesn't have microseconds.
2. Return aware datetime for input with timezone offset (as in original patch). Returning aware or naive datetime depending on input can break a user code that is not prepared for this.
3. Raise an error for input with timezone offset, always return naive datetime objects from this converter. This can break the user code that depends on current behavior and works with the input containing the same timezone offsets.
Any option can break some code. I prefer option 3, but want to hear thoughts of other core developers. Maybe discuss this on Python-Dev?
|
msg285990 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-22 06:24 |
> Maybe discuss this on Python-Dev?
It's fine. Could you compose a mail instead of me? I am not good at that. :-(
|
msg285992 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2017-01-22 06:32 |
I'm even worse at that. :-(
|
msg285993 - (view) |
Author: Xiang Zhang (xiang.zhang) * |
Date: 2017-01-22 06:33 |
> I'm even worse at that. :-(
LoL. Okay I'll do that. :-)
|
msg377068 - (view) |
Author: Hassanbot (hassanbot) |
Date: 2020-09-17 21:51 |
This still isn't fixed as of 3.8 (or in master I think).
I can understand why you wouldn't want to allow serializing and deserializing time zones, since tzinfo objects cannot be accurately serialized with a simple UTC offset, but you should at least get an error when trying to insert an aware object. Anything is better than it is now, where you get no warning or error when inserting the object, and get a hard to interpret error ("invalid literal for int() with base 10") when trying to retrieve it.
For deserialization, the datetime class now (since 3.7) includes a fromisoformat() method that could be used as a counterpart to the isoformat() method used when serializing. At least it would be consistent then.
|
msg377090 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2020-09-18 07:14 |
Xiang Zhang, was there a discussion on Python-Dev?
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:41 | admin | set | github: 73285 |
2020-09-18 07:14:50 | serhiy.storchaka | set | messages:
+ msg377090 |
2020-09-17 21:51:20 | hassanbot | set | nosy:
+ hassanbot
messages:
+ msg377068 versions:
+ Python 3.8, - Python 3.7 |
2017-01-22 06:33:59 | xiang.zhang | set | messages:
+ msg285993 |
2017-01-22 06:32:20 | serhiy.storchaka | set | messages:
+ msg285992 |
2017-01-22 06:24:38 | xiang.zhang | set | messages:
+ msg285990 |
2017-01-22 06:18:23 | serhiy.storchaka | set | messages:
+ msg285989 |
2017-01-22 04:04:44 | xiang.zhang | set | messages:
+ msg285985 |
2017-01-13 11:23:12 | xiang.zhang | set | files:
+ timestamptz-3.patch
messages:
+ msg285385 |
2017-01-13 10:50:52 | xiang.zhang | set | messages:
+ msg285381 |
2017-01-13 10:28:31 | serhiy.storchaka | set | messages:
+ msg285377 |
2017-01-13 10:21:51 | serhiy.storchaka | set | messages:
+ msg285375 |
2017-01-13 09:23:59 | xiang.zhang | set | files:
+ timestamptz-2.patch
messages:
+ msg285372 |
2017-01-13 09:19:45 | xiang.zhang | set | messages:
+ msg285371 |
2017-01-12 07:28:40 | serhiy.storchaka | set | messages:
+ msg285291 |
2017-01-12 06:24:39 | xiang.zhang | set | messages:
+ msg285289 |
2017-01-12 06:06:22 | serhiy.storchaka | set | messages:
+ msg285288 |
2017-01-12 05:45:38 | xiang.zhang | set | files:
+ timestamptz.patch
messages:
+ msg285287 stage: needs patch -> patch review |
2017-01-11 17:16:25 | lemburg | set | messages:
+ msg285245 |
2017-01-11 17:06:28 | serhiy.storchaka | set | stage: commit review -> needs patch type: behavior -> enhancement versions:
- Python 3.5, Python 3.6 |
2017-01-11 17:01:32 | lemburg | set | messages:
+ msg285243 |
2017-01-11 16:08:15 | serhiy.storchaka | set | messages:
+ msg285232 |
2017-01-11 16:04:29 | xiang.zhang | set | messages:
+ msg285231 |
2017-01-11 15:50:16 | lemburg | set | messages:
+ msg285230 |
2017-01-11 15:29:35 | serhiy.storchaka | set | messages:
+ msg285229 |
2017-01-11 14:52:40 | xiang.zhang | set | messages:
+ msg285227 |
2017-01-11 10:38:56 | serhiy.storchaka | set | messages:
+ msg285211 stage: patch review -> commit review |
2017-01-11 09:21:51 | xiang.zhang | set | files:
+ sqlite3-3.patch |
2016-12-30 06:37:21 | xiang.zhang | set | messages:
+ msg284318 |
2016-12-29 19:04:43 | serhiy.storchaka | set | messages:
+ msg284291 |
2016-12-29 18:21:21 | serhiy.storchaka | set | files:
+ sqlite3-2.patch stage: test needed -> patch review |
2016-12-29 17:49:15 | bozo.kopic | set | files:
+ sqlite3-2.patch |
2016-12-29 17:49:05 | bozo.kopic | set | files:
- sqlite3-2.patch |
2016-12-29 12:36:00 | bozo.kopic | set | files:
+ sqlite3-2.patch |
2016-12-29 12:34:52 | bozo.kopic | set | files:
- sqlite3-2.patch |
2016-12-29 12:31:59 | bozo.kopic | set | files:
+ sqlite3-2.patch
messages:
+ msg284252 |
2016-12-29 10:38:36 | xiang.zhang | set | messages:
+ msg284245 |
2016-12-29 10:21:03 | serhiy.storchaka | set | nosy:
+ lemburg, ghaering, belopolsky
messages:
+ msg284243 stage: test needed |
2016-12-29 10:06:56 | xiang.zhang | set | nosy:
+ serhiy.storchaka messages:
+ msg284241
|
2016-12-29 09:56:03 | bozo.kopic | set | messages:
+ msg284240 |
2016-12-29 07:07:00 | xiang.zhang | set | title: sqlite3 timestamp converter ValueError -> sqlite3 timestamp converter cannot handle timezone nosy:
+ xiang.zhang
messages:
+ msg284231
type: crash -> behavior |
2016-12-28 23:04:29 | bozo.kopic | create | |