msg108313 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-06-21 20:15 |
>>> s = pickle.dumps(timezone.utc)
>>> pickle.loads(s)
Traceback (most recent call last):
..
TypeError: ("Required argument 'offset' (pos 1) not found", <class 'datetime.timezone'>, ())
|
msg108408 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-06-22 18:39 |
Python version is fixed in sandbox and committed in r82154.
|
msg108492 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-06-23 22:20 |
Committed in r82184. Leaving the issue open pending a more thorough review of pickling in datetime module.
|
msg108608 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-06-25 15:55 |
The datetime module provides compact pickled representation for date, datetime, time and timedelta instances:
type: size
date: 34
datetime: 44
time: 36
timedelta: 37
On the other hand, current pickle size for timezone is 64 and the size of an aware datetime instance is 105 bytes.
Since stability is important for pickle format, the best format should be developed before the first release to include timezone class.
|
msg108611 - (view) |
Author: Fred Drake (fdrake)  |
Date: 2010-06-25 16:56 |
As part of this, we should ensure references to common timezones, like
UTC, only create references to a single instance rather than filling
memory with multiple instances.
One consequence of this is that shared instances should probably be immutable.
|
msg108619 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2010-06-25 19:29 |
I am attaching a python prototype implementing interned UTC instance pickling. The patch is against sandbox revision r82218 of datetime.py.
Note that the pickling protocol requires that an instance or factory function is defined at the module level.
The pickle size saving is substantial:
>>> len(dumps(datetime.now(timezone.utc)))
61
>>> len(dumps(datetime.now(timezone.min)))
163
but there is still room for improvement:
>>> len(dumps(datetime.now()))
44
I do feel, however, that further improvements will see diminishing returns.
|
msg221920 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2014-06-29 23:53 |
Your latest patch doesn't have a review link. Would you like to regenerate it against the latest default?
|
msg221933 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2014-06-30 01:26 |
I updated the patch. Please note that it only includes python code, so you need to suppress _datetime acceleration to test:
>>> import sys
>>> sys.modules['_datetime'] = None
>>> from pickle import *
>>> from datetime import *
>>> dumps(timezone.utc)
b'\x80\x03cdatetime\n_utc\nq\x00.'
|
msg221934 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2014-06-30 01:29 |
Hmm. Still no "review" link. Is there something special that I need to do to get it? Use different name, maybe?
|
msg221935 - (view) |
Author: Ned Deily (ned.deily) *  |
Date: 2014-06-30 01:33 |
FWIW, I see review links for both of your files: all the way over to the right in the Files section.
|
msg221936 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2014-06-30 01:37 |
> I see review links for both of your files
Now I do too. I guess they just take time to appear. Note that I unlinked the file that Antoine complained about.
|
msg251342 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-09-22 19:16 |
For stability it is better to use public name 'timezone.utc' instead of '_utc'. Dotted names now are supported with all protocols.
> but there is still room for improvement:
Don't worry about this. When you pickle a number of datetime objects with the same timezone, the timezone is pickled only once and for all other datetime objects only a reference is used.
|
msg251668 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2015-09-26 22:54 |
> For stability it is better to use public name 'timezone.utc' instead of '_utc'.
Can you elaborate on the "stability" consideration?
I would like to revisit this issue since we will have some changes made to datetime pickles in the context of PEP 495.
|
msg251677 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-09-27 03:12 |
For now "_utc" is not a part of API. When use it in pickling, you implicitly make it a part of API, and should support it forever (and force other Python implementations to support it). As far as it looks as implementation detail, there is a risk that the "_utc" variable can be renamed and this will break pickle compatibility.
|
msg251812 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2015-09-29 02:23 |
> there is a risk that the "_utc" variable can be renamed
> and this will break pickle compatibility.
I think we will still need to maintain _utc global indefinitely to keep the old pickles readable.
On the other hand, it looks like support for qualnames is only available with pickle protocol 4, so I don't see any downside from from storing the full qualname other than an 8-byte increase in the size of the pickle.
I guess my position on this will be +0: I'll accept a patch if someone will submit it, but I am unlikely to do it myself.
|
msg251814 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-09-29 02:58 |
> I think we will still need to maintain _utc global indefinitely to keep the old pickles readable.
We have no need to maintain _utc global indefinitely if don't add it.
> On the other hand, it looks like support for qualnames is only available with pickle protocol 4,
No, support for qualnames now is available with all pickle protocols.
|
msg253402 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-10-24 09:58 |
Following patch adds tests for timezone pickling and copying. Note, that timezone.utc identity is preserved.
|
msg254721 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2015-11-16 09:21 |
New changeset e48da8f01316 by Serhiy Storchaka in branch '3.4':
Issue #9051: Added tests for pickling and copying the timezone objects.
https://hg.python.org/cpython/rev/e48da8f01316
New changeset 634905e9628d by Serhiy Storchaka in branch '3.5':
Issue #9051: Added tests for pickling and copying the timezone objects.
https://hg.python.org/cpython/rev/634905e9628d
New changeset 9c31544d6a66 by Serhiy Storchaka in branch 'default':
Issue #9051: Added tests for pickling and copying the timezone objects.
https://hg.python.org/cpython/rev/9c31544d6a66
|
msg254724 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2015-11-16 09:53 |
>>> import pickle, pickletools, datetime
>>> len(pickle.dumps([datetime.datetime.utcfromtimestamp(i) for i in range(10000)], 3)) / 10000
30.9283
>>> len(pickle.dumps([datetime.datetime.fromtimestamp(i, datetime.timezone.utc) for i in range(10000)], 3)) / 10000
32.936
>>> len(pickle.dumps([datetime.datetime.utcfromtimestamp(i) for i in range(10000)], 4)) / 10000
19.0074
>>> len(pickle.dumps([datetime.datetime.fromtimestamp(i, datetime.timezone.utc) for i in range(10000)], 4)) / 10000
21.0144
In best case the UTC timezone costs only 2 bytes per datetime instance (BINGET + 1-byte index). In worst case it can cost 5 bytes (LONG_BINGET + 4-bytes index). Saving few bytes for single timezone instance has significant effect only for small pickled data. But in this case it is more advantageous to save a datetime just as a timestamp. I suggest to close this issue.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:57:02 | admin | set | github: 53297 |
2017-03-06 17:30:07 | belopolsky | set | status: pending -> closed resolution: rejected stage: patch review -> resolved |
2017-03-06 13:17:56 | serhiy.storchaka | set | status: open -> pending |
2015-11-16 09:54:00 | serhiy.storchaka | set | messages:
+ msg254724 |
2015-11-16 09:21:18 | python-dev | set | nosy:
+ python-dev messages:
+ msg254721
|
2015-10-24 09:58:03 | serhiy.storchaka | set | files:
+ timezone_test_pickle.patch
messages:
+ msg253402 stage: needs patch -> patch review |
2015-09-29 02:58:17 | serhiy.storchaka | set | messages:
+ msg251814 |
2015-09-29 02:24:12 | belopolsky | set | nosy:
+ tim.peters
|
2015-09-29 02:23:56 | belopolsky | set | messages:
+ msg251812 |
2015-09-27 03:12:50 | serhiy.storchaka | set | messages:
+ msg251677 |
2015-09-26 22:54:31 | belopolsky | set | messages:
+ msg251668 |
2015-09-22 19:16:14 | serhiy.storchaka | set | versions:
+ Python 3.6, - Python 3.5 nosy:
+ serhiy.storchaka
messages:
+ msg251342
type: behavior -> enhancement |
2014-07-01 05:35:24 | rhettinger | set | title: Improve pickle format for aware datetime instances -> Improve pickle format for timezone aware datetime instances |
2014-06-30 01:37:16 | belopolsky | set | messages:
+ msg221936 |
2014-06-30 01:33:55 | ned.deily | set | nosy:
+ ned.deily messages:
+ msg221935
|
2014-06-30 01:29:39 | belopolsky | set | versions:
+ Python 3.5, - Python 3.3 |
2014-06-30 01:29:20 | belopolsky | set | messages:
+ msg221934 |
2014-06-30 01:27:52 | belopolsky | set | files:
- issue9051-utc-pickle-proto.diff |
2014-06-30 01:26:54 | belopolsky | set | files:
+ issue9051-utc-pickle-proto.diff
messages:
+ msg221933 |
2014-06-29 23:53:43 | pitrou | set | messages:
+ msg221920 |
2014-06-29 20:08:53 | belopolsky | set | nosy:
+ pitrou, vstinner, alexandre.vassalotti
|
2011-01-11 15:16:45 | belopolsky | set | nosy:
fdrake, mark.dickinson, belopolsky versions:
+ Python 3.3, - Python 3.2 |
2010-07-28 18:01:38 | belopolsky | set | dependencies:
+ Intern UTC timezone |
2010-06-25 19:29:17 | belopolsky | set | files:
+ issue9051-utc-pickle-proto.diff keywords:
+ patch messages:
+ msg108619
|
2010-06-25 16:56:49 | fdrake | set | messages:
+ msg108611 |
2010-06-25 15:55:07 | belopolsky | set | priority: low -> normal
nosy:
+ fdrake title: Cannot pickle timezone instances -> Improve pickle format for aware datetime instances messages:
+ msg108608
stage: patch review -> needs patch |
2010-06-23 22:20:32 | belopolsky | set | priority: normal -> low keywords:
- patch, easy messages:
+ msg108492
|
2010-06-22 18:39:06 | belopolsky | set | messages:
+ msg108408 |
2010-06-22 03:26:51 | belopolsky | set | files:
+ issue9051.diff |
2010-06-22 03:26:31 | belopolsky | set | files:
- issue8455.diff |
2010-06-22 03:26:11 | belopolsky | set | keywords:
+ easy, patch nosy:
+ mark.dickinson
files:
+ issue8455.diff stage: test needed -> patch review |
2010-06-21 20:15:07 | belopolsky | create | |