classification
Title: Improve pickle format for aware datetime instances
Type: behavior Stage: needs patch
Components: Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Intern UTC timezone
View: 9183
Superseder:
Assigned To: belopolsky Nosy List: belopolsky, fdrake, mark.dickinson
Priority: normal Keywords: patch

Created on 2010-06-21 20:15 by belopolsky, last changed 2011-01-11 15:16 by belopolsky.

Files
File name Uploaded Description
issue9051.diff belopolsky, 2010-06-22 03:26
issue9051-utc-pickle-proto.diff belopolsky, 2010-06-25 19:29
Messages (6)
msg108313 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2010-06-21 20:15
>>> import pickle
>>> from datetime import timezone
>>> s = pickle.dumps(timezone.utc)
>>> pickle.loads(s)
Traceback (most recent call last):
 ..
TypeError: ("Required argument 'offset' (pos 1) not found", <class 'datetime.timezone'>, ())
msg108408 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2010-06-22 18:39
The Python version is fixed in the sandbox and committed in r82154.
msg108492 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2010-06-23 22:20
Committed in r82184. Leaving the issue open pending a more thorough review of pickling in the datetime module.
msg108608 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2010-06-25 15:55
The datetime module provides a compact pickled representation for date, datetime, time and timedelta instances:

type: size
date: 34
datetime: 44
time: 36
timedelta: 37


On the other hand, the current pickle size for a timezone instance is 64 bytes, and the size of an aware datetime instance is 105 bytes.
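
These figures depend on the interpreter version and the pickle protocol, so they are easiest to compare by measuring them directly. The following is a minimal sketch, assuming the default protocol and arbitrary sample values; the helper name pickle_size is illustrative and not part of the datetime module:

```python
import pickle
from datetime import date, datetime, time, timedelta, timezone


def pickle_size(obj):
    """Return the length in bytes of obj's pickle at the default protocol."""
    return len(pickle.dumps(obj))


# Arbitrary sample values. The exact sizes vary across Python versions
# and pickle protocols, so no expected output is shown here.
eastern = timezone(timedelta(hours=-5))
samples = {
    "date": date(2010, 6, 25),
    "datetime": datetime(2010, 6, 25, 15, 55),
    "time": time(15, 55),
    "timedelta": timedelta(hours=1),
    "timezone": eastern,
    "aware datetime": datetime(2010, 6, 25, 15, 55, tzinfo=eastern),
}

sizes = {name: pickle_size(value) for name, value in samples.items()}

if __name__ == "__main__":
    print("type: size")
    for name, size in sizes.items():
        print(f"{name}: {size}")
```

Passing an explicit protocol to pickle.dumps would make runs on different interpreters easier to compare.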

Since stability is important for the pickle format, the best format should be developed before the first release that includes the timezone class.
msg108611 - Author: Fred L. Drake, Jr. (fdrake) (Python committer) Date: 2010-06-25 16:56
As part of this, we should ensure references to common timezones, like
UTC, only create references to a single instance rather than filling
memory with multiple instances.

One consequence of this is that shared instances should probably be immutable.
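
One way to picture the sharing suggested here is a constructor that hands back a cached instance per offset and refuses attribute assignment. The class InternedOffset below is purely hypothetical and is not the datetime.timezone implementation; it is only a sketch under those assumptions:

```python
from datetime import datetime, timedelta, tzinfo


class InternedOffset(tzinfo):
    """Hypothetical fixed-offset tzinfo with one shared instance per offset."""

    _cache = {}  # maps offset -> the single instance created for it

    def __new__(cls, offset=timedelta(0)):
        instance = cls._cache.get(offset)
        if instance is None:
            instance = super().__new__(cls)
            # Bypass our own __setattr__, which rejects every assignment.
            object.__setattr__(instance, "_offset", offset)
            cls._cache[offset] = instance
        return instance

    def __setattr__(self, name, value):
        # A shared instance must not change underneath its other users.
        raise AttributeError("InternedOffset instances are immutable")

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return None


utc_a = InternedOffset(timedelta(0))
utc_b = InternedOffset(timedelta(0))
same_object = utc_a is utc_b  # both names refer to the one cached instance
stamp = datetime(2010, 6, 25, 16, 56, tzinfo=utc_a)
```

The cache here grows with every distinct offset requested; interning only a few well-known zones such as UTC would avoid that, at the cost of a special case in the constructor.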
msg108619 - Author: Alexander Belopolsky (belopolsky) (Python committer) Date: 2010-06-25 19:29
I am attaching a Python prototype implementing interned UTC instance pickling. The patch is against sandbox revision r82218 of datetime.py.

Note that the pickling protocol requires that an instance or factory function is defined at the module level.

The pickle size saving is substantial:


>>> from pickle import dumps
>>> from datetime import datetime, timezone
>>> len(dumps(datetime.now(timezone.utc)))
61
>>> len(dumps(datetime.now(timezone.min)))
163

but there is still room for improvement:

>>> len(dumps(datetime.now()))
44

I do feel, however, that further improvements will see diminishing returns.
History
Date User Action Args
2011-01-11 15:16:45 belopolsky set nosy: fdrake, mark.dickinson, belopolsky
    versions: + Python 3.3, - Python 3.2
2010-07-28 18:01:38 belopolsky set dependencies: + Intern UTC timezone
2010-06-25 19:29:17 belopolsky set files: + issue9051-utc-pickle-proto.diff
    keywords: + patch
    messages: + msg108619
2010-06-25 16:56:49 fdrake set messages: + msg108611
2010-06-25 15:55:07 belopolsky set priority: low -> normal
    nosy: + fdrake
    title: Cannot pickle timezone instances -> Improve pickle format for aware datetime instances
    messages: + msg108608
    stage: patch review -> needs patch
2010-06-23 22:20:32 belopolsky set priority: normal -> low
    keywords: - patch, easy
    messages: + msg108492
2010-06-22 18:39:06 belopolsky set messages: + msg108408
2010-06-22 03:26:51 belopolsky set files: + issue9051.diff
2010-06-22 03:26:31 belopolsky set files: - issue8455.diff
2010-06-22 03:26:11 belopolsky set keywords: + easy, patch
    nosy: + mark.dickinson
    files: + issue8455.diff
    stage: test needed -> patch review
2010-06-21 20:15:07 belopolsky create