classification
Title: Faster pickling of instances
Type: performance
Stage: resolved
Components: Library (Lib)
Versions: Python 3.2
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: pitrou
Nosy List: alexandre.vassalotti, belopolsky, jnoller, pitrou, python-dev, rhettinger
Priority: normal
Keywords: patch

Created on 2010-09-24 01:06 by pitrou, last changed 2011-03-11 20:32 by pitrou. This issue is now closed.

Files
File name Uploaded Description
pickleinst.patch pitrou, 2010-09-24 01:06
pickleinst2.patch pitrou, 2010-09-28 14:04
Messages (12)
msg117253 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-24 01:06
This patch is a set of assorted optimizations that make pickling instances of user-defined classes considerably faster (roughly 2.5x to 2.7x in the timeit runs below).

Example on a minimal instance:

$ python -m timeit -s "import pickle; import collections, __main__; __main__.X=type('X', (), {}); x=X()" "pickle.dumps(x)"
-> before: 100000 loops, best of 3: 8.11 usec per loop
-> after: 100000 loops, best of 3: 2.95 usec per loop

Example on a namedtuple:

$ python -m timeit -s "import pickle; import collections, __main__; __main__.X=collections.namedtuple('X', 'a'); x=X(5)" "pickle.dumps(x)"
-> before: 100000 loops, best of 3: 9.52 usec per loop
-> after: 100000 loops, best of 3: 3.78 usec per loop
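
For anyone who prefers a script to the one-liners above, here is a minimal sketch of the same two measurements using the `timeit` module directly. The helper names `publish` and `best_usec`, the class name `P`, and the loop counts are assumptions made for illustration; they are not part of the patch. As in the commands above, the classes are published on `__main__` so that pickle can find them by name.

```python
import collections
import pickle
import timeit

import __main__


def publish(cls):
    """Expose cls as __main__.<name> so pickle can look it up by reference."""
    cls.__module__ = "__main__"
    setattr(__main__, cls.__name__, cls)
    return cls


# The same two shapes as the command-line runs: a bare class and a namedtuple.
X = publish(type("X", (), {}))
P = publish(collections.namedtuple("P", "a"))


def best_usec(obj, number=20000, repeat=3):
    """Return the best per-call time of pickle.dumps(obj), in microseconds."""
    timer = timeit.Timer(lambda: pickle.dumps(obj))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


if __name__ == "__main__":
    for label, obj in (("minimal instance", X()), ("namedtuple", P(5))):
        print(f"{label}: {best_usec(obj):.2f} usec per dumps()")
```

The lambda adds a small constant call overhead to every measurement, so the absolute numbers will not match the command-line figures exactly; only the before/after comparison on the same machine is meaningful.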

Unladen Swallow's pickling benchmark:

### pickle ###
Min: 0.792903 -> 0.704288: 1.1258x faster
Avg: 0.796241 -> 0.706073: 1.1277x faster
Significant (t=39.374217)
Stddev: 0.00410 -> 0.00307: 1.3342x smaller
Timeline: http://tinyurl.com/38elzvv
msg117517 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-28 13:21
My patch breaks pickling of transparent proxies such as weakref.proxy(), since these have a different __class__ than Py_TYPE(self) (through tp_getattr hackery). I will need to remove a couple of optimizations.

(Unfortunately, there don't seem to be any tests for such a case; my initial patch breaks neither test_pickle nor test_pickletools.)
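
The mismatch between `__class__` and the real type can be seen from pure Python. This is only an illustrative sketch; the class name `Target` is made up for the example and does not come from the patch or its tests.

```python
import weakref


class Target:
    """Hypothetical user-defined class; any weak-referenceable class works."""


obj = Target()
proxy = weakref.proxy(obj)

# The concrete type of the object is the proxy type ...
print(type(proxy) is weakref.ProxyType)  # True
# ... but attribute lookups, including __class__, are forwarded to the referent.
print(proxy.__class__ is Target)         # True
```

Code that takes a shortcut through the concrete type instead of looking up `__class__` therefore sees the proxy type rather than `Target`.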
msg117522 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-28 14:04
Corrected patch, including new tests for pickling of weak proxies.
msg117982 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-04 21:22
Alexandre, do you have an opinion on this?
msg117987 - (view) Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2010-10-04 22:47
Sorry Antoine, I have been busy with school work lately.

I like the general idea and I will try to look at your patch ASAP.
msg119400 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-22 19:45
Committed in r85797.
msg119403 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-22 21:37
The commit broke the Windows buildbots because (un)pickling a TextIOWrapper now raises an exception:

>>> f = open("LICENSE")
>>> pickle.dumps(f)
b'\x80\x03c_io\nTextIOWrapper\nq\x00)\x81q\x01}q\x02X\x04\x00\x00\x00modeq\x03X\x01\x00\x00\x00rq\x04sb.'
>>> g = pickle.loads(pickle.dumps(f))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: '_io.TextIOWrapper' object has no attribute '__dict__'


It should be noted that it didn't work before either; no exception was raised, but the result was just nonsensical:

>>> f = open("LICENSE")
>>> pickle.dumps(f)
b'\x80\x03c_io\nTextIOWrapper\nq\x00)\x81q\x01.'
>>> g = pickle.loads(pickle.dumps(f))
>>> g
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on uninitialized object


The very fact that test_multiprocessing tries to pickle a file object is unfortunate, and is probably a bug in itself. test_multiprocessing is known for pickling lots of things, since it generally transfers a whole TestCase instance...
msg119404 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-22 22:02
The difference has to do with the result of __reduce__:

With the patch:

>>> open("LICENSE").__reduce_ex__(3)
(<function __newobj__ at 0x7fa392a0ff30>, (<class '_io.TextIOWrapper'>,), {'mode': 'r'}, None, None)

Without:

>>> open("LICENSE").__reduce_ex__(3)
(<function __newobj__ at 0x7f2cf2361a70>, (<class '_io.TextIOWrapper'>,), None, None, None)
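
To see where the third element of that tuple comes from, here is a small sketch on an ordinary user-defined class. The class name `Plain` and the attribute value are assumptions for illustration; the sketch only shows the general shape of the `__reduce_ex__` result for protocol 3, not the TextIOWrapper case itself.

```python
import copyreg


class Plain:
    """Hypothetical user-defined class with a regular instance __dict__."""


empty = Plain()
with_state = Plain()
with_state.mode = "r"

# Keep only the first three items: reconstructor, its arguments, and the state.
empty_reduced = empty.__reduce_ex__(3)[:3]
state_reduced = with_state.__reduce_ex__(3)[:3]

for func, args, state in (empty_reduced, state_reduced):
    print(func is copyreg.__newobj__, args, state)
```

When the state slot is not None, the unpickler has to apply it to the freshly created object, which is the step that fails for an object without a `__dict__`.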
msg119552 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-10-25 14:08
I doubt that Ask or I will have the time to rewrite the entire multiprocessing test suite right now to work around the change, Antoine.
msg119553 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-25 14:17
> I doubt I, or Ask will have the time to rewrite the entire
> multiprocessing test suite right now to work around the change
> Antoine.

Well, I'm not asking anyone to rewrite the entire multiprocessing test suite; and, besides, I've provided a patch myself to improve it in that respect ;) (in issue10173)

Of course, it also means the present pickle patch is imperfect, though the result of __reduce__ in this case looks more like a side effect of an implementation detail than documented behaviour (since the result isn't usable anyway). I may try to come up with a better patch before the 3.2 beta, but I'm not sure I will find enough time/motivation.
msg119555 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-10-25 14:23
On Mon, Oct 25, 2010 at 10:17 AM, Antoine Pitrou <report@bugs.python.org> wrote:

> Well, I'm not asking anyone to rewrite the entire multiprocessing test suite; and, besides, I've provided a patch myself to improve it in that respect ;) (in issue10173)

I just saw that one - I'll poke at that next

> Of course, it also means the present pickle patch is imperfect, though the result of __reduce__ in this case looks more like a side effect of an implementation detail than documented behaviour (since the result isn't usable anyway). I may try to come up with a better patch before the 3.2 beta, but I'm not sure I will find enough time/motivation.

Okie doke.
msg130611 - (view) Author: Roundup Robot (python-dev) Date: 2011-03-11 20:30
New changeset ff0220c9d213 by Antoine Pitrou in branch 'default':
Issue #9935: Speed up pickling of instances of user-defined classes.
http://hg.python.org/cpython/rev/ff0220c9d213
History
Date User Action Args
2011-03-11 20:32:54 pitrou set status: open -> closed; nosy: rhettinger, belopolsky, pitrou, alexandre.vassalotti, jnoller, python-dev; resolution: fixed; stage: patch review -> resolved
2011-03-11 20:30:47 python-dev set nosy: + python-dev; messages: + msg130611
2010-10-25 14:23:37 jnoller set messages: + msg119555
2010-10-25 14:17:22 pitrou set resolution: fixed -> (no value); messages: + msg119553; stage: resolved -> patch review
2010-10-25 14:08:16 jnoller set messages: + msg119552
2010-10-22 22:02:47 pitrou set messages: + msg119404
2010-10-22 21:37:57 pitrou set status: closed -> open; nosy: + jnoller; messages: + msg119403; assignee: pitrou
2010-10-22 19:45:28 pitrou set status: open -> closed; resolution: fixed; messages: + msg119400; stage: patch review -> resolved
2010-10-04 22:47:21 alexandre.vassalotti set messages: + msg117987
2010-10-04 21:22:03 pitrou set messages: + msg117982
2010-09-28 14:04:32 pitrou set files: + pickleinst2.patch; messages: + msg117522
2010-09-28 13:21:28 pitrou set messages: + msg117517
2010-09-24 13:28:12 pitrou set nosy: + rhettinger
2010-09-24 01:06:59 pitrou create