Classification
Title: Faster default __reduce__ for classes without __init__
Type: performance
Stage: resolved
Components: Interpreter Core
Versions: Python 3.5

Process
Status: closed
Resolution: rejected
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: alexandre.vassalotti, amaury.forgeotdarc, pitrou, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2015-02-09 11:00 by serhiy.storchaka, last changed 2015-03-20 16:41 by serhiy.storchaka. This issue is now closed.

Files
File name                    Uploaded
object_reduce_no_init.patch  serhiy.storchaka, 2015-02-09 11:00
Messages (4)
msg235600 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-02-09 11:00
The proposed patch speeds up the default __reduce__ implementation for classes that define no non-trivial __init__ (e.g. named tuples). In this case __reduce__ returns (cls, newargs) instead of (copyreg.__newobj__, (cls,) + newargs).
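The thread does not show how the patch decides that a class has no non-trivial __init__. As a rough illustration only, one way to express that condition in pure Python is to check whether the class still inherits object.__init__. The helper has_default_init and the classes Point and WithInit below are hypothetical names introduced for this sketch. It also shows why the two reduce shapes are interchangeable for such a class: calling cls(*newargs) and calling cls.__new__(cls, *newargs) build equal objects.

```python
import copyreg
from collections import namedtuple


def has_default_init(cls):
    """Illustrative check (an assumption, not the patch's actual test):
    True if cls inherits object.__init__ unchanged."""
    return cls.__init__ is object.__init__


# Hypothetical example classes.
Point = namedtuple("Point", "x y")


class WithInit:
    def __init__(self):
        self.ready = True


print(has_default_init(Point))     # True: tuple subclasses inherit object.__init__
print(has_default_init(WithInit))  # False: __init__ is overridden

# Current shape: copyreg.__newobj__(cls, *newargs) calls cls.__new__(cls, *newargs).
p_new = copyreg.__newobj__(Point, 1, 2)
# Proposed shape: the class itself is the callable, so cls(*newargs) is used.
p_call = Point(1, 2)
assert p_new == p_call
```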

>>> pickletools.dis(pickletools.optimize(pickle.dumps(turtle.Vec2D(12, 34), 3)))
Before:
    0: \x80 PROTO      3
    2: c    GLOBAL     'turtle Vec2D'
   16: K    BININT1    12
   18: K    BININT1    34
   20: \x86 TUPLE2
   21: \x81 NEWOBJ
   22: .    STOP
After:
    0: \x80 PROTO      3
    2: c    GLOBAL     'turtle Vec2D'
   16: K    BININT1    12
   18: K    BININT1    34
   20: \x86 TUPLE2
   21: R    REDUCE
   22: .    STOP

The pickled size is the same, but pickling is faster. The benefit comes from avoiding the import of copyreg.__newobj__ and the allocation of the new tuple (cls,) + newargs.
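The difference between the two disassemblies can be approximated without applying the patch. The sketch below is an emulation under stated assumptions, not the patch itself: it uses difflib.Match (a stdlib namedtuple, chosen here only because it is importable everywhere) in place of turtle.Vec2D, and a per-pickler dispatch_table to make the pickler see the proposed (cls, newargs) shape. The helper opcode_names is a hypothetical name.

```python
import copyreg
import difflib
import io
import pickle
import pickletools

# difflib.Match is a stdlib namedtuple with no custom __init__ or __reduce__.
m = difflib.Match(1, 2, 3)

# Current default shape: (copyreg.__newobj__, (cls,) + newargs, ...).
default = m.__reduce_ex__(3)
assert default[0] is copyreg.__newobj__
assert default[1] == (difflib.Match, 1, 2, 3)


def opcode_names(data):
    """Return the opcode names of a pickle, in order."""
    return [op.name for op, arg, pos in pickletools.genops(data)]


# Status quo: the default reduce leads to a NEWOBJ opcode.
before = pickle.dumps(m, 3)

# Emulate the proposed shape (cls, newargs) for this one type only.
buf = io.BytesIO()
pickler = pickle.Pickler(buf, 3)
pickler.dispatch_table = {difflib.Match: lambda obj: (type(obj), tuple(obj))}
pickler.dump(m)
after = buf.getvalue()

print(opcode_names(before))
print(opcode_names(after))

# Both pickles load back to an equal object.
assert pickle.loads(before) == pickle.loads(after) == m
```

Because the callable in the emulated reduce value is the class rather than copyreg.__newobj__, the pickler emits REDUCE where the default path emits NEWOBJ.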

Microbenchmark results:

$ ./python -m timeit -s "import pickle; from turtle import Vec2D; a = [Vec2D(i, i+0.1) for i in range(1000)]" -- "pickle.dumps(a)"

Before: 100 loops, best of 3: 16.3 msec per loop
After: 100 loops, best of 3: 15.2 msec per loop

$ ./python -m timeit -s "import copy; from turtle import Vec2D; a = [Vec2D(i, i+0.1) for i in range(1000)]" -- "copy.deepcopy(a)"

Before: 10 loops, best of 3: 96.6 msec per loop
After: 10 loops, best of 3: 88.7 msec per loop
msg238499 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2015-03-19 10:04
But isn't the result different?
NEWOBJ calls cls.__new__() and __init__ is skipped.
REDUCE calls cls(): both __new__ and __init__ are used.
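A minimal sketch of that distinction, using a hypothetical class Tracked whose __init__ has an observable side effect. copyreg.__newobj__ stands in for what the NEWOBJ opcode does at load time, and a plain call to the class stands in for REDUCE with the class as the callable.

```python
import copyreg


class Tracked:
    """Hypothetical class that counts how often __init__ runs."""
    init_calls = 0

    def __init__(self):
        type(self).init_calls += 1


# NEWOBJ path: copyreg.__newobj__(cls, *args) is cls.__new__(cls, *args),
# so __init__ is not run.
via_newobj = copyreg.__newobj__(Tracked)
calls_after_newobj = Tracked.init_calls

# REDUCE path with the class as callable: cls(*args) runs __new__ and __init__.
via_reduce = Tracked()
calls_after_reduce = Tracked.init_calls

print(calls_after_newobj, calls_after_reduce)  # prints: 0 1
```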
msg238500 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2015-03-19 10:20
Sorry, I missed the important point:
"for classes without __init__"
msg238701 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-20 16:41
For a namedtuple such as turtle.Vec2D there is no significant difference in unpickling time. But for a simpler type there is.

$ ./python -m timeit -s 'import pickle, turtle; global I' -s 'class I(int): pass' -s 'p = pickle.dumps([I(i) for i in range(1000)], 3)' -- 'pickle.loads(p)'

Before: 1000 loops, best of 3: 1.6 msec per loop
After:  1000 loops, best of 3: 1.82 msec per loop

So I withdraw my patch. Unpickling performance is more important than pickling performance, and the status quo wins. Sorry for the noise.
History
Date                 User                Action  Args
2015-03-20 16:41:30  serhiy.storchaka    set     status: open -> closed; messages: + msg238701; assignee: serhiy.storchaka; resolution: rejected; stage: patch review -> resolved
2015-03-19 10:20:17  amaury.forgeotdarc  set     messages: + msg238500
2015-03-19 10:04:10  amaury.forgeotdarc  set     nosy: + amaury.forgeotdarc; messages: + msg238499
2015-02-09 11:00:44  serhiy.storchaka    create