classification
Title: Pickle: use NEWOBJ instead of NEWOBJ_EX if possible
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: alexandre.vassalotti, pitrou, python-dev, serhiy.storchaka
Priority: normal Keywords: needs review, patch

Created on 2014-11-02 15:28 by serhiy.storchaka, last changed 2014-12-16 18:09 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
pickle_newobj.patch serhiy.storchaka, 2014-11-02 15:28 review
pickle_newobj_2.patch serhiy.storchaka, 2014-12-16 12:53 review
Messages (5)
msg230490 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-11-02 15:28
Currently in pickle with protocol 4 the NEWOBJ_EX opcode is used to reconstruct objects with __getnewargs__() or __getnewargs_ex__() even if there no keyword arguments.

>>> import pickle, pickletools, collections
>>> P = collections.namedtuple('P', 'x y')
>>> pickletools.dis(pickle.dumps(P(12, 34), 3))
    0: \x80 PROTO      3
    2: c    GLOBAL     '__main__ P'
   14: q    BINPUT     0
   16: K    BININT1    12
   18: K    BININT1    34
   20: \x86 TUPLE2
   21: q    BINPUT     1
   23: \x81 NEWOBJ
   24: q    BINPUT     2
   26: .    STOP
highest protocol among opcodes = 2
>>> pickletools.dis(pickle.dumps(P(12, 34), 4))
    0: \x80 PROTO      4
    2: \x95 FRAME      28
   11: \x8c SHORT_BINUNICODE '__main__'
   21: \x94 MEMOIZE
   22: \x8c SHORT_BINUNICODE 'P'
   25: \x94 MEMOIZE
   26: \x93 STACK_GLOBAL
   27: \x94 MEMOIZE
   28: K    BININT1    12
   30: K    BININT1    34
   32: \x86 TUPLE2
   33: \x94 MEMOIZE
   34: }    EMPTY_DICT
   35: \x94 MEMOIZE
   36: \x92 NEWOBJ_EX
   37: \x94 MEMOIZE
   38: .    STOP
highest protocol among opcodes = 4

These EMPTY_DICT//MEMOIZE//NEWOBJ_EX can be replaced by single NEWOBJ. Actually there is a regression in protocol 4 against protocol 3, so may be following patch should be applied on 3.4 too.

Here is a patch which uses __newobj__ instead of __newobj_ex__ if there are no keywords arguments. Also it reduce the number of lines of C code and adds tests for classes with __getnewargs__() and __getnewargs_ex__().
msg232660 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-15 12:35
Ping.
msg232686 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-15 22:06
Some examples (with issue19858 for protocol 4 optimization).

Unpatched:

>>> len(pickle.dumps([P(12, 34) for i in range(1000)], 3))
17258
>>> len(pickletools.optimize(pickle.dumps([P(12, 34) for i in range(1000)], 3)))
8018
>>> len(pickle.dumps([P(i, -i) for i in range(1000)], 3))
20999
>>> len(pickletools.optimize(pickle.dumps([P(i, -i) for i in range(1000)], 3)))
11759

>>> len(pickle.dumps([P(12, 34) for i in range(1000)], 4))
12031
>>> len(pickletools.optimize(pickle.dumps([P(12, 34) for i in range(1000)], 4)))
9028
>>> len(pickle.dumps([P(i, -i) for i in range(1000)], 4))
15772
>>> len(pickletools.optimize(pickle.dumps([P(i, -i) for i in range(1000)], 4)))
12769

Patched:

>>> len(pickle.dumps([P(12, 34) for i in range(1000)], 4))
10031
>>> len(pickletools.optimize(pickle.dumps([P(12, 34) for i in range(1000)], 4)))
8028
>>> len(pickle.dumps([P(i, -i) for i in range(1000)], 4))
13772
>>> len(pickletools.optimize(pickle.dumps([P(i, -i) for i in range(1000)], 4)))
11769
msg232739 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-12-16 12:53
Updated patch addresses Antoine's comments. Also added few comments explaining binary data.
msg232751 - (view) Author: Roundup Robot (python-dev) Date: 2014-12-16 18:03
New changeset 2ffaac4c8e53 by Serhiy Storchaka in branch 'default':
Issue #22783: Pickling now uses the NEWOBJ opcode instead of the NEWOBJ_EX
https://hg.python.org/cpython/rev/2ffaac4c8e53
History
Date User Action Args
2014-12-16 18:09:27serhiy.storchakasetstatus: open -> closed
assignee: serhiy.storchaka
resolution: fixed
stage: patch review -> resolved
2014-12-16 18:03:09python-devsetnosy: + python-dev
messages: + msg232751
2014-12-16 12:53:32serhiy.storchakasetfiles: + pickle_newobj_2.patch

messages: + msg232739
2014-12-15 22:06:54serhiy.storchakasetmessages: + msg232686
2014-12-15 12:35:55serhiy.storchakasetkeywords: + needs review

messages: + msg232660
2014-11-02 15:28:42serhiy.storchakacreate