
Author alexandre.vassalotti
Recipients Arfrever, alexandre.vassalotti, asvetlov, mstefanro, ncoghlan, neologix, pitrou, rhettinger, serhiy.storchaka
Date 2013-11-19.07:16:37
Message-id <1384845398.45.0.941599669229.issue17810@psf.upfronthosting.co.za>
In-reply-to
Content
I have been looking again at Stefan's previous proposal of making memoization implicit in the new pickle protocol. While I liked the smaller pickles it produced, I didn't like the invasiveness of the implementation, which requires a change for almost every opcode processed by the Unpickler. This led me to what I think is a reasonable compromise between what we have right now and Stefan's proposal: we can make the argument of the PUT opcodes implicit, without making the whole opcode implicit.

I've implemented this by introducing a new opcode, MEMOIZE, which stores the top of the pickle stack in the memo, using the current size of the memo as the index. Using the memo size as the index saves us some extra bookkeeping variables and nicely handles situations where Pickler.memo.clear() or Unpickler.memo.clear() are used.
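
To make the idea concrete, here is a rough sketch of the difference between a PUT-style opcode and MEMOIZE on the unpickling side. ToyUnpickler and its method names are made up for illustration; this is a toy model, not the code from the branch.

```python
class ToyUnpickler:
    """Toy model of an unpickler's stack and memo (hypothetical, for illustration)."""

    def __init__(self):
        self.stack = []
        self.memo = {}

    def push(self, obj):
        self.stack.append(obj)

    def load_binput(self, index):
        # PUT-style opcode: the memo index is an explicit argument
        # that has to be written into the pickle stream.
        self.memo[index] = self.stack[-1]

    def load_memoize(self):
        # MEMOIZE: no argument in the stream; the index is simply
        # the number of entries already in the memo.
        self.memo[len(self.memo)] = self.stack[-1]


u = ToyUnpickler()
u.push([])
u.load_memoize()
u.push({})
u.load_memoize()
indices_before_clear = sorted(u.memo)

# Clearing the memo needs no extra counter to be reset:
# the next MEMOIZE starts again from index 0.
u.memo.clear()
u.push("after clear")
u.load_memoize()
indices_after_clear = sorted(u.memo)
```

Since the index is derived from the memo itself, there is no separate counter that could get out of sync when the memo is cleared.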

Size-wise, this brings some good improvements for pickles containing a lot of dicts and lists.

# Before
$ ./python.exe -c "import pickle; print(len(pickle.dumps([[] for _ in range(1000)], 4)))"
5251

# After with new MEMOIZE opcode
$ ./python.exe -c "import pickle; print(len(pickle.dumps([[] for _ in range(1000)], 4)))"
2015
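
For anyone who wants to see where the savings come from, the following sketch counts the memo opcodes with pickletools. It assumes an interpreter where protocol 4 already includes MEMOIZE (Python 3.4 or later), and it uses protocol 3 as a stand-in for the "before" case, so the exact sizes will not match the numbers above. The helper name opcode_counts is made up for this example.

```python
import pickle
import pickletools
from collections import Counter


def opcode_counts(data):
    """Count how often each opcode name appears in a pickle (hypothetical helper)."""
    return Counter(op.name for op, arg, pos in pickletools.genops(data))


obj = [[] for _ in range(1000)]

# Protocol 3 writes an explicit index with every PUT opcode
# (BINPUT with a 1-byte index, LONG_BINPUT with a 4-byte index).
p3 = pickle.dumps(obj, 3)
# Protocol 4 uses the argument-less MEMOIZE opcode instead.
p4 = pickle.dumps(obj, 4)

counts3 = opcode_counts(p3)
counts4 = opcode_counts(p4)

print("protocol 3:", len(p3), "bytes,",
      counts3["BINPUT"] + counts3["LONG_BINPUT"], "PUT opcodes")
print("protocol 4:", len(p4), "bytes,",
      counts4["MEMOIZE"], "MEMOIZE opcodes")
```

Each of the 1001 lists still gets one memo opcode either way; the difference is only that MEMOIZE carries no index bytes.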

Time-wise, the change is mostly neutral. It makes pickling dicts and lists slightly faster because it simplifies the code for memo_put() in _pickle.

Report on Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64 i386
Total CPU cores: 4

### pickle4_dict ###
Min: 0.714912 -> 0.667203: 1.07x faster
Avg: 0.741616 -> 0.685567: 1.08x faster
Significant (t=16.25)
Stddev: 0.02033 -> 0.01346: 1.5102x smaller
Timeline: http://goo.gl/iHqCfB

### pickle4_list ###
Min: 0.414151 -> 0.398913: 1.04x faster
Avg: 0.432094 -> 0.409058: 1.06x faster
Significant (t=11.83)
Stddev: 0.01049 -> 0.00893: 1.1749x smaller
Timeline: http://goo.gl/wfQzgL

Anyhow, I have committed this improvement in my pep-3154 branch (http://hg.python.org/features/pep-3154-alexandre/rev/8a2861aaef82) for now, though I will happily revert it if people object to the change.
History
Date User Action Args
2013-11-19 07:16:38 alexandre.vassalotti set recipients: + alexandre.vassalotti, rhettinger, ncoghlan, pitrou, Arfrever, asvetlov, neologix, serhiy.storchaka, mstefanro
2013-11-19 07:16:38 alexandre.vassalotti set messageid: <1384845398.45.0.941599669229.issue17810@psf.upfronthosting.co.za>
2013-11-19 07:16:38 alexandre.vassalotti link issue17810 messages
2013-11-19 07:16:37 alexandre.vassalotti create