Classification
Title: Optimize unpickling list-like objects: use extend() instead of append()
Type: performance
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7
Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: alexandre.vassalotti, haypo, pitrou, python-dev, rhettinger, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2017-01-25 09:55 by serhiy.storchaka, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name                        Uploaded
pickle-appends-extend.patch      serhiy.storchaka, 2017-01-25 09:55
pickle-appends-extend-2.patch    serhiy.storchaka, 2017-01-26 16:02
pickle-appends-extend-3.patch    serhiy.storchaka, 2017-02-01 21:29
Pull Requests
URL       Status    Linked
PR 552    closed    dstufft, 2017-03-31 16:36
Messages (17)
msg286237 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-25 09:55
According to PEP 307 the extend method can be used for appending list items to the object.

    listitems    Optional, and new in this PEP.
                 If this is not None, it should be an iterator (not a
                 sequence!) yielding successive list items.  These list
                 items will be pickled, and appended to the object using
                 either obj.append(item) or obj.extend(list_of_items).
                 This is primarily used for list subclasses, but may
                 be used by other classes as long as they have append()
                 and extend() methods with the appropriate signature.
                 (Whether append() or extend() is used depends on which
                 pickle protocol version is used as well as the number
                 of items to append, so both must be supported.)

The proposed patch makes the APPENDS opcode use the extend() method. To avoid breaking existing code, extend() is used only when the object provides it; objects without it are still filled through append().
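
The fallback can be sketched in pure Python. This is a minimal illustration of the idea, not the patch itself: `append_items`, `AppendOnly`, and `ListLike` are hypothetical names introduced here, and the real change lives in the unpickler's handling of APPENDS.

```python
def append_items(obj, items):
    """Add unpickled items to obj, preferring a single extend() call.

    Hypothetical helper: it mirrors the idea of the patch, not its code.
    """
    try:
        extend = obj.extend
    except AttributeError:
        # No extend(): fall back to one append() call per item, so
        # objects that only implement append() keep working.
        for item in items:
            obj.append(item)
    else:
        extend(items)


class AppendOnly:
    """Hypothetical legacy container that never implemented extend()."""

    def __init__(self):
        self.data = []
        self.calls = 0  # number of append()/extend() calls received

    def append(self, item):
        self.calls += 1
        self.data.append(item)


class ListLike(AppendOnly):
    """Hypothetical container that offers both methods, as PEP 307 asks."""

    def extend(self, items):
        self.calls += 1
        self.data.extend(items)


legacy, modern = AppendOnly(), ListLike()
append_items(legacy, [1, 2, 3])
append_items(modern, [1, 2, 3])
print(legacy.data, legacy.calls)
print(modern.data, modern.calls)
```

Both containers end up with the same items; the difference is that the one providing extend() receives a single method call instead of one call per item, which is where the speedup is expected to come from.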

Microbenchmark:

$ ./python -m timeit -s "import pickle, collections; p = pickle.dumps(collections.deque([None]*10000), 4)" -- "pickle.loads(p)"
Unpatched:  100 loops, best of 5: 2.02 msec per loop
Patched:    500 loops, best of 5: 833 usec per loop
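
The same measurement can be scripted. The sketch below rebuilds the payload from the command above, uses pickletools to check that the stream contains an APPENDS opcode (the code path the patch changes), and times pickle.loads(). The repeat and loop counts are arbitrary choices made for this example, and absolute numbers depend on the machine and the interpreter build.

```python
import collections
import pickle
import pickletools
import timeit

# Same payload as the command-line benchmark: a 10000-item deque, protocol 4.
original = collections.deque([None] * 10000)
payload = pickle.dumps(original, 4)

# Collect the opcode names used in the stream and look for APPENDS.
opcode_names = {opcode.name for opcode, _arg, _pos in pickletools.genops(payload)}
print("APPENDS" in opcode_names)

# Sanity check that the round trip preserves the contents.
restored = pickle.loads(payload)
print(restored == original)

# Best of 5 runs of 100 loads each, reported per load in microseconds.
best = min(timeit.repeat(lambda: pickle.loads(payload), repeat=5, number=100))
print(f"{best / 100 * 1e6:.0f} usec per loop")
```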
msg286242 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-25 12:16
What is the cost of adding an extra isinstance(d, collections.Sequence) check? It would be closer to the current design ("white list" of types), safer, and would avoid bad surprises.

But Python has a long tradition of duck typing, so maybe isinstance() can be seen as overkill :-)
msg286243 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-25 12:22
collections.Sequence has no relation to the pickle protocol.

Every object that returns listitems from the __reduce__() method must support the extend() method according to the specification.
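
A small sketch may make this concrete. The `Bag` class below is hypothetical (it is not part of the patch): it is not a Sequence, so an isinstance() whitelist would reject it, yet it satisfies the PEP 307 contract because its __reduce__() returns a listitems iterator and it provides both append() and extend(). The reduce tuple is replayed by hand to show what an unpickler is expected to do with it. collections.abc.Sequence is used here because the collections.Sequence alias was removed in Python 3.10.

```python
import collections.abc


class Bag:
    """Hypothetical list-like container that is not a Sequence."""

    def __init__(self):
        self._items = []

    def append(self, item):
        self._items.append(item)

    def extend(self, items):
        self._items.extend(items)

    def __reduce__(self):
        # (callable, args, state, listitems): listitems is an iterator,
        # not a sequence, as PEP 307 requires.
        return (Bag, (), None, iter(self._items))


bag = Bag()
bag.extend(["a", "b", "c"])

# Replay the reduce tuple by hand, the way an unpickler would.
factory, args, state, listitems = bag.__reduce__()
clone = factory(*args)
clone.extend(list(listitems))

print(isinstance(bag, collections.abc.Sequence))
print(clone._items)
```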
msg286248 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-01-25 13:13
> Every object that returns listitems from the __reduce__() method must support the extend() method according to the specification.

Hum ok. I don't know the pickle module well, but according to your quote, yeah, the patch is valid.

pickle-appends-extend.patch LGTM, except minor comments.

It would be nice to get feedback from Alexandre, but I'm not sure that he is still around :-/

What about Antoine Pitrou, are you around? :-)
msg286250 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-25 13:18
See issue17720 for feedback from Alexandre.
msg286297 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-01-26 08:18
This code looks correct and reasonable.  The tests all pass for me.  I think you can go forward and apply the patch.
msg286318 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-26 16:02
The updated patch addresses Antoine's comment: it adds a comment explaining the fallback.
msg286691 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-01 20:42
Could anyone please review my explanatory comment? I have doubts about the wording.
msg286701 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-01 21:17
> Could anyone please review my explanatory comment? I have doubts about the wording.

I'm not fluent in English, so I'm not the best person for this task. But I reviewed your patch ;-)
msg286702 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-01 21:29
Thank you, Victor. Your wording looks simpler to me.
msg286705 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-01 21:36
pickle-appends-extend-3.patch LGTM.

Even if I don't see any refleak, you might just run "./python -m test -R 3:3 test_pickle" just to be sure :-)
msg286735 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-02-02 04:17
> Could anyone please review my explanatory comment?
> I have doubts about the wording.

The wording is correct and clear.
msg286751 - (view) Author: Roundup Robot (python-dev) Date: 2017-02-02 09:13
New changeset 94d630a02a81 by Serhiy Storchaka in branch 'default':
Issue #29368: The extend() method is now called instead of the append()
https://hg.python.org/cpython/rev/94d630a02a81
msg286753 - (view) Author: Roundup Robot (python-dev) Date: 2017-02-02 09:58
New changeset 328147c0edc3 by Victor Stinner in branch 'default':
Issue #29368: Fix _Pickle_FastCall() usage in do_append()
https://hg.python.org/cpython/rev/328147c0edc3
msg286754 - (view) Author: Roundup Robot (python-dev) Date: 2017-02-02 10:00
New changeset f89fdc29937139b55dd68587759cadb8468d0190 by Serhiy Storchaka in branch 'master':
Issue #29368: The extend() method is now called instead of the append()
https://github.com/python/cpython/commit/f89fdc29937139b55dd68587759cadb8468d0190

New changeset 4d7e63d9773a766358294593fc00b1f8c8f41b5d by Victor Stinner in branch 'master':
Issue #29368: Fix _Pickle_FastCall() usage in do_append()
https://github.com/python/cpython/commit/4d7e63d9773a766358294593fc00b1f8c8f41b5d
msg286755 - (view) Author: STINNER Victor (haypo) * (Python committer) Date: 2017-02-02 10:01
> Even if I don't see any refleak, you might just run "./python -m test -R 3:3 test_pickle" just to be sure :-)

Change 94d630a02a81 introduced a crash in test_pickle:

Fatal Python error: ..\Modules\_pickle.c:5847 object at 000002B7F7BED2F8 has negative ref count -1

Seen on the buildbots, but it can always be reproduced on Linux as well.

It seems you were bitten by the surprising _Pickle_FastCall() API, which decreases the reference count of its second parameter. Don't ask me why it does that :-) (I don't know.)

I fixed the bug in change 328147c0edc3 to repair the buildbots.
msg286758 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-02 10:05
Thanks Victor.
History
Date                 User              Action  Args
2017-03-31 16:36:08  dstufft           set     pull_requests: + pull_request836
2017-02-02 10:05:48  serhiy.storchaka  set     messages: + msg286758
2017-02-02 10:01:07  haypo             set     messages: + msg286755
2017-02-02 10:00:20  python-dev        set     stage: patch review -> resolved
2017-02-02 10:00:20  python-dev        set     resolution: fixed
2017-02-02 10:00:20  python-dev        set     status: open -> closed
2017-02-02 10:00:20  python-dev        set     messages: + msg286754
2017-02-02 09:58:40  python-dev        set     messages: + msg286753
2017-02-02 09:13:16  python-dev        set     nosy: + python-dev; messages: + msg286751
2017-02-02 04:17:46  rhettinger        set     messages: + msg286735
2017-02-01 21:36:32  haypo             set     messages: + msg286705
2017-02-01 21:29:23  serhiy.storchaka  set     files: + pickle-appends-extend-3.patch; messages: + msg286702
2017-02-01 21:17:45  haypo             set     messages: + msg286701
2017-02-01 20:42:02  serhiy.storchaka  set     messages: + msg286691
2017-01-26 16:02:41  serhiy.storchaka  set     files: + pickle-appends-extend-2.patch; messages: + msg286318
2017-01-26 08:18:59  rhettinger        set     assignee: serhiy.storchaka; messages: + msg286297; nosy: + rhettinger
2017-01-25 13:18:28  serhiy.storchaka  set     messages: + msg286250
2017-01-25 13:13:43  haypo             set     title: Optimize unpickling list-like objects -> Optimize unpickling list-like objects: use extend() instead of append()
2017-01-25 13:13:27  haypo             set     nosy: + pitrou; messages: + msg286248
2017-01-25 12:22:42  serhiy.storchaka  set     messages: + msg286243
2017-01-25 12:16:39  haypo             set     nosy: + haypo; messages: + msg286242
2017-01-25 09:55:50  serhiy.storchaka  create