Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize unpickling list-like objects: use extend() instead of append() #73554

Closed
serhiy-storchaka opened this issue Jan 25, 2017 · 18 comments
Closed
Assignees
Labels
3.7 (EOL) end of life performance Performance or resource usage stdlib Python modules in the Lib dir

Comments

@serhiy-storchaka
Copy link
Member

BPO 29368
Nosy @rhettinger, @gpshead, @pitrou, @vstinner, @avassalotti, @serhiy-storchaka
Files
  • pickle-appends-extend.patch
  • pickle-appends-extend-2.patch
  • pickle-appends-extend-3.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2017-02-02.10:00:20.573>
    created_at = <Date 2017-01-25.09:55:50.286>
    labels = ['3.7', 'library', 'performance']
    title = 'Optimize unpickling list-like objects: use extend() instead of append()'
    updated_at = <Date 2021-04-27.00:38:24.023>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2021-04-27.00:38:24.023>
    actor = 'gregory.p.smith'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2017-02-02.10:00:20.573>
    closer = 'python-dev'
    components = ['Library (Lib)']
    creation = <Date 2017-01-25.09:55:50.286>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['46412', '46421', '46484']
    hgrepos = []
    issue_num = 29368
    keywords = ['patch']
    message_count = 18.0
    messages = ['286237', '286242', '286243', '286248', '286250', '286297', '286318', '286691', '286701', '286702', '286705', '286735', '286751', '286753', '286754', '286755', '286758', '392010']
    nosy_count = 7.0
    nosy_names = ['rhettinger', 'gregory.p.smith', 'pitrou', 'vstinner', 'alexandre.vassalotti', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue29368'
    versions = ['Python 3.7']

    @serhiy-storchaka
    Copy link
    Member Author

    According to PEP-307 the extend method can be used for appending list items to the object.

    listitems    Optional, and new in this PEP.
                 If this is not None, it should be an iterator (not a
                 sequence!) yielding successive list items.  These list
                 items will be pickled, and appended to the object using
                 either obj.append(item) or obj.extend(list_of_items).
                 This is primarily used for list subclasses, but may
                 be used by other classes as long as they have append()
                 and extend() methods with the appropriate signature.
                 (Whether append() or extend() is used depends on which
                 pickle protocol version is used as well as the number
                 of items to append, so both must be supported.)
    

    Proposed patch makes the extend method be used in the APPENDS opcode. To avoid breaking existing code the use of the extend method is optional.

    Microbenchmark:

    $ ./python -m timeit -s "import pickle, collections; p = pickle.dumps(collections.deque([None]*10000), 4)" -- "pickle.loads(p)"
    Unpatched:  100 loops, best of 5: 2.02 msec per loop
    Patched:    500 loops, best of 5: 833 usec per loop

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life stdlib Python modules in the Lib dir performance Performance or resource usage labels Jan 25, 2017
    @vstinner
    Copy link
    Member

    What is the cost of adding an extra isinstance(d, collections.Sequence) check? It would be closer the current design ("white list" of types), safer and avoid bad surprises.

    But Python has a long tradition of duck typing, so maybe isinstance() can be seen as overkill :-)

    @serhiy-storchaka
    Copy link
    Member Author

    collections.Sequence has no relation to the pickle protocol.

    Every object that return listitems from the __reduce__() method must support the extend() method according to the specification.

    @vstinner
    Copy link
    Member

    Every object that return listitems from the __reduce__() method must support the extend() method according to the specification.

    Hum ok. I don't know well the pickle module, but according to your quote, yeah, the patch is valid.

    pickle-appends-extend.patch LGTM, except minor comments.

    It would be nice to get a feedback from Alexandre, but I'm not sure that he is still around :-/

    What about Antoine Pitrou, are you around? :-)

    @vstinner vstinner changed the title Optimize unpickling list-like objects Optimize unpickling list-like objects: use extend() instead of append() Jan 25, 2017
    @serhiy-storchaka
    Copy link
    Member Author

    See bpo-17720 for a feedback from Alexandre.

    @rhettinger
    Copy link
    Contributor

    This code looks correct and reasonable. The tests all pass for me. I think you can go forward and apply the patch.

    @serhiy-storchaka
    Copy link
    Member Author

    Updated patch addresses Antoine's comment. It adds the comment explaining a fallback.

    @serhiy-storchaka
    Copy link
    Member Author

    Could anyone please make a review of my explanation comment? I have doubts about a wording.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2017

    Could anyone please make a review of my explanation comment? I have doubts about a wording.

    I'm not fluent in english, so I'm not the best for this task. But I reviewed your patch ;-)

    @serhiy-storchaka
    Copy link
    Member Author

    Thank you Victor. Your wording looks simpler to me.

    @vstinner
    Copy link
    Member

    vstinner commented Feb 1, 2017

    pickle-appends-extend-3.patch LGTM.

    Even if I don't see any refleak, you might just run "./python -m test -R 3:3 test_pickle" just to be sure :-)

    @rhettinger
    Copy link
    Contributor

    Could anyone please make a review of my explanation comment?
    I have doubts about a wording.

    The wording is correct and clear.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 2, 2017

    New changeset 94d630a02a81 by Serhiy Storchaka in branch 'default':
    Issue bpo-29368: The extend() method is now called instead of the append()
    https://hg.python.org/cpython/rev/94d630a02a81

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 2, 2017

    New changeset 328147c0edc3 by Victor Stinner in branch 'default':
    Issue bpo-29368: Fix _Pickle_FastCall() usage in do_append()
    https://hg.python.org/cpython/rev/328147c0edc3

    @python-dev python-dev mannequin closed this as completed Feb 2, 2017
    @vstinner
    Copy link
    Member

    vstinner commented Feb 2, 2017

    Even if I don't see any refleak, you might just run "./python -m test -R 3:3 test_pickle" just to be sure :-)

    Change 94d630a02a81 introduced a crash in test_pickle:

    Fatal Python error: ..\Modules\_pickle.c:5847 object at 000002B7F7BED2F8 has negative ref count -1

    Seen on buildbots, but can always be reproduce on Linux as well.

    It seems like you was biten by the surprising _Pickle_FastCall() API which decreases the reference counter of its second parameter. Don't ask me why it does that :-) (I don't know.)

    I fixed the bug in the change 328147c0edc3 to repair buildbots.

    @serhiy-storchaka
    Copy link
    Member Author

    Thanks Victor.

    @gpshead
    Copy link
    Member

    gpshead commented Apr 27, 2021

    The PyList_Check -> PyList_CheckExact change led to bugs where subclasses of list that override extend can no longer be unpickled.

    https://bugs.python.org/issue43946

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Apr 9, 2022

    New changeset f89fdc29937139b55dd68587759cadb8468d0190 by Serhiy Storchaka in branch 'master':
    Issue bpo-29368: The extend() method is now called instead of the append()
    f89fdc2

    New changeset 4d7e63d9773a766358294593fc00b1f8c8f41b5d by Victor Stinner in branch 'master':
    Issue bpo-29368: Fix _Pickle_FastCall() usage in do_append()
    4d7e63d

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life performance Performance or resource usage stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants