Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pure Python pickle module should not depend on _pickle.PickleBuffer #81391

Closed
tiran opened this issue Jun 9, 2019 · 15 comments
Closed

Pure Python pickle module should not depend on _pickle.PickleBuffer #81391

tiran opened this issue Jun 9, 2019 · 15 comments
Labels
3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@tiran
Copy link
Member

tiran commented Jun 9, 2019

BPO 37210
Nosy @pitrou, @vstinner, @tiran, @moreati, @miss-islington
PRs
  • bpo-37210: Fix pure Python pickle: proto 5 requires _pickle.PickleBuffer #14016
  • [3.8] bpo-37210: Fix pure Python pickle when _pickle is unavailable (GH-14016) #14055
  • Files
  • bp0-37201_make_test_20190610.txt: make test output for pickle._PickleBuffer attempt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-06-13.12:28:44.146>
    created_at = <Date 2019-06-09.20:23:32.229>
    labels = ['3.8', 'type-bug', 'library', '3.9']
    title = 'Pure Python pickle module should not depend on _pickle.PickleBuffer'
    updated_at = <Date 2019-06-13.12:28:44.145>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2019-06-13.12:28:44.145>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-06-13.12:28:44.146>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2019-06-09.20:23:32.229>
    creator = 'christian.heimes'
    dependencies = []
    files = ['48410']
    hgrepos = []
    issue_num = 37210
    keywords = ['patch', '3.8regression']
    message_count = 15.0
    messages = ['345088', '345089', '345090', '345093', '345094', '345095', '345197', '345198', '345261', '345262', '345360', '345362', '345503', '345511', '345512']
    nosy_count = 5.0
    nosy_names = ['pitrou', 'vstinner', 'christian.heimes', 'Alex.Willmer', 'miss-islington']
    pr_nums = ['14016', '14055']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue37210'
    versions = ['Python 3.8', 'Python 3.9']

    @tiran
    Copy link
    Member Author

    tiran commented Jun 9, 2019

    In https://twitter.com/moreati/status/1137815804530049028 Alex Willmer wrote:

    In CPython 3.8dev the pure http://pickle.py module now depends on the C _pickle module, but only for one class

    from _pickle import PickleBuffer

    Antoine, would it make sense turn _pickle.PickleBuffer into an optional dependency? The class is only used for pickle 5 as described in PEP-574, https://www.python.org/dev/peps/pep-0574/

    @tiran tiran added 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 9, 2019
    @moreati
    Copy link
    Mannequin

    moreati mannequin commented Jun 9, 2019

    I noticed this because I was experimenting with pickle.py from the 3.8 branch to support protocol 5 in https://github.com/moreati/pikl (and later to https://pypi.org/project/zodbpickle/).

    However I want to make it clear, if CPython maintainers wish to keep the code as it is I will deal with it. Maximizing our convenience, and minimizing your maintenance workload is paramount.

    @moreati
    Copy link
    Mannequin

    moreati mannequin commented Jun 9, 2019

    Arggh, typo. I mean maximizing *your* convenience is paramount.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 9, 2019

    Note that PickleBuffer is really a built-in type (it's defined in Objects/picklebufobject.c). However it's exposed from _pickle.c because it doesn't make sense to expose it in the builtins module.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 9, 2019

    In short I'm not against making PickleBuffer optional but I'm not sure it makes a lot of sense. Would you make memoryview optional because it's written in C? Is anyone annoyed by the fact that something comes from _pickle.c.

    (note that the pure Python Pickler class *does* work with PickleBuffer; so you don't lose any extensability or customizability by using PickleBuffer)

    In any case, I won't do the work myself, so feel free to draft a PR so that we see the magnitude of the changes required.

    @moreati
    Copy link
    Mannequin

    moreati mannequin commented Jun 9, 2019

    Attempting a PR

    @moreati
    Copy link
    Mannequin

    moreati mannequin commented Jun 11, 2019

    I don't think I can do this. My WIP code is in https://github.com/moreati/cpython/pull/new/bpo-37210, and associated make test output is attached.

    Principal blockers

    • _pickle.PickleBuffer.raw() can return a contiguous buffer from either a c_contiguous, or f_contiguous buffer. memoryview() (and hence pickle._PickleBuffer.raw()) requires a c_contiguous buffer.
    • I'm unsure how to allow _PickleBuffer -> bytearray conversion

    The most common cause of test failures is TypeError: cannot pickle 'memoryview' object. I guess Python is seeing a lack of reduce(), and failing back to walking the object attributes. Implementing reduce() and or reduce_ex() might fix this, but seems moot given the other blockers.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 11, 2019

    Right, it is probably not possible to write a pure Python PickleBuffer. There's a reason it's written in C...

    When you said "make PickleBuffer an optional dependency", I thought you intended to have a usable pure Python Pickler, but without support for the PickleBuffer class.

    @moreati
    Copy link
    Mannequin

    moreati mannequin commented Jun 11, 2019

    it is probably not possible to write a pure Python PickleBuffer

    Fair enough

    a usable pure Python Pickler, but without support for the PickleBuffer class.

    That makes sense. However, for third party packages (e.g. zodbpickle, pikl) wanting a pure Python pickle module to use as a basis, I think it would make more sense to use the CPython 3.7 branch as a basis. Adding a pickle._Pickler variant to the 3.8 stdlib that can't do protocol 5, just for third-parties imposes a maintenance burden on the core devs for zero benefit to anyone.

    I propose closing this ticket as WONTFIX.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 11, 2019

    Ok, closing then.

    @pitrou pitrou closed this as completed Jun 11, 2019
    @vstinner
    Copy link
    Member

    FYI pyperformance is affected by this issue:

    vstinner@apu$ env/bin/python ~/prog/python/pyperformance/pyperformance/benchmarks/bm_pickle.py unpickle --pure-python -v -p3  
    Traceback (most recent call last):
      File "/home/vstinner/prog/python/pyperformance/pyperformance/benchmarks/bm_pickle.py", line 287, in <module>
        import pickle
      File "/home/vstinner/prog/python/master/Lib/pickle.py", line 39, in <module>
        from _pickle import PickleBuffer
    ModuleNotFoundError: import of _pickle halted; None in sys.modules

    It is no longer possible to benchmark the pure Python implement of pickle.

    --pure-python uses this code path:

    def is_module_accelerated(module):
        return getattr(pickle.Pickler, '__module__', '<jython>') == 'pickle'

    (...)

            if six.PY3:
                sys.modules['_pickle'] = None
            import pickle
            if not is_module_accelerated(pickle):
                raise RuntimeError("Unexpected C accelerators for pickle")

    Should I remove the benchmark?

    @vstinner
    Copy link
    Member

    I wrote a simple PR, PR 14016, to fix the pure Python implementation of pickle: simply restrict pickle to protocol 4 if PickleBuffer is missing.

    @vstinner vstinner reopened this Jun 12, 2019
    @vstinner
    Copy link
    Member

    New changeset 63ab4ba by Victor Stinner in branch 'master':
    bpo-37210: Fix pure Python pickle when _pickle is unavailable (GH-14016)
    63ab4ba

    @miss-islington
    Copy link
    Contributor

    New changeset cbda40d by Miss Islington (bot) in branch '3.8':
    bpo-37210: Fix pure Python pickle when _pickle is unavailable (GH-14016)
    cbda40d

    @vstinner
    Copy link
    Member

    I pushed my change to 3.8 and master branches. I close the issue.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants