Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make pickle generated by Python 3.x compatible with 2.x and vice-versa. #50387

Closed
mkiever mannequin opened this issue May 28, 2009 · 24 comments
Closed

Make pickle generated by Python 3.x compatible with 2.x and vice-versa. #50387

mkiever mannequin opened this issue May 28, 2009 · 24 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mkiever
Copy link
Mannequin

mkiever mannequin commented May 28, 2009

BPO 6137
Nosy @gvanrossum, @rhettinger, @jcea, @pitrou, @mkiever, @avassalotti
Files
  • compat_pickle.diff
  • compat_pickle2.diff
  • compat_pickle5.diff
  • compat_pickle6.diff
  • compat_pickle7.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2009-06-06.14:22:12.947>
    created_at = <Date 2009-05-28.08:52:35.455>
    labels = ['type-bug', 'library']
    title = 'Make pickle generated by Python 3.x compatible with 2.x and vice-versa.'
    updated_at = <Date 2011-02-21.23:39:35.850>
    user = 'https://github.com/mkiever'

    bugs.python.org fields:

    activity = <Date 2011-02-21.23:39:35.850>
    actor = 'jcea'
    assignee = 'none'
    closed = True
    closed_date = <Date 2009-06-06.14:22:12.947>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2009-05-28.08:52:35.455>
    creator = 'mkiever'
    dependencies = []
    files = ['14124', '14175', '14179', '14184', '14186']
    hgrepos = []
    issue_num = 6137
    keywords = ['patch']
    message_count = 24.0
    messages = ['88470', '88487', '88589', '88602', '88640', '88684', '88826', '88831', '88836', '88838', '88841', '88861', '88863', '88901', '88902', '88921', '89004', '89153', '89154', '89156', '89157', '89158', '89159', '89160']
    nosy_count = 7.0
    nosy_names = ['gvanrossum', 'rhettinger', 'jcea', 'pitrou', 'mkiever', 'alexandre.vassalotti', 'hagen']
    pr_nums = []
    priority = 'critical'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue6137'
    versions = ['Python 3.1']

    @mkiever
    Copy link
    Mannequin Author

    mkiever mannequin commented May 28, 2009

    Hello,

    while porting something to Python 3.1a1
    I found out that Python 3 cannot load most Python 2 pickles
    of any protocol because copy_reg has been renamed to copyreg.

    Found this comment by Skip Montanaro in related issue:
    http://bugs.python.org/issue3799#msg76196

    Could not find an issue opened for this though.
    So I'm opening one.

    Regards,
    Matthias Kievernagel

    @pitrou
    Copy link
    Member

    pitrou commented May 28, 2009

    bpo-3675 is a similar issue, too bad nothing could be done to solve it...

    @avassalotti
    Copy link
    Member

    If I understood correctly, bpo-3675 is about making pickle data generated
    by Python 3 readable by Python 2. However, this issue is about
    compatibility in the other direction—i.e., making Python 2 pickles
    readable by Python 3, which is easier.

    I have a patch that make Unpickler translates the old names to the new
    ones at runtime. The only thing missing in the patch is the unit tests
    since I am not sure this should be tested. I am thinking using the
    approach Collin Winter used for his compatibility tests in bpo-5665.
    However, his approach requires bidirectional pickle compatibility
    between Python 2 and 3, which we don't have.

    @pitrou
    Copy link
    Member

    pitrou commented May 31, 2009

    If I understood correctly, bpo-3675 is about making pickle data generated
    by Python 3 readable by Python 2.

    Only if a protocol <= 2 is specified. Therefore it seems it's "only" a
    matter of translating module names.

    @mkiever
    Copy link
    Mannequin Author

    mkiever mannequin commented Jun 1, 2009

    Applied the patch
    http://bugs.python.org/file14124/compat_pickle.diff
    to rev. 73106.
    Patch applies fine, 'make test' passes
    and it solves my problem.
    (which is far from a complete test case though

    • only 5 small pickles)

    Thanks,
    Matthias Kievernagel

    @pitrou
    Copy link
    Member

    pitrou commented Jun 1, 2009

    Some comments on the patch:

    • I don't understand why you create a static "twotuple" object rather
      than simply using Py_BuildValue("(OO)", ...). Mutating tuples is bad.
    • I don't think you need to call PyDict_Contains() before
      PyDict_GetItem(). The latter will simply return NULL if the key isn't found.
    • You should check whether the item fetched from the dictionary has the
      right datatype (tuple for name_mapping, str for import_mapping). Anyone
      can change (monkeypatch) _pickle_compat and make the interpreter crash.

    @pitrou pitrou added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 1, 2009
    @pitrou
    Copy link
    Member

    pitrou commented Jun 3, 2009

    Unpickling e.g. StringIO objects doesn't seem to work:

    >>> s = pickle.load(open("stringio.pickle", "rb"))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/antoine/py3k/picklecompat-6137/Lib/pickle.py", line 1351,
    in load
        encoding=encoding, errors=errors).load()
    AttributeError: '_io.BytesIO' object has no attribute '__dict__'
    >>>

    @pitrou
    Copy link
    Member

    pitrou commented Jun 3, 2009

    An improved patch with tests.
    It has no tests for fix_imports=False, though.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 3, 2009

    A new patch which also includes reverse mappings, so that protocol <= 2
    pickles created with 3.x can also work under 2.x.
    (that is, it also solves bpo-3675)

    @pitrou
    Copy link
    Member

    pitrou commented Jun 3, 2009

    Updated patch with a couple of documentation and function prototype fixes.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 3, 2009

    Sorry, last patch had a couple of minor issues

    @avassalotti
    Copy link
    Member

    Here is an updated patch based on Antoine's latest patch.

    Summary of changes:

    • Updated docstrings of Pickler and Unpickler in the pickle module.
    • Fixed pickle._Pickler to consider fix_imports only for protocol < 3
    • Made module name remapping in _pickle more robust:
      • added PyUnicode_Check on global_name and module_name;
      • used PyDict_GetItemWithError instead of PyDict_GetItem
    • Changed Py_BuildValue("(OO)", ...) to its faster equivalent
      PyTuple_Pack(2, ...).

    I don't really the idea of remapping names generated by Pickler, since
    it breaks the identity guarantee in save_global(). However, I agree this
    is an example where practicality beats purity. So, I do not oppose to
    the change.

    @avassalotti avassalotti self-assigned this Jun 4, 2009
    @avassalotti avassalotti changed the title Pickle migration: Should pickle map "copy_reg" to "copyreg"? Make pickle generated by Python 3.x compatible with 2.x and vice-versa. Jun 4, 2009
    @hagen
    Copy link
    Mannequin

    hagen mannequin commented Jun 4, 2009

    The latest patch is missing the file "Lib/_compat.pickle.py". (Seems as
    if you forgot to svn add it after patching.)

    @avassalotti
    Copy link
    Member

    Oops. Here is a new patch with _compat_pickle.py.

    @pitrou
    Copy link
    Member

    pitrou commented Jun 4, 2009

    Committed in r73236 in the hope that it gets a bit more testing before
    rc2/final.

    @hagen
    Copy link
    Mannequin

    hagen mannequin commented Jun 5, 2009

    I think it is worth noting that now some Python 3.1 protocol 2 pickles
    can't be read by Python 3.0. We probably don't have to do anything about
    that, but perhaps it should be mentioned somewhere? Docs, release notes?

    @pitrou
    Copy link
    Member

    pitrou commented Jun 6, 2009

    I've added an entry in the what's new file in r73254.

    @pitrou pitrou closed this as completed Jun 6, 2009
    @pitrou
    Copy link
    Member

    pitrou commented Jun 9, 2009

    Guido, Raymond suggested we ask your input on this one, although it's
    obviously a bit late (the patch has been committed).

    @gvanrossum
    Copy link
    Member

    I'm not on IRC. What exactly is your question for me? What is Raymond's
    view?

    @rhettinger
    Copy link
    Contributor

    I vaguely remember you rejecting a proposal along these lines when Brett
    was doing the library renaming.

    The patch (as applied) turns on the renaming automatically when used
    with protocol 2 (i.e. all object names are stored with their 2.x names,
    not their new names). Hagen points out that 3.1 proto 2 pickles can't
    be read by 3.0. I would think that the back-translation should be
    disabled by default or else we're going to live with the 2.x names for a
    very long time.

    I don't care much one way or the other. Just thought you should see it
    before it got released and set in stone. It's not too late for a one
    line change, setting the fix_imports default from True to False.

    FWIW, the argument for leaving the default as True is the theory that
    anyone using protocol 2 is likely doing so to exchange data with 2.x.

    @gvanrossum
    Copy link
    Member

    Ah. How about only doing back-translation when protocol=2 (or lower)
    is explicitly selected?

    I don't much like that 3.0 will be to read pickles written by 3.1 with
    the default protocol (i.e. 3), but I don't mind breaking protocol 2,
    since that's most likely (as you say) intended for Python 2. So the
    default for fix_imports would be None, and the __init__ would check if
    it's None, and then set it to (protocol <= 2).

    @pitrou
    Copy link
    Member

    pitrou commented Jun 9, 2009

    Ah. How about only doing back-translation when protocol=2 (or lower)
    is explicitly selected?

    Well, this is exactly what is implemented!

    I don't much like that 3.0 will be to read pickles written by 3.1 with
    the default protocol (i.e. 3),

    I suppose you meant to say "will be unable", but it isn't, since
    protocol 3 pickles are not affected at all by this patch.

    @gvanrossum
    Copy link
    Member

    On Tue, Jun 9, 2009 at 12:16 PM, Antoine Pitrou<report@bugs.python.org> wrote:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    > Ah. How about only doing back-translation when protocol=2 (or lower)
    > is explicitly selected?

    Well, this is exactly what is implemented!

    Ah, I missed that. Fine then!

    > I don't much like that 3.0 will be to read pickles written by 3.1 with
    > the default protocol (i.e. 3),

    I suppose you meant to say "will be unable",

    Indeed.

    but it isn't, since
    protocol 3 pickles are not affected at all by this patch.

    Good.

    So I'm okay with the status quo -- too bad 3.0 can't read those
    pickles, but that's the deal with abandoning 3.0...

    @rhettinger
    Copy link
    Contributor

    Thanks for taking a look and opining.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants