Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pickle special methods are looked up on the instance rather than the type #60455

Open
ericsnowcurrently opened this issue Oct 16, 2012 · 22 comments
Assignees
Labels
3.11 only security fixes 3.12 bugs and security fixes 3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

BPO 16251
Nosy @warsaw, @rhettinger, @pitrou, @benjaminp, @ambv, @ericsnowcurrently, @serhiy-storchaka
PRs
  • bpo-16251: pickle special methods and custom __getattr__ #3508
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/benjaminp'
    closed_at = None
    created_at = <Date 2012-10-16.18:24:53.984>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = 'pickle special methods are looked up on the instance rather than the type'
    updated_at = <Date 2018-02-13.16:52:20.309>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2018-02-13.16:52:20.309>
    actor = 'lukasz.langa'
    assignee = 'benjamin.peterson'
    closed = False
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2012-10-16.18:24:53.984>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 16251
    keywords = ['patch']
    message_count = 22.0
    messages = ['173066', '173068', '173083', '173118', '173119', '182258', '191951', '191984', '192265', '208167', '211329', '211331', '211338', '213799', '251340', '251343', '301948', '301962', '301985', '301989', '301990', '312132']
    nosy_count = 11.0
    nosy_names = ['barry', 'rhettinger', 'pitrou', 'benjamin.peterson', 'hodgestar', 'Arfrever', 'lukasz.langa', 'python-dev', 'eric.snow', 'serhiy.storchaka', 'robagar']
    pr_nums = ['3508']
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue16251'
    versions = ['Python 3.6', 'Python 3.7']

    @ericsnowcurrently
    Copy link
    Member Author

    In Objects/typeobject.c, reduce_2() makes _PyObject_GetAttrId() calls to pull some methods for the object in question. This is a problem when the object's type defines __getattr__() or __getattribute__() and returns something like None when the attribute is not found:

    from copy import deepcopy
    
    class Spam(dict):
        def __getattr__(self, name):
            # defaults to None
            return self.get(name)
    
    deepcopy(Spam(ham=5))

    While we could do a check on the result of _PyObject_GetAttrId() to make sure it is callable, the better solution would be to look up the methods on the object's type rather than on the object. Doing so falls in line with the specified pattern for special method lookup [1]. I'm guessing there are a few other related places in typeobject.c that have the same problem which could be fixed in the same way.

    I'm marking this as a bug rather than a feature since it is a deviation from the specification for special methods.

    [1] http://docs.python.org/dev/reference/datamodel.html#special-method-lookup

    @ericsnowcurrently ericsnowcurrently added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Oct 16, 2012
    @benjaminp
    Copy link
    Contributor

    pickle and _pickle also look on the instance for __reduce__ and __reduce_ex__.

    @ericsnowcurrently ericsnowcurrently changed the title object.__reduce__() some special methods are looked up on the instance rather than the type Oct 16, 2012
    @benjaminp benjaminp changed the title some special methods are looked up on the instance rather than the type pickle special methods are looked up on the instance rather than the type Oct 16, 2012
    @pitrou
    Copy link
    Member

    pitrou commented Oct 16, 2012

    I'm marking this as a bug rather than a feature since it is a deviation
    from the specification for special methods.

    It may still break some existing code, though (especially for people using proxies).

    @rhettinger
    Copy link
    Contributor

    It may still break some existing code, though
    (especially for people using proxies).

    I agree this is a significant risk.
    Before any change gets made, it should
    be discussed on python-dev (especially
    if you intend on backporting the change).

    @ericsnowcurrently
    Copy link
    Member Author

    Before any change gets made, it should be discussed on python-dev
    (especially if you intend on backporting the change).

    Agreed.

    @ericsnowcurrently
    Copy link
    Member Author

    I've posted to python-dev about this:

    http://mail.python.org/pipermail/python-dev/2013-February/124114.html

    At the very least the pickle docs deserve some mention of the behavior. That way people won't need to spend as much time trying to figure out why things aren't working the way they expect.

    However, my preference is to fix pickle, though I readily concede that might not be possible. And just to reiterate, I agree we shouldn't do anything here without a clear picture of the impact.

    @ericsnowcurrently ericsnowcurrently self-assigned this Jun 25, 2013
    @ericsnowcurrently
    Copy link
    Member Author

    A backward compatible solution would be to do lookup on the class after trying the instance (and that came back None or isn't callable).

    @pitrou
    Copy link
    Member

    pitrou commented Jun 28, 2013

    At the very least the pickle docs deserve some mention of the
    behavior. That way people won't need to spend as much time trying to
    figure out why things aren't working the way they expect.

    I think you are overreacting. The rule you are talking about (special methods are only looked up on the class) is really known by experts, not so much by common Python programmers. I'd argue that pickle doesn't break normal users' expectations here.

    @ericsnowcurrently
    Copy link
    Member Author

    I think you are overreacting.

    Yeah, maybe so. :) I just found the problem I ran into to be really hard to diagnose due to how pickle does lookup, and figured it would be worth addressing to save someone else the same headache later.

    @warsaw
    Copy link
    Member

    warsaw commented Jan 15, 2014

    I ran into this recently while porting nose to 3.4. I duped http://bugs.python.org/issue20261 to this issue, but note that in http://bugs.python.org/issue20261#msg208109 I notice that the presence or absence of __getnewargs__() regresses the specific behavior in Python 3.4.

    @hodgestar
    Copy link
    Mannequin

    hodgestar mannequin commented Feb 16, 2014

    Genshi is affected by the 3.4 regression too (it has a class that defines __getnewargs__ and __getattr__).

    @hodgestar
    Copy link
    Mannequin

    hodgestar mannequin commented Feb 16, 2014

    I have an ugly work-around for those affected:

        def __getattr__(self, name):
            # work around for pickle bug in Python 3.4
            # see http://bugs.python.org/issue16251
            if name == "__getnewargs_ex__":
                raise AttributeError("%r has no attribute %r" % (type(self), name))
            ...

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 16, 2014

    New changeset b328f8ccbccf by Benjamin Peterson in branch 'default':
    look up __getnewargs__ and __getnewargs_ex__ on the object type (bpo-16251)
    http://hg.python.org/cpython/rev/b328f8ccbccf

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 17, 2014

    New changeset 2514a577c7cb by Benjamin Peterson in branch '3.4':
    look up __getnewargs__ and __getnewargs_ex__ on the object type (bpo-16251)
    http://hg.python.org/cpython/rev/2514a577c7cb

    @serhiy-storchaka
    Copy link
    Member

    Is this issue fixed?

    @benjaminp
    Copy link
    Contributor

    I don't think so.

    On Tue, Sep 22, 2015, at 11:52, Serhiy Storchaka wrote:

    Serhiy Storchaka added the comment:

    Is this issue fixed?

    ----------
    nosy: +serhiy.storchaka
    status: open -> pending


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue16251\>


    @serhiy-storchaka
    Copy link
    Member

    What the problem tries to solve PR 3508? Swallowing all exceptions looks like an antipattern to me. This puts the problem under the carpet. Rather than failing and allowing the programmer to fix the picleability of its class, this can silently produce incorrect pickle data. By swallowing MemoryError and RecursionError this changes the behavior of objects in an environment with limited resources: lack of memory or calling deep in the calling stack. This adds heisenbugs. An infinity recursion spends CPU time, and in unlucky cases can cause a crash. It is better to detect it earlier than just swallow RecursionError.

    @ambv
    Copy link
    Contributor

    ambv commented Sep 12, 2017

    What the problem tries to solve PR 3508?

    The two test cases added demonstrate what was impossible to pickle before and is now.

    Swallowing all exceptions looks like an antipattern to me.

    This is the only thing that we can do faced with custom __getattr__() implementations, especially when copy.deepcopy() creates new objects with cls.__new__(), something that most class implementers won't expect.

    Rather than failing and allowing the programmer to fix the picleability of its class, this can silently produce incorrect pickle data.

    Can you give me an example where this would lead to incorrect pickle data?

    By swallowing MemoryError and RecursionError this changes the behavior of objects in an environment with limited resources: lack of memory or calling deep in the calling stack. This adds heisenbugs.

    This is what hasattr() in Python 2 did. This is why in Python 2 the RecursionError example I added to the tests was actually working just fine.

    @ambv ambv added the 3.7 (EOL) end of life label Sep 12, 2017
    @serhiy-storchaka
    Copy link
    Member

    The two test cases added demonstrate what was impossible to pickle before and is now.

    These two classes obviously are not pickleable. Pickling or copying them is a programming error. I believe the proper way of making them pickleable and copyable is implementing corresponding special methods.

    Can you give me an example where this would lead to incorrect pickle data?

    Any class that needs non-trivial __getstate__(). If pickling is failed now it means that the class is not pickleable. With your patch pickling can be successful, but there is no guarantee that it is correct.

    For example, modifying one of your example:

    class Proxy:
        def __init__(self, proxied_object):
            self.proxied_object = proxied_object
            self.id = id(proxied_object)
        def __getattr__(self, name):
            return getattr(self.proxied_object, name)

    This is what hasattr() in Python 2 did. This is why in Python 2 the RecursionError example I added to the tests was actually working just fine.

    hasattr() is broken in Python 2. It was fixed in Python 3. Your patch reintroduces similar bug in copy.

    @ambv
    Copy link
    Contributor

    ambv commented Sep 12, 2017

    These two classes obviously are not pickleable. Pickling or copying them is a programming error.

    The recursion case was successfully deep copying under Python 2. When migrating the code, the sudden recursion error is pretty hard to debug.

    I believe the proper way of making them pickleable and copyable is implementing corresponding special methods.

    Right, but during migration from Python 2 to Python 3 the error message is nowhere close suggesting what the correct behavior is.

    > Can you give me an example where this would lead to incorrect pickle data?
    Any class that needs non-trivial __getstate__().

    My patch doesn't change anything here. It simply causes special methods to be reported as not present instead of raising exceptions from within __getattr__.

    hasattr() is broken in Python 2. It was fixed in Python 3.

    Yes, I agree.

    Your patch reintroduces similar bug in copy.

    No, my patch introduces a getattr() equivalent for deepcopy which is robust against random exceptions raised from __getattr__.

    ------------

    Let's take a step back. My goal with this patch is two-fold:

    • make code that worked on Python 2 (even if clowny) not fail on Python 3, adding to the migration burden;

    • make a change that is not changing semantics of the current behavior to not break existing code deliberately putting magic methods on instances (hence I'm not switching to the right behavior to test presence of magic methdos on types instead).

    Clearly the current workaround I wrote doesn't go in the direction you'd like to see. Do you have a suggestion of an approach that would be better? I can implement it.

    @serhiy-storchaka
    Copy link
    Member

    The recursion case was successfully deep copying under Python 2. When
    migrating the code, the sudden recursion error is pretty hard to debug.

    Python 3 is not compatible with Python 2 because it fixes many bugs that can't
    be fixed without breaking backward compatibility. This is a one of such cases.

    The problem with copying is not the only problem with that class. It also has
    problems when the garbage collector breaks reference loops and sets instance
    attributes to None. The idiomatic ways of solving these problems is accessing
    to proxied_object as self.__dict__['proxied_object'] or setting a class
    attribute proxied_object = None. We had fixed a number of similar issues in the
    stdlib.

    Right, but during migration from Python 2 to Python 3 the error message is
    nowhere close suggesting what the correct behavior is.

    I think a traceback should help to identify the problem. At least it looks
    pretty clear to me.

    My patch doesn't change anything here. It simply causes special methods to
    be reported as not present instead of raising exceptions from within
    __getattr__.

    And this is a regression change. Instead of not producing incorrect data, the
    code now poduces it.

    Yet one drawback is a performance regression. Additional checks slow down a
    copying.

    > Your patch reintroduces similar bug in copy.
    No, my patch introduces a getattr() equivalent for deepcopy which is robust
    against random exceptions raised from __getattr__.

    The code shouldn't be tolerant to random exceptions raised from __getattr__.
    Exceptions signal about a bug in user code.

    • make code that worked on Python 2 (even if clowny) not fail on Python 3,
      adding to the migration burden;

    Raising an exception is a feature of Python 3, not a bug. It helps to catch
    programming errors in user code.

    Clearly the current workaround I wrote doesn't go in the direction you'd
    like to see. Do you have a suggestion of an approach that would be better?
    I can implement it.

    The current behavior LGTM. I don't think it needs to be improved.

    @ambv
    Copy link
    Contributor

    ambv commented Feb 13, 2018

    FWIW I withdrew my PR.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added 3.11 only security fixes 3.12 bugs and security fixes 3.13 bugs and security fixes and removed 3.7 (EOL) end of life labels Oct 28, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes 3.12 bugs and security fixes 3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    Status: No status
    Development

    No branches or pull requests

    8 participants