pickle special methods are looked up on the instance rather than the type #60455
In Objects/typeobject.c, reduce_2() makes _PyObject_GetAttrId() calls to pull some methods for the object in question. This is a problem when the object's type defines __getattr__() or __getattribute__() and returns something like None when the attribute is not found:

```python
from copy import deepcopy

class Spam(dict):
    def __getattr__(self, name):
        # defaults to None
        return self.get(name)

deepcopy(Spam(ham=5))
```

While we could do a check on the result of _PyObject_GetAttrId() to make sure it is callable, the better solution would be to look up the methods on the object's type rather than on the object. Doing so falls in line with the specified pattern for special method lookup [1]. I'm guessing there are a few other related places in typeobject.c that have the same problem which could be fixed in the same way. I'm marking this as a bug rather than a feature since it is a deviation from the specification for special methods.

[1] http://docs.python.org/dev/reference/datamodel.html#special-method-lookup |
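For contrast, here is a minimal standalone sketch of the lookup rule from [1]: implicit invocation of a special method goes through the type, so shadowing the method on the instance has no effect on the builtin that invokes it. (This example is illustrative, not code from the issue.)

```python
class C:
    def __len__(self):
        return 1

c = C()
c.__len__ = lambda: 2   # shadow the special method on the instance

print(len(c))           # 1 -- len() looks up __len__ on type(c), not on c
print(c.__len__())      # 2 -- explicit attribute access finds the instance one
```

pickle's reduce_2() instead does the equivalent of the explicit attribute access, which is why a permissive `__getattr__` can feed it a non-callable value.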
pickle and _pickle also look on the instance for __reduce__ and __reduce_ex__. |
It may still break some existing code, though (especially for people using proxies). |
I agree this is a significant risk. |
Agreed. |
I've posted to python-dev about this: http://mail.python.org/pipermail/python-dev/2013-February/124114.html At the very least the pickle docs deserve some mention of the behavior. That way people won't need to spend as much time trying to figure out why things aren't working the way they expect. However, my preference is to fix pickle, though I readily concede that might not be possible. And just to reiterate, I agree we shouldn't do anything here without a clear picture of the impact. |
A backward-compatible solution would be to do the lookup on the class after trying the instance (when the instance lookup returned None or something that isn't callable). |
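A rough sketch of that fallback idea (the helper name `lookup_special` is made up here, not code from any patch):

```python
class Spam(dict):
    def __getattr__(self, name):
        return self.get(name)   # permissive: missing attributes become None

def lookup_special(obj, name):
    # Try the instance first, for backward compatibility.
    value = getattr(obj, name, None)
    if callable(value):
        return value
    # Fall back to the type when the instance lookup produced None or
    # something non-callable (e.g. because of a permissive __getattr__).
    value = getattr(type(obj), name, None)
    if callable(value):
        return value.__get__(obj, type(obj))   # bind to the instance
    return None

spam = Spam(ham=5)
print(lookup_special(spam, "__deepcopy__"))    # None, instead of a bogus value
```

With this, pickle/copy would see "method absent" rather than trying to call the None that `__getattr__` returned.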
I think you are overreacting. The rule you are talking about (special methods are only looked up on the class) is really known to experts, not so much to ordinary Python programmers. I'd argue that pickle doesn't break normal users' expectations here. |
Yeah, maybe so. :) I just found the problem I ran into to be really hard to diagnose due to how pickle does lookup, and figured it would be worth addressing to save someone else the same headache later. |
I ran into this recently while porting nose to 3.4. I duped http://bugs.python.org/issue20261 to this issue, but note that in http://bugs.python.org/issue20261#msg208109 I notice that the presence or absence of __getnewargs__() regresses the specific behavior in Python 3.4. |
Genshi is affected by the 3.4 regression too (it has a class that defines __getnewargs__ and __getattr__). |
I have an ugly work-around for those affected:

```python
def __getattr__(self, name):
    # work around for pickle bug in Python 3.4
    # see http://bugs.python.org/issue16251
    if name == "__getnewargs_ex__":
        raise AttributeError("%r has no attribute %r" % (type(self), name))
    ...
```
|
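Filled out into a runnable class (the trailing `...` above stood for the rest of the method; the `self.get` fallback here mirrors the Spam example from the opening comment and is an assumption, not part of the original snippet):

```python
class Spam(dict):
    def __getattr__(self, name):
        # work around for pickle bug in Python 3.4
        # (protocol 2+ probes __getnewargs_ex__ via getattr on the instance)
        # see http://bugs.python.org/issue16251
        if name == "__getnewargs_ex__":
            raise AttributeError("%r has no attribute %r" % (type(self), name))
        return self.get(name)

s = Spam(ham=5)
print(s.ham)   # 5 -- ordinary missing attributes still fall back to dict lookup
```

The guard makes the probe for `__getnewargs_ex__` raise AttributeError, which is what pickle's lookup machinery expects for an absent method.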
New changeset b328f8ccbccf by Benjamin Peterson in branch 'default': |
New changeset 2514a577c7cb by Benjamin Peterson in branch '3.4': |
Is this issue fixed? |
I don't think so. On Tue, Sep 22, 2015, at 11:52, Serhiy Storchaka wrote:
|
What problem does PR 3508 try to solve? Swallowing all exceptions looks like an antipattern to me. It sweeps the problem under the carpet. Rather than failing and allowing the programmer to fix the pickleability of their class, this can silently produce incorrect pickle data. By swallowing MemoryError and RecursionError this changes the behavior of objects in an environment with limited resources: low memory or a deep call stack. This adds heisenbugs. An infinite recursion spends CPU time, and in unlucky cases can cause a crash. It is better to detect it early than to just swallow RecursionError. |
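As an illustration of the heisenbug concern (a made-up buggy class, not code from the PR): a typo in `__getattr__` that recurses forever surfaces as RecursionError, and swallowing that exception during copy would hide the typo instead of reporting it.

```python
class BrokenProxy:
    def __init__(self, proxied_object):
        self.proxied = proxied_object

    def __getattr__(self, name):
        # typo: should delegate to self.proxied, so this recurses forever
        return getattr(self, name)

try:
    BrokenProxy(object()).missing
    outcome = "no error"
except RecursionError:
    outcome = "RecursionError"

print(outcome)   # RecursionError -- the bug is reported, not masked
```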
The two test cases added demonstrate what was impossible to pickle before and is possible now.
This is the only thing that we can do faced with custom
Can you give me an example where this would lead to incorrect pickle data?
This is what |
These two classes obviously are not pickleable. Pickling or copying them is a programming error. I believe the proper way of making them pickleable and copyable is implementing corresponding special methods.
Any class that needs a non-trivial __getstate__(). If pickling fails now, it means that the class is not pickleable. With your patch pickling can succeed, but there is no guarantee that the result is correct. For example, modifying one of your examples:

```python
class Proxy:
    def __init__(self, proxied_object):
        self.proxied_object = proxied_object
        self.id = id(proxied_object)

    def __getattr__(self, name):
        return getattr(self.proxied_object, name)
```

hasattr() was broken in Python 2. It was fixed in Python 3. Your patch reintroduces a similar bug in copy. |
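The hasattr() analogy in code (the Python 2 vs Python 3 difference is documented behavior; the class here is made up for illustration):

```python
class Loud:
    @property
    def attr(self):
        raise ValueError("boom")

# Python 2's hasattr() swallowed every exception and returned False here,
# masking the bug. Python 3's hasattr() only swallows AttributeError and
# lets anything else propagate to the caller.
try:
    hasattr(Loud(), "attr")
    outcome = "swallowed"
except ValueError:
    outcome = "propagated"

print(outcome)   # propagated
```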
The recursion case deep-copied successfully under Python 2. When migrating the code, the sudden recursion error is pretty hard to debug.
Right, but during the migration from Python 2 to Python 3 the error message comes nowhere close to suggesting what the correct behavior is.
My patch doesn't change anything here. It simply causes special methods to be reported as not present instead of raising exceptions from within __getattr__.
Yes, I agree.
No, my patch introduces a getattr() equivalent for deepcopy which is robust against random exceptions raised from __getattr__.

------------

Let's take a step back. My goal with this patch is two-fold:
Clearly the current workaround I wrote doesn't go in the direction you'd like to see. Do you have a suggestion of an approach that would be better? I can implement it. |
Python 3 is not compatible with Python 2 because it fixes many bugs that can't

The problem with copying is not the only problem with that class. It also has
I think a traceback should help to identify the problem. At least it looks
And this is a regressive change. Instead of not producing incorrect data, the

One more drawback is a performance regression. Additional checks slow down a
The code shouldn't be tolerant of random exceptions raised from __getattr__.
Raising an exception is a feature of Python 3, not a bug. It helps to catch
The current behavior LGTM. I don't think it needs to be improved. |
FWIW I withdrew my PR. |