classification
Title: pickle special methods are looked up on the instance rather than the type
Type: behavior Stage: needs patch
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: benjamin.peterson Nosy List: Arfrever, barry, benjamin.peterson, eric.snow, hodgestar, lukasz.langa, pitrou, python-dev, rhettinger, robagar, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012-10-16 18:24 by eric.snow, last changed 2017-09-12 19:45 by serhiy.storchaka.

Pull Requests
URL Status Linked Edit
PR 3508 open lukasz.langa, 2017-09-12 03:00
Messages (21)
msg173066 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2012-10-16 18:24
In Objects/typeobject.c, reduce_2() makes _PyObject_GetAttrId() calls to pull some methods for the object in question.  This is a problem when the object's type defines __getattr__() or __getattribute__() and returns something like None when the attribute is not found:


from copy import deepcopy

class Spam(dict):
    def __getattr__(self, name):
        # defaults to None
        return self.get(name)

deepcopy(Spam(ham=5))


While we could do a check on the result of _PyObject_GetAttrId() to make sure it is callable, the better solution would be to look up the methods on the object's type rather than on the object.  Doing so falls in line with the specified pattern for special method lookup [1].  I'm guessing there are a few other related places in typeobject.c that have the same problem which could be fixed in the same way.

I'm marking this as a bug rather than a feature since it is a deviation from the specification for special methods.

[1] http://docs.python.org/dev/reference/datamodel.html#special-method-lookup
msg173068 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2012-10-16 18:34
pickle and _pickle also look on the instance for __reduce__ and __reduce_ex__.
msg173083 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-16 19:41
> I'm marking this as a bug rather than a feature since it is a deviation 
> from the specification for special methods.

It may still break some existing code, though (especially for people using proxies).
msg173118 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-10-17 01:00
> It may still break some existing code, though
> (especially for people using proxies).

I agree this is a significant risk.
Before any change gets made, it should
be discussed on python-dev (especially
if you intend on backporting the change).
msg173119 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2012-10-17 01:08
> Before any change gets made, it should be discussed on python-dev
> (especially if you intend on backporting the change).

Agreed.
msg182258 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-02-17 03:47
I've posted to python-dev about this:

http://mail.python.org/pipermail/python-dev/2013-February/124114.html

At the very least the pickle docs deserve some mention of the behavior.  That way people won't need to spend as much time trying to figure out why things aren't working the way they expect.

However, my preference is to fix pickle, though I readily concede that might not be possible.  And just to reiterate, I agree we shouldn't do anything here without a clear picture of the impact.
msg191951 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-06-27 15:53
A backward compatible solution would be to do lookup on the class after trying the instance (and that came back None or isn't callable).
msg191984 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-06-28 09:43
> At the very least the pickle docs deserve some mention of the 
> behavior.  That way people won't need to spend as much time trying to
> figure out why things aren't working the way they expect.

I think you are overreacting. The rule you are talking about (special methods are only looked up on the class) is really known by experts, not so much by common Python programmers. I'd argue that pickle doesn't break normal users' expectations here.
msg192265 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2013-07-04 02:14
> I think you are overreacting.

Yeah, maybe so. :)  I just found the problem I ran into to be really hard to diagnose due to how pickle does lookup, and figured it would be worth addressing to save someone else the same headache later.
msg208167 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2014-01-15 15:08
I ran into this recently while porting nose to 3.4.  I duped http://bugs.python.org/issue20261 to this issue, but note that in http://bugs.python.org/issue20261#msg208109 I notice that the presence or absence of __getnewargs__() regresses the specific behavior in Python 3.4.
msg211329 - (view) Author: Simon Cross (hodgestar) Date: 2014-02-16 16:54
Genshi is affected by the 3.4 regression too (it has a class that defines __getnewargs__ and __getattr__).
msg211331 - (view) Author: Simon Cross (hodgestar) Date: 2014-02-16 17:26
I have an ugly work-around for those affected:

    def __getattr__(self, name):
        # work around for pickle bug in Python 3.4
        # see http://bugs.python.org/issue16251
        if name == "__getnewargs_ex__":
            raise AttributeError("%r has no attribute %r" % (type(self), name))
        ...
msg211338 - (view) Author: Roundup Robot (python-dev) Date: 2014-02-16 18:49
New changeset b328f8ccbccf by Benjamin Peterson in branch 'default':
look up __getnewargs__ and __getnewargs_ex__ on the object type (#16251)
http://hg.python.org/cpython/rev/b328f8ccbccf
msg213799 - (view) Author: Roundup Robot (python-dev) Date: 2014-03-17 06:30
New changeset 2514a577c7cb by Benjamin Peterson in branch '3.4':
look up __getnewargs__ and __getnewargs_ex__ on the object type (#16251)
http://hg.python.org/cpython/rev/2514a577c7cb
msg251340 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-09-22 18:52
Is this issue fixed?
msg251343 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2015-09-22 19:19
I don't think so.

On Tue, Sep 22, 2015, at 11:52, Serhiy Storchaka wrote:
> 
> Serhiy Storchaka added the comment:
> 
> Is this issue fixed?
> 
> ----------
> nosy: +serhiy.storchaka
> status: open -> pending
> 
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue16251>
> _______________________________________
msg301948 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-12 11:05
What the problem tries to solve PR 3508? Swallowing all exceptions looks like an antipattern to me. This puts the problem under the carpet. Rather than failing and allowing the programmer to fix the picleability of its class, this can silently produce incorrect pickle data. By swallowing MemoryError and RecursionError this changes the behavior of objects in an environment with limited resources: lack of memory or calling deep in the calling stack. This adds heisenbugs. An infinity recursion spends CPU time, and in unlucky cases can cause a crash. It is better to detect it earlier than just swallow RecursionError.
msg301962 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2017-09-12 14:09
> What the problem tries to solve PR 3508?

The two test cases added demonstrate what was impossible to pickle before and is now.

> Swallowing all exceptions looks like an antipattern to me.

This is the only thing that we can do faced with custom `__getattr__()` implementations, especially when `copy.deepcopy()` creates new objects with `cls.__new__()`, something that most class implementers won't expect.

> Rather than failing and allowing the programmer to fix the picleability of its class, this can silently produce incorrect pickle data.

Can you give me an example where this would lead to incorrect pickle data?

> By swallowing MemoryError and RecursionError this changes the behavior of objects in an environment with limited resources: lack of memory or calling deep in the calling stack. This adds heisenbugs.

This is what `hasattr()` in Python 2 did.  This is why in Python 2 the `RecursionError` example I added to the tests was actually working just fine.
msg301985 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-12 17:59
> The two test cases added demonstrate what was impossible to pickle before and is now.

These two classes obviously are not pickleable. Pickling or copying them is a programming error. I believe the proper way of making them pickleable and copyable is implementing corresponding special methods.

> Can you give me an example where this would lead to incorrect pickle data?

Any class that needs non-trivial __getstate__(). If pickling is failed now it means that the class is not pickleable. With your patch pickling can be successful, but there is no guarantee that it is correct.

For example, modifying one of your example:

class Proxy:
    def __init__(self, proxied_object):
        self.proxied_object = proxied_object
        self.id = id(proxied_object)
    def __getattr__(self, name):
        return getattr(self.proxied_object, name)

> This is what `hasattr()` in Python 2 did.  This is why in Python 2 the `RecursionError` example I added to the tests was actually working just fine.

hasattr() is broken in Python 2. It was fixed in Python 3. Your patch reintroduces similar bug in copy.
msg301989 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2017-09-12 18:34
> These two classes obviously are not pickleable. Pickling or copying them is a programming error.

The recursion case was successfully deep copying under Python 2. When migrating the code, the sudden recursion error is pretty hard to debug.


> I believe the proper way of making them pickleable and copyable is implementing corresponding special methods.

Right, but during migration from Python 2 to Python 3 the error message is nowhere close suggesting what the correct behavior is.


>> Can you give me an example where this would lead to incorrect pickle data?
> Any class that needs non-trivial __getstate__().

My patch doesn't change anything here. It simply causes special methods to be reported as not present instead of raising exceptions from within __getattr__.


> hasattr() is broken in Python 2. It was fixed in Python 3.

Yes, I agree.


> Your patch reintroduces similar bug in copy.

No, my patch introduces a getattr() equivalent for deepcopy which is robust against random exceptions raised from __getattr__.


------------


Let's take a step back.  My goal with this patch is two-fold:

* make code that worked on Python 2 (even if clowny) not fail on Python 3, adding to the migration burden;

* make a change that is not changing semantics of the current behavior to not break existing code deliberately putting magic methods on instances (hence I'm not switching to the right behavior to test presence of magic methdos on types instead).


Clearly the current workaround I wrote doesn't go in the direction you'd like to see.  Do you have a suggestion of an approach that would be better?  I can implement it.
msg301990 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-09-12 19:45
> The recursion case was successfully deep copying under Python 2. When
> migrating the code, the sudden recursion error is pretty hard to debug.

Python 3 is not compatible with Python 2 because it fixes many bugs that can't 
be fixed without breaking backward compatibility. This is a one of such cases.

The problem with copying is not the only problem with that class. It also has 
problems when the garbage collector breaks reference loops and sets instance 
attributes to None. The idiomatic ways of solving these problems is accessing 
to proxied_object as self.__dict__['proxied_object'] or setting a class 
attribute proxied_object = None. We had fixed a number of similar issues in the 
stdlib.

> Right, but during migration from Python 2 to Python 3 the error message is
> nowhere close suggesting what the correct behavior is.

I think a traceback should help to identify the problem. At least it looks 
pretty clear to me.

> My patch doesn't change anything here. It simply causes special methods to
> be reported as not present instead of raising exceptions from within
> __getattr__.

And this is a regression change. Instead of not producing incorrect data, the 
code now poduces it.

Yet one drawback is a performance regression. Additional checks slow down a 
copying.

> > Your patch reintroduces similar bug in copy.
> No, my patch introduces a getattr() equivalent for deepcopy which is robust
> against random exceptions raised from __getattr__.

The code shouldn't be tolerant to random exceptions raised from __getattr__. 
Exceptions signal about a bug in user code.

> * make code that worked on Python 2 (even if clowny) not fail on Python 3,
> adding to the migration burden;

Raising an exception is a feature of Python 3, not a bug. It helps to catch 
programming errors in user code.

> Clearly the current workaround I wrote doesn't go in the direction you'd
> like to see.  Do you have a suggestion of an approach that would be better?
>  I can implement it.

The current behavior LGTM. I don't think it needs to be improved.
History
Date User Action Args
2017-09-12 19:45:40serhiy.storchakasetmessages: + msg301990
2017-09-12 18:34:17lukasz.langasetmessages: + msg301989
2017-09-12 17:59:22serhiy.storchakasetmessages: + msg301985
2017-09-12 14:10:01lukasz.langasetversions: + Python 3.6, Python 3.7, - Python 3.5
2017-09-12 14:09:52lukasz.langasetnosy: + lukasz.langa

messages: + msg301962
stage: patch review -> needs patch
2017-09-12 11:05:00serhiy.storchakasetmessages: + msg301948
2017-09-12 03:00:58lukasz.langasetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request3503
2017-06-20 18:09:37serhiy.storchakalinkissue30702 superseder
2015-12-02 12:58:08robagarsetnosy: + robagar
2015-09-22 19:19:01benjamin.petersonsetstatus: pending -> open

messages: + msg251343
2015-09-22 18:52:22serhiy.storchakasetstatus: open -> pending
nosy: + serhiy.storchaka
messages: + msg251340

2014-03-17 06:30:44python-devsetmessages: + msg213799
2014-02-16 18:51:31benjamin.petersonsetassignee: eric.snow -> benjamin.peterson
versions: + Python 3.5, - Python 2.7, Python 3.2, Python 3.3, Python 3.4
2014-02-16 18:50:22benjamin.petersonunlinkissue20261 superseder
2014-02-16 18:49:29python-devsetnosy: + python-dev
messages: + msg211338
2014-02-16 17:26:45hodgestarsetmessages: + msg211331
2014-02-16 16:54:34hodgestarsetnosy: + hodgestar
messages: + msg211329
2014-01-15 15:08:05barrysetmessages: + msg208167
2014-01-15 15:06:37barrylinkissue20261 superseder
2014-01-15 15:00:59barrysetnosy: + barry
2014-01-15 06:22:42Arfreversetnosy: + Arfrever
2013-10-25 02:39:12eric.snowlinkissue19364 superseder
2013-07-04 02:14:23eric.snowsetmessages: + msg192265
2013-06-28 09:43:11pitrousetmessages: + msg191984
2013-06-27 15:53:54eric.snowsetmessages: + msg191951
2013-06-25 05:27:24eric.snowsetassignee: eric.snow
2013-02-17 03:47:46eric.snowsetmessages: + msg182258
2012-10-17 01:08:49eric.snowsetmessages: + msg173119
2012-10-17 01:00:26rhettingersetnosy: + rhettinger
messages: + msg173118
2012-10-16 19:41:36pitrousetnosy: + pitrou
messages: + msg173083
2012-10-16 19:19:01benjamin.petersonsettitle: some special methods are looked up on the instance rather than the type -> pickle special methods are looked up on the instance rather than the type
2012-10-16 18:38:08eric.snowsettitle: object.__reduce__() -> some special methods are looked up on the instance rather than the type
2012-10-16 18:34:31benjamin.petersonsetnosy: + benjamin.peterson
messages: + msg173068
2012-10-16 18:24:53eric.snowcreate