This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: unpickling a subclass of list fails when it implements its own extend method
Type: behavior Stage: needs patch
Components: Extension Modules, Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: gregory.p.smith, richardlev, serhiy.storchaka
Priority: normal Keywords: 3.7regression

Created on 2021-04-27 00:33 by gregory.p.smith, last changed 2022-04-11 14:59 by admin.

Messages (4)
msg392008 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-04-27 00:33
The changes from https://bugs.python.org/issue29368 are causing a subclass of list trouble:

```
class FieldList(list):
    ...
    def extend(...): ...
```

FieldList has its own extend and append methods that implement additional checks.  As it is a list subclass, the new `PyList_CheckExact()` from the afformentioned issue's https://github.com/python/cpython/commit/f89fdc29937139b55dd68587759cadb8468d0190 where it used to be a `PyList_Check()` in 3.6 and earlier is causing the unpickling code to call the instance `.extend()` method instead of internally using `PyList_SetSlice()` at the C level.

Calling .extend() naturally fails at this point as __setstate__ hasn't yet been called so the FieldList instance is uninitialized.

Here's the code in question https://github.com/google/protorpc/blob/master/protorpc/messages.py#L1126

It used it work.  3.7 broke it.  What was unpicklable is now not.

To work around this logic would be needed in the extend (and append) methods to check if they're being called on an uninitialized instance.  That seems unreasonable.

_[credit to my colleague Richard L. for the diagnosis]_
msg392022 - (view) Author: Richard Levasseur (richardlev) * Date: 2021-04-27 04:00
Here's a self-contained repro:


```
import pickle

class MyList(list):
  def __init__(self, required, values):
    self.required = required
    super().__init__(values)

  def __getstate__(self):
    return self.required

  def __setstate__(self, state):
    self.required = state

  def extend(self, values):
    assert self.required
    super().extend(values)

mylist = MyList('foo', [1, 2])
pickled = pickle.dumps(mylist)
unpickled = pickle.loads(pickled)

print(mylist)

```

The above will raise an AttributeError when self.required is accessed in extend(). 

Oddly, defining a `__reduce__()` function that simply calls and returns `super().__reduce__()` seems to restore the previous behavior and things work again.
msg392027 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2021-04-27 05:29
From PEP 307:

    listitems    Optional, and new in this PEP.
                 If this is not None, it should be an iterator (not a
                 sequence!) yielding successive list items.  These list
                 items will be pickled, and appended to the object using
                 either obj.append(item) or obj.extend(list_of_items).
                 This is primarily used for list subclasses, but may
                 be used by other classes as long as they have append()
                 and extend() methods with the appropriate signature.
                 (Whether append() or extend() is used depends on which
                 pickle protocol version is used as well as the number
                 of items to append, so both must be supported.)

It says that obj.extend(list_of_items) should be called, not list.extend(obj, list_of_items). Changing it now can break classes which rely on overridden extend(). The special check in the C code is only for optimization, we can inline extend() if it was not overridden.

Unfortunately __setstate__() is called after extend(), so there is no way to set the validator in FieldList before it will be used in extend().

As a clever hack (if you do not want to do runtime check in every call of extend()) you can set

    self.extend = lambda items: list.extend(self, items)

in the __new__ method and add

    del self.extend

in __init__ and __setstate__.
msg392100 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2021-04-27 16:05
Thanks, that makes sense.  And it at least less ugly than the __new__ hack I was thinking of as a workaround.  though similar in spirit.

presumably self.append should get the same treatment for completeness given the PEP 307 text.

I didn't notice a place in the Python pickle docs that mentioned this specific thing about listitems and the need for append/extend.  I didn't do a thorough read though, maybe I overlooked something?  If not, it'd be worth figuring out how to get this uninitialized class calling of append/extend/__setitem__ for list and dict objects detailed in the main docs rather than off in the PEP.

It's semi-mentioned https://docs.python.org/3/library/pickle.html#object.__reduce__ here which is what the PEP added I suppose, but given this code has no custom __reduce__, we need to explicitly mention that list and dict subclasses supporting pickling/copying may need to be prepared to handle this situation.  With a versionchanged:: 3.7 noting that it now always applies to list subclasses without their own __reduce__.
History
Date User Action Args
2022-04-11 14:59:44adminsetgithub: 88112
2021-04-27 16:05:29gregory.p.smithsetmessages: + msg392100
2021-04-27 05:29:40serhiy.storchakasetmessages: + msg392027
2021-04-27 04:00:15richardlevsetnosy: + richardlev
messages: + msg392022
2021-04-27 00:33:02gregory.p.smithcreate