classification
Title: bytes.__getnewargs__ is broken; copy.copy() therefore doesn't work on bytes, and bytes subclasses can't be pickled by default
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: Nosy List: alexandre.vassalotti, r.david.murray, sh
Priority: high Keywords: easy, patch

Created on 2009-11-23 15:17 by sh, last changed 2010-01-12 01:36 by alexandre.vassalotti. This issue is now closed.

Files
File name Uploaded Description
pickle_bytes_subclass.py sh, 2009-11-23 15:17 Problem example: Python3.[01] will throw a TypeError when unpickling a pickled instance of a trivial bytes subclass
fix_bytes_reduce.diff alexandre.vassalotti, 2009-11-24 17:42
Messages (4)
msg95632 - Author: Sebastian Hagen (sh) Date: 2009-11-23 15:17
In both Python 3.0 and 3.1, bytes instances cannot be copied, and (even
trivial) bytes subclasses cannot be unpickled unless they explicitly
override __getnewargs__() or __reduce_ex__().

Copy problem:
>>> import copy; copy.copy(b'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.1/copy.py", line 96, in copy
    return _reconstruct(x, rv, 0)
  File "/usr/lib/python3.1/copy.py", line 280, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python3.1/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: string argument without an encoding

Bytes subclass unpickle problem:
>>> class B(bytes):
...  pass
...
>>> import pickle; pickle.loads(pickle.dumps(B(b'foo')))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.1/pickle.py", line 1373, in loads
    encoding=encoding, errors=errors).load()
TypeError: string argument without an encoding


AFAICT, the problem is that bytes.__getnewargs__() returns a tuple with
a single argument - a string - and bytes.__new__() refuses to
reconstruct the instance when called in that manner. That is,
"bytes.__new__(bytes, *b'foo'.__getnewargs__())" fails with a TypeError.

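The failing reconstruction step can be sketched in isolation. This is a minimal illustration, not the library's code: rebuild() is a hypothetical helper that mimics what copyreg.__newobj__ does in the traceback above, and the ("foo",) tuple stands in for what the broken __getnewargs__() is described as returning.

```python
def rebuild(cls, newargs):
    """Hypothetical helper: mimic copyreg.__newobj__ by calling cls.__new__."""
    return cls.__new__(cls, *newargs)

# Assumption for illustration: the broken __getnewargs__() yields a str.
broken_args = ("foo",)
try:
    rebuild(bytes, broken_args)
    broken_error = None
except TypeError as exc:
    # bytes.__new__ rejects a str that comes without an encoding.
    broken_error = str(exc)

# A bytes argument is accepted as-is, so reconstruction succeeds.
fixed_result = rebuild(bytes, (b"foo",))

print(broken_error)
print(fixed_result)
```

The only difference between the two calls is the type of the single saved argument, which is why the error surfaces at load or copy time rather than when the arguments are produced.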
This does not cause a problem for pickling bytes instances (as opposed
to instances of a subclass of bytes), because both the Python and C
versions of pickle shipped with Python 3.[01] have built-in magic
(_Pickler.save_bytes() and save_bytes(), respectively) to deal with
bytes instances, and therefore never call their __getnewargs__().
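
The difference between the two paths can be observed on a current CPython, where the fix is already in place. The sketch below is only an illustration: opcode_names() is a hypothetical helper, protocol 3 is chosen because it has dedicated bytes opcodes, and the subclass is inspected through __reduce_ex__() rather than pickled.

```python
import copyreg
import pickle
import pickletools

class B(bytes):
    """Trivial bytes subclass, as in the report."""

def opcode_names(payload):
    """Hypothetical helper: list the opcode names used in a pickle."""
    return [op.name for op, arg, pos in pickletools.genops(payload)]

# Plain bytes are written with a dedicated opcode; no reduce step is involved.
plain_ops = opcode_names(pickle.dumps(b"foo", protocol=3))

# A subclass instance falls back to the generic object protocol, which
# asks __getnewargs__() for the arguments to pass to __new__.
reduced = B(b"foo").__reduce_ex__(3)

print(plain_ops)
print(reduced[0] is copyreg.__newobj__, reduced[1])
```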

The pickle case, in particular, is highly irritating; the error message
doesn't indicate which object is causing the problem, and until you
actually try to load the pickle, there's nothing to indicate that
there's anything problematic about pickling an instance of a subclass of
bytes.
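
The __getnewargs__() override mentioned at the top of this report could look roughly like the sketch below. PatchedBytes is a hypothetical name, and copy.copy() is used for the check because it goes through the same __reduce_ex__() and __new__ reconstruction shown in the copy traceback.

```python
import copy

class PatchedBytes(bytes):
    """Hypothetical bytes subclass carrying the workaround."""

    def __getnewargs__(self):
        # Hand back a real bytes object so bytes.__new__ needs no encoding.
        return (bytes(self),)

original = PatchedBytes(b"foo")
duplicate = copy.copy(original)

print(type(duplicate).__name__, duplicate)
```

On an interpreter that already contains the fix the override is redundant but harmless.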
msg95634 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2009-11-23 15:56
Confirmed on py3k trunk.  We no longer do bug fixes in 3.0, which is why
I'm removing it from versions.
msg95690 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2009-11-24 17:42
We just need to make __getnewargs__ return bytes, instead of a unicode
string. So this is a single character fix.

I think we should reuse the ByteArraySubclass test case in test_bytes.py
to test for this bug. Incidentally, the reduce method of bytearray
should also be changed to emit bytes instead of a unicode string.
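
A rough sketch of the kind of check such a test could make, assuming an interpreter with the fix applied. The two classes below are local stand-ins defined only for this example, not the ones in test_bytes.py, and only copy.copy() is exercised.

```python
import copy

class BytesSubclass(bytes):
    """Stand-in subclass for this sketch only."""

class ByteArraySubclass(bytearray):
    """Stand-in subclass for this sketch only."""

# With the fix, the saved constructor argument is bytes rather than str.
newargs = b"foo".__getnewargs__()

# Both subclasses should survive a shallow copy with their type intact.
copies = [copy.copy(cls(b"abcd")) for cls in (BytesSubclass, ByteArraySubclass)]

print(newargs)
print([type(c).__name__ for c in copies])
```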
msg97617 - Author: Alexandre Vassalotti (alexandre.vassalotti) * (Python committer) Date: 2010-01-12 01:23
Committed in r77437. Thanks!
History
Date User Action Args
2010-01-12 01:36:19 alexandre.vassalotti set status: open -> closed
resolution: accepted
stage: test needed -> resolved
2010-01-12 01:23:34 alexandre.vassalotti set messages: + msg97617
2009-11-24 17:42:49 alexandre.vassalotti set files: + fix_bytes_reduce.diff
keywords: + patch
messages: + msg95690
2009-11-24 12:50:03 pitrou set nosy: + alexandre.vassalotti
2009-11-23 15:56:36 r.david.murray set priority: high
versions: + Python 3.2, - Python 3.0
keywords: + easy
nosy: + r.david.murray
messages: + msg95634
stage: test needed
2009-11-23 15:17:34 sh create