classification
Title: bytes() should respect __bytes__
Type: enhancement
Stage:
Components: Interpreter Core
Versions: Python 3.0
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: barry
Nosy List: barry, benjamin.peterson, christian.heimes, pitrou
Priority: release blocker
Keywords: needs review, patch

Created on 2008-03-19 03:04 by barry, last changed 2008-08-26 17:08 by benjamin.peterson. This issue is now closed.

Files
File name  Uploaded                             Description
2415.diff  benjamin.peterson, 2008-08-21 17:18
Messages (9)
msg64027 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-03-19 03:04
The bytes() builtin should respect an __bytes__() converter if it exists.
E.g. instead of

>>> class Foo:
...  def __bytes__(self): return b'foo'
... 
>>> bytes(Foo())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Foo' object is not iterable
>>> 

bytes(Foo()) should return b'foo'
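
For reference, a minimal sketch of the requested behaviour as it works on a Python 3 interpreter that honours __bytes__. The class Foo is the one from the report above; the variable name `result` is only illustrative.

```python
class Foo:
    def __bytes__(self):
        # bytes() calls this hook instead of trying to iterate the object.
        return b"foo"


result = bytes(Foo())
print(result)  # b'foo'
```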

Here's one use case.  email.header.Header instances represent email headers
(naturally) that conceptually are bytes, but also have a string
representation.  Say, for example, a Subject header comes across the wire in
RFC 2047 encoded utf-8.  The unicode representation would be the value of the
header decoded according to the RFC.  The bytes representation would be the
raw bytes seen on the wire.

The most natural way to retrieve each representation would be

>>> header = msg['subject']
>>> str(header)
'some string with non-ascii'
>>> bytes(header)
b'the rfc 2047 encoded raw header value'
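
The use case above can be sketched roughly as follows. This is an illustration only: RawHeader is a hypothetical wrapper, not the real email.header.Header API, and it assumes the raw value is a single ASCII encoded word of the form =?charset?q?...?=. The decoding itself uses the existing email.header.decode_header and make_header helpers.

```python
from email.header import decode_header, make_header


class RawHeader:
    """Hypothetical wrapper; not the real email.header.Header API."""

    def __init__(self, raw: bytes):
        self._raw = raw  # bytes exactly as received on the wire

    def __bytes__(self) -> bytes:
        # The bytes representation is the untouched wire value.
        return self._raw

    def __str__(self) -> str:
        # Encoded words are ASCII on the wire, so decode the container
        # first, then let the email package decode each encoded word.
        return str(make_header(decode_header(self._raw.decode("ascii"))))


header = RawHeader(b"=?utf-8?q?caf=C3=A9?=")
wire = bytes(header)   # raw encoded form
text = str(header)     # decoded unicode form
```

With both hooks defined, callers pick the representation they need through the builtin they call, without a separate accessor method for each form.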
msg64061 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-03-19 15:32
I took a quick glance at this. It hinges on how the C-API is going to
look. Currently, bytes is known in C as PyString and gets its
representation from __str__. Although we could just change it to
__bytes__, Christian has said that he is going to rename it to PyBytes
(and what is now PyBytes -> PyByteArray). [1] Further muddying the
waters is the fact that PyObject_Str generates the unicode
representation of an object and should really be called PyObject_Unicode.

[1] http://mail.python.org/pipermail/python-3000/2008-March/012477.html
msg71660 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-08-21 17:18
Here's a patch. It's only implemented for bytes. Doing this for
bytearray would require a bit of refactoring and can, I think, wait for
3.1. I added two new C functions: PyObject_Bytes and PyBytes_FromObject.
You can review it at http://codereview.appspot.com/3245.
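
Since the patch covers only bytes, a bytearray can still be obtained by converting through bytes() first. A small sketch of that workaround follows; the class Payload is hypothetical, and whether bytearray() itself consults __bytes__ is deliberately not assumed here.

```python
class Payload:
    """Hypothetical object that knows its own byte representation."""

    def __bytes__(self):
        return b"\x00\x01payload"


data = bytes(Payload())   # goes through __bytes__
buf = bytearray(data)     # mutable copy built from the bytes result
buf[0] = 0xFF             # the copy can be modified; data is unchanged
```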
msg71684 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-21 21:22
Isn't this a new feature that should therefore wait for 3.1?
msg71687 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-08-21 21:32
Well, yes I suppose. However, I think it's a serious enough deficiency
that it should block. I'll let Barry decide, though.
msg71972 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-08-26 09:07
Well, if I figured out how to use Rietveld correctly, I've left some
questions for you in the review.  It looks basically pretty good, so if
you could answer those questions, you can commit the change.

Should __bytes__ support be backported to 2.6?
msg71974 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-26 10:34
> Should __bytes__ support be backported to 2.6?

Isn't it already there in __str__?
Or do you mean just add support for the alternate method name?
msg71976 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2008-08-26 12:27
Yep, that's all I meant.  It might not be worth it, though.
msg71987 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2008-08-26 17:08
Thanks for the review, Barry! Committed in r66038. Sort of backported in
r66039 by aliasing PyObject_Bytes to PyObject_Str.
History
Date                 User               Action  Args
2008-08-26 17:08:56  benjamin.peterson  set     status: open -> closed; resolution: fixed; messages: + msg71987
2008-08-26 12:27:55  barry              set     messages: + msg71976
2008-08-26 10:34:54  pitrou             set     messages: + msg71974
2008-08-26 09:07:46  barry              set     messages: + msg71972
2008-08-21 21:32:39  benjamin.peterson  set     assignee: barry; messages: + msg71687
2008-08-21 21:22:23  pitrou             set     nosy: + pitrou; messages: + msg71684
2008-08-21 20:35:50  benjamin.peterson  set     keywords: + needs review
2008-08-21 17:18:53  benjamin.peterson  set     files: + 2415.diff; keywords: + patch; messages: + msg71660
2008-08-21 13:18:22  benjamin.peterson  set     priority: normal -> release blocker
2008-03-19 15:32:54  benjamin.peterson  set     nosy: + christian.heimes, benjamin.peterson; messages: + msg64061
2008-03-19 03:04:14  barry              create