classification
Title: memoryview + bytes fails
Type: behavior
Stage:
Components:
Versions: Python 3.3
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, exarkun, glyph, pitrou
Priority: normal
Keywords:

Created on 2012-09-15 10:27 by exarkun, last changed 2014-10-14 15:07 by skrah.

Messages (12)
msg170511 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2012-09-15 10:27
Python 3.3.0rc2+ (default:9def2209a839, Sep 10 2012, 08:44:51) 
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> memoryview(b'foo') + b'bar'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytes'
>>> b'bar' + memoryview(b'foo')
b'barfoo'
>>>
msg170512 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-09-15 12:24
What is the expected outcome? memoryviews can't be resized, so
this scenario isn't possible:

>>> bytearray([1,2,3]) + b'123'
bytearray(b'\x01\x02\x03123')
msg170513 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-09-15 12:37
Just prepend the empty bytestring if you want to make sure the result is a bytes object:

>>> b'' + memoryview(b'foo') + b'bar'
b'foobar'

I think the following limitation may be more annoying, though:

>>> b''.join([memoryview(b'foo'), b'bar'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected bytes, memoryview found
msg170579 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2012-09-16 23:23
> What is the expected outcome? memoryviews can't be resized, so
> this scenario isn't possible:

The same as `view.tobytes() + bytes`, but without the extra copy implied by `view.tobytes()`.

> Just prepend the empty bytestring if you want to make sure the result is a bytes object:

Or I could explicitly convert the memoryview to a bytes object.  That strikes me as rather preferable.  However, this defeats one use of memoryview, which is to avoid unnecessary copying.  So it might be a suitable workaround for some cases, but not all.
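
For illustration, a minimal sketch of the two workarounds mentioned so far. Both produce the result that `memoryview + bytes` is expected to give, and both go through an intermediate bytes object first. The names `view`, `tail`, `explicit` and `prefixed` are made up for this example.

```python
view = memoryview(b"foo")
tail = b"bar"

# Explicit conversion: tobytes() copies the viewed data into a temporary
# bytes object, then + copies both operands into the final result.
explicit = view.tobytes() + tail

# Prepending b"" yields the same value; b"" + view also builds an
# intermediate bytes object before tail is appended.
prefixed = b"" + view + tail

print(explicit, prefixed)
```

Neither form avoids the intermediate copy, which is the cost described above.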
msg170580 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-09-16 23:36
> Or I could explicitly convert the memoryview to a bytes object.  That
> strikes me as rather preferable.  However, this defeats one use of
> memoryview, which is to avoid unnecessary copying.  So it might be
> a suitable workaround for some cases, but not all.

Indeed, that's why I think it would be good to fix the bytes.join()
method (which is precisely meant to minimize copying and resizing).
msg170619 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-09-17 18:03
Opened issue15958 for the bytes.join enhancement.
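
A sketch of the join-based approach, assuming an interpreter in which bytes.join accepts arbitrary buffer objects (the enhancement requested in issue15958). On 3.3 the first call raises the TypeError shown earlier, so the example falls back to converting each chunk explicitly. The names `chunks` and `joined` are illustrative only.

```python
chunks = [memoryview(b"foo"), b"bar", bytearray(b"baz")]

try:
    # With the enhancement, join reads each chunk through the buffer
    # protocol and writes it once into a single preallocated result.
    joined = b"".join(chunks)
except TypeError:
    # Without it, every non-bytes chunk has to be copied to bytes first.
    joined = b"".join(bytes(chunk) for chunk in chunks)

print(joined)
```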
msg172676 - (view) Author: Glyph Lefkowitz (glyph) Date: 2012-10-11 19:05
It's worth noting that the "buffer()" built-in in Python 2 had this behavior, and it enabled a copy-reduction optimization within Twisted's outgoing transport buffer.

There are of course other ways to do this, but it seems like it would be nice to restore this handy optimization; it seems like a bug, or at least an oversight, that the convenience 'bytes+memoryview' (which cannot provide a useful optimization) works, but 'memoryview+bytes' (which would be equally helpful from a convenience perspective and _could_ provide a reduction in copying) doesn't.

Despite the bytes.join optimization (which, don't get me wrong, is also very helpful, almost necessary) this remains very useful.
msg172678 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-11 19:13
I'm not sure what you're talking about since:

>>> b = buffer("abc")
>>> b + "xyz"
'abcxyz'
>>> (b + "xyz") is b
False

... doesn't look like it avoids copies to me.
msg172687 - (view) Author: Glyph Lefkowitz (glyph) Date: 2012-10-11 20:10
On Oct 11, 2012, at 12:13 PM, Antoine Pitrou <report@bugs.python.org> wrote:

> Antoine Pitrou added the comment:
>
> I'm not sure what you're talking about since:
>
> >>> b = buffer("abc")
> >>> b + "xyz"
> 'abcxyz'
> >>> (b + "xyz") is b
> False
>
> ... doesn't look like it avoids copies to me.

The case where copies are avoided is documented here:

<http://twistedmatrix.com/trac/browser/trunk/twisted/internet/abstract.py?rev=35733#L20>
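
As a rough sketch of the kind of case meant here (a hypothetical helper, not Twisted's actual code): when part of a buffer has already been sent, the unsent remainder can be taken as a memoryview slice and concatenated with the queued chunks, so the remainder is not copied on its own first. The function name and parameters are invented for this example, and it assumes bytes.join accepts memoryview objects.

```python
def concatenate_remainder(data, offset, pending):
    """Return data[offset:] followed by every chunk in pending, as bytes."""
    # Slicing a memoryview does not copy the underlying bytes.
    remainder = memoryview(data)[offset:]
    # join copies the remainder and each pending chunk once, into the result.
    return b"".join([remainder, *pending])


print(concatenate_remainder(b"hello world", 6, [b"!", b"?"]))
```

Writing `data[offset:] + b"".join(pending)` instead would copy the remainder twice: once for the slice and once for the concatenation.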
msg172688 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-11 20:16
> The case where copies are avoided is documented here

... which would be handled nicely by issue15958.
msg172689 - (view) Author: Glyph Lefkowitz (glyph) Date: 2012-10-11 20:27
Yes, it would be *possible* to fix it with that alone, but that still makes it a pointless 'gotcha' in differing behavior between memoryview and buffer, especially given that bytes+memoryview does something semantically different than memoryview+bytes for no reason.
msg172690 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-10-11 20:32
Well, the fact that memoryview + bytes wouldn't return you a memoryview object might be a good reason to disallow it. Compare with:

>>> bytearray(b"x") + b"y"
bytearray(b'xy')
>>> b"x" + bytearray(b"y")
b'xy'
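
To make the comparison concrete, a small sketch of how the result type follows the left operand for the types that do support +, and of the failure with a memoryview on the left. Variable names are illustrative only.

```python
as_bytearray = bytearray(b"x") + b"y"   # left operand is a bytearray
as_bytes = b"x" + bytearray(b"y")       # left operand is bytes
from_view = b"x" + memoryview(b"y")     # left operand is bytes

print(type(as_bytearray).__name__, type(as_bytes).__name__, type(from_view).__name__)

# With the memoryview on the left, no operand implements the addition.
try:
    memoryview(b"x") + b"y"
except TypeError as exc:
    view_error = exc
else:
    view_error = None

print(view_error)
```

By that pattern, memoryview + bytes would be expected to return a memoryview, which a fixed-size view cannot provide.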
History
Date                 User      Action  Args
2014-10-14 15:07:39  skrah     set     nosy: - skrah
2012-10-11 20:32:23  pitrou    set     messages: + msg172690
2012-10-11 20:27:14  glyph     set     messages: + msg172689
2012-10-11 20:16:16  pitrou    set     messages: + msg172688
2012-10-11 20:10:53  glyph     set     messages: + msg172687
2012-10-11 19:13:52  pitrou    set     messages: + msg172678
2012-10-11 19:05:13  glyph     set     nosy: + glyph; messages: + msg172676
2012-09-18 03:22:52  Arfrever  set     nosy: + Arfrever
2012-09-17 18:03:51  pitrou    set     messages: + msg170619
2012-09-16 23:36:39  pitrou    set     messages: + msg170580
2012-09-16 23:23:01  exarkun   set     messages: + msg170579
2012-09-15 12:37:25  pitrou    set     nosy: + pitrou; messages: + msg170513
2012-09-15 12:24:15  skrah     set     nosy: + skrah; messages: + msg170512
2012-09-15 10:27:58  exarkun   create