Issue 13298: Result type depends on order of operands for bytes and bytearray



This issue has been migrated to GitHub: https://github.com/python/cpython/issues/57507

classification

Title:	Result type depends on order of operands for bytes and bytearray
Type:	behavior	Stage:
Components:	Interpreter Core	Versions:	Python 3.11

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	eric.araujo, flox, iritkatriel, meador.inge, ncoghlan, petri.lehtinen, pitrou, serhiy.storchaka, terry.reedy
Priority:	normal	Keywords:

Created on 2011-10-31 00:14 by ncoghlan, last changed 2022-04-11 14:57 by admin.

Messages (6)
msg146669 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2011-10-31 00:14
In a recent python-ideas discussion of the differences between concatenation and augmented assignment on lists, I pointed out the general guiding principle behind Python's binary operation semantics was that the type of a binary operation should not depend on the order of the operands. That is "X op Y" and "Y op X" should either consistently create results of the same type ("1 + 1.1", "1.1 + 1") or else throw an exception ("[] + ()", "() + []").

This principle is why list concatenation normally only works with other lists, but will accept arbitrary iterables for augmented assignment. collections.deque exhibits similar behaviour (i.e. strict on the binary operation, permissive on augmented assignment).

However, bytes and bytearray don't follow this principle - they accept anything that implements the buffer interface even in the binary operation, leading to the following asymmetries:

>>> b'' + bytearray()
b''
>>> b'' + memoryview(b'')
b''
>>> bytearray() + b''
bytearray(b'')
>>> bytearray() + memoryview(b'')
bytearray(b'')
>>> memoryview(b'') + b''
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytes'
>>> memoryview(b'') + bytearray(b'')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytearray'

Now, the latter two cases are due to a known problem where returning NotImplemented from sq_concat or sq_repeat doesn't work properly (so none of the relevant method implementations in the stdlib even try), but the bytes and bytearray interaction is exactly the kind of type asymmetry the operand order independence guideline is intended to prevent.

My question is - do we care enough to try to change this?

If we do, then it's necessary to decide on more appropriate semantics:
1. The "list" solution, permitting only the same type in binary operations (high risk of breaking quite a lot of code)
2. Don't allow arbitrary buffers, but do allow bytes/bytearray interoperability
2a. always return bytes from mixed operations
2b. always return bytearray from mixed operations
2c. return the type of the first operand (ala set.__or__)

Or just accept that this really is more of a guideline than a rule and adjust the documentation accordingly.
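The contrast described in this message can be checked with a short script. This is only a sketch: `result_type` is a hypothetical helper introduced for illustration (it is not part of the issue or of CPython), and the comments reflect the behaviour reported in this thread.

```python
from collections import deque


def result_type(left, right):
    """Return the type name of `left + right`, or "TypeError" if it raises.

    Hypothetical helper, used only to make the comparison compact.
    """
    try:
        return type(left + right).__name__
    except TypeError:
        return "TypeError"


# Symmetric cases: same result type, or the same failure, in either order.
print(result_type(1, 1.1), result_type(1.1, 1))
print(result_type([], ()), result_type((), []))

# Strict binary operation, permissive augmented assignment.
items = []
items += (1, 2)      # list += accepts an arbitrary iterable
queue = deque()
queue += [1, 2]      # deque += does as well
print(items, list(queue), result_type(deque(), []))

# Asymmetric cases from the report: the left operand decides the outcome.
print(result_type(b"", bytearray()), result_type(bytearray(), b""))
print(result_type(b"", memoryview(b"")), result_type(memoryview(b""), b""))
```

The last two lines are the ones this issue is about: swapping the operands changes either the result type or whether the operation succeeds at all.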
msg146890 - (view)	Author: Antoine Pitrou (pitrou) *	Date: 2011-11-03 02:25
I think the current behaviour is fine, in that the alternatives are not better at all. In the absence of a type inherently "superior" to the others (as float can be to int, except for very large integers :-)), it makes sense to keep the type of the left-hand argument.

Note that .join() has a slightly different behaviour:

>>> b"".join([bytearray(), b""])
b''
>>> bytearray().join([bytearray(), b""])
bytearray(b'')
>>> b"".join([bytearray(), memoryview(b"")])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 1: expected bytes, memoryview found
msg146897 - (view)	Author: Petri Lehtinen (petri.lehtinen) *	Date: 2011-11-03 07:47
> Note that .join() has a slightly different behaviour:
>
> >>> b"".join([bytearray(), b""])
> b''
> >>> bytearray().join([bytearray(), b""])
> bytearray(b'')
> >>> b"".join([bytearray(), memoryview(b"")])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sequence item 1: expected bytes, memoryview found

I think this is worth fixing. Is there an issue already?
msg146901 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2011-11-03 08:10
We can just use this one - it was more in the nature of a question "is there anything we want to change about the status quo?" than a request for any specific change.

I'm actually OK with buffer API based interoperability, but if we're going to offer that, we should be consistent:

1. bytes and bytearray should interoperate with anything supporting the buffer interface (which they already mostly do)
2. When they encounter each other, LHS wins (as with set() and frozenset())
3. We should fix the operand coercion bug for C level concatenation slot implementations (already covered by issue #11477)
4. Update the documentation as needed

Since we're tinkering with builtin behaviour, 1 & 2 should probably be brought up on python-dev once someone checks if there is anything other than .join() that needs updating.
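For reference, the set()/frozenset() precedent mentioned in point 2 looks like this next to the existing bytes/bytearray behaviour. A minimal sketch; the variable names are illustrative only.

```python
mutable = {1, 2}
frozen = frozenset({2, 3})

# For set and frozenset, the left-hand operand determines the result type.
set_first = mutable | frozen
frozenset_first = frozen | mutable
print(type(set_first).__name__, type(frozenset_first).__name__)

# bytes and bytearray concatenation already follows the same "LHS wins" rule.
bytes_first = b"ab" + bytearray(b"cd")
bytearray_first = bytearray(b"ab") + b"cd"
print(type(bytes_first).__name__, type(bytearray_first).__name__)

# The contents are equal either way; only the container type differs.
print(set_first == frozenset_first, bytes_first == bytearray_first)
```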
msg261281 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2016-03-07 06:46
An issue with bytes.join() is already fixed (issue15958).
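A quick way to confirm the .join() fix on a given interpreter is sketched below. It assumes a Python version that includes the issue15958 change (3.4 or later, as far as I know); the sample data is made up for illustration.

```python
# Mixed buffer-supporting items, including a memoryview, which
# b"".join() rejected in the traceback quoted earlier in this thread.
parts = [bytearray(b"ab"), memoryview(b"cd"), b"ef"]

joined_bytes = b"".join(parts)
joined_bytearray = bytearray().join(parts)

# The separator object determines the result type.
print(type(joined_bytes).__name__, joined_bytes)
print(type(joined_bytearray).__name__, joined_bytearray)
```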
msg405001 - (view)	Author: Irit Katriel (iritkatriel) *	Date: 2021-10-25 21:23
Reproduced on 3.11.

History
Date	User	Action	Args
2022-04-11 14:57:23	admin	set	github: 57507
2021-10-25 21:23:53	iritkatriel	set	nosy: + iritkatriel messages: + msg405001 versions: + Python 3.11, - Python 3.3
2016-03-07 06:46:41	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg261281
2011-11-12 11:19:29	eric.araujo	set	nosy: + eric.araujo
2011-11-04 22:15:26	terry.reedy	set	nosy: + terry.reedy
2011-11-03 08:10:36	ncoghlan	set	messages: + msg146901
2011-11-03 07:47:36	petri.lehtinen	set	messages: + msg146897
2011-11-03 02:25:51	pitrou	set	nosy: + pitrou messages: + msg146890
2011-11-02 02:17:10	meador.inge	set	nosy: + meador.inge
2011-10-31 09:22:32	petri.lehtinen	set	nosy: + petri.lehtinen
2011-10-31 00:39:38	flox	set	nosy: + flox
2011-10-31 00:14:37	ncoghlan	create