classification
Title: Make str and bytes error messages on concatenation conform with other sequences
Type: enhancement Stage:
Components: Interpreter Core Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Jim Fasarakis-Hilliard, levkivskyi, ned.deily, rhettinger, serhiy.storchaka, terry.reedy, xiang.zhang
Priority: normal Keywords:

Created on 2016-12-30 18:56 by Jim Fasarakis-Hilliard, last changed 2017-03-24 22:07 by serhiy.storchaka.

Pull Requests
URL Status Linked
PR 709 closed serhiy.storchaka, 2017-03-18 10:18
PR 710 merged serhiy.storchaka, 2017-03-18 10:19
PR 723 merged serhiy.storchaka, 2017-03-19 18:18
PR 724 merged serhiy.storchaka, 2017-03-19 18:18
Messages (17)
msg284340 - (view) Author: Jim Fasarakis-Hilliard (Jim Fasarakis-Hilliard) * Date: 2016-12-30 18:56
Specifically, bytes (always, from what I could find) had this error message:

    >>> b'' + ''
    TypeError: can't concat bytes to str

while str, after a change in issue26057, was changed to produce:

    >>> '' + b''
    TypeError: must be str, not bytes

from the previous form of "TypeError: Can't convert 'bytes' object to str implicitly".

I think these could be changed to conform with what the other sequences generate and fall in line with the error messages produced for other operations.

Specifically changing them to:

    >>> b'' + ''
    TypeError: can only concatenate bytes (not 'str') to bytes

and similarly for str.

If this idea makes sense, I can attach a patch that addresses it.
msg284341 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-30 20:12
The old error message was misleading. It implied that some objects can be converted to str implicitly. This was true in Python 2, but is false in Python 3.

The new error message conforms with the TypeError messages produced by PyArg_Parse*(). It is the same in a number of str methods. It doesn't conform with the error messages produced by concatenating other types. But different messages are generated for different types:

>>> b'' + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat bytes to int
>>> a = bytearray(); a += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat int to bytearray
>>> [] + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "int") to list
>>> a = []; a += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> import array; array.array('b') + []
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only append array (not "list") to array
>>> import array; a = array.array('b'); a += []
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only extend array with array (not "list")
>>> import operator; operator.concat(1, '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object can't be concatenated

I think it would be better to unify them all.

I'm not sure that this change can be considered a bug fix rather than an enhancement. I leave this decision to the 3.6 release manager.
msg284348 - (view) Author: Jim Fasarakis-Hilliard (Jim Fasarakis-Hilliard) * Date: 2016-12-30 21:53
Should that message be the one predominantly used for sequences, i.e.:

   TypeError: can only concatenate class1 (not "class2") to class1

or should another one be used, like "Unsupported operand type(s) for op: 'class1' and 'class2'"?

The first is problematic in cases like `+=` where any iterable is accepted; the second seems better, to me at least.

As for `operator.concat`, any reason why the check is made beforehand to see if the first argument has a `__getitem__` method? Couldn't that just be removed allowing the exception from `concat(1, '')` to just propagate to the caller?
msg284370 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-31 06:54
Different cases accept different types. We cannot always say that only one specific type is accepted.

"list +" accepts only lists; "list +=" is syntactic sugar for list.extend and accepts any iterable. "bytes +" and "bytearray +=" accept any object that supports the buffer protocol. "array +" and "array +=" accept only arrays.

Changing the semantics is out of the scope of this issue. I think there are reasons for the current behavior.
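
For reference, the accepted operand types listed above can be checked with a short script. This is only a minimal sketch and not part of any patch; the helper name `raises_type_error` is made up for illustration.

```python
import array


def raises_type_error(func):
    """Return True if calling func() raises TypeError (hypothetical helper)."""
    try:
        func()
    except TypeError:
        return True
    return False


# "list +" accepts only lists.
list_plus_tuple_fails = raises_type_error(lambda: [1] + (2,))

# "list +=" behaves like list.extend and accepts any iterable.
items = [1]
items += (2,)
items += "ab"

# "bytes +" and "bytearray +=" accept objects supporting the buffer protocol.
joined = b"a" + memoryview(b"b")
buf = bytearray(b"a")
buf += memoryview(b"b")

# "array +" and "array +=" accept only arrays.
arr = array.array("b", [1])
arr += array.array("b", [2])
array_plus_list_fails = raises_type_error(lambda: arr + [3])
```

Only the exception type is checked here, not the wording, since the wording is exactly what this issue proposes to change.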
msg284372 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-31 07:15
Originally this issue was raised on StackOverflow:

http://stackoverflow.com/questions/41388606/python-3-6-vs-3-5-typeerror-message-on-string-concatenation
msg284545 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2017-01-03 09:15
> As for `operator.concat`, any reason why the check is made beforehand to see if the first argument has a `__getitem__` method?

Concatenation is a sequence operation (+ can be used for both concatenation and addition), so you have to check __getitem__ to make sure the first argument is a sequence.

> Couldn't that just be removed allowing the exception from `concat(1, '')` to just propagate to the caller?

If the check were removed, `concat(1, '')` would still raise an exception, but `concat(1, 1)` would not.
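
The difference between concatenation and addition can be seen with a small script. This is a sketch for illustration only; `try_call` is a hypothetical helper, not something in the `operator` module.

```python
import operator


def try_call(func, *args):
    """Return ('ok', result) or ('error', message); helper for illustration only."""
    try:
        return ("ok", func(*args))
    except TypeError as exc:
        return ("error", str(exc))


# Sequences concatenate as expected.
seq_result = try_call(operator.concat, "a", "b")

# Integers support "+" (addition) but are not sequences,
# so operator.concat() rejects them while operator.add() does not.
add_result = try_call(operator.add, 1, 1)
concat_int_result = try_call(operator.concat, 1, 1)
concat_mixed_result = try_call(operator.concat, 1, "")

for name, result in [
    ("concat('a', 'b')", seq_result),
    ("add(1, 1)", add_result),
    ("concat(1, 1)", concat_int_result),
    ("concat(1, '')", concat_mixed_result),
]:
    print(name, "->", result)
```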
msg284855 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-01-06 21:32
By default, error message wording changes are 'enhancements' that wait for the next x.y.0 release unless the current wording is 'positively wrong'.  This is different from doc changes because there are tests depending on error messages. 'Inconsistent' or 'awkward' is not the same as 'wrong'.

I do agree that the clipped "must be str, not bytes" is awkward.  What is it that must be str?  A fleshed-out and positive "can only concatenate str (not bytes) to str." would be clearer and educate users better.

As Serhiy said, the exact equivalent cannot be said for bytes.

>>> b'a' + bytearray(b'b')
b'ab'
>>> b'a' + memoryview(b'b')
b'ab'

However "Can only concatenate bytes, bytearray, or memoryview (not x) to bytes." would be good if this is in fact complete. I need to be educated on this ;-)  I was not sure about memoryview until I tried it.
msg284861 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-01-06 22:08
You can concatenate any object supporting the buffer protocol to bytes and bytearray.
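
One way to see which objects qualify is to ask `memoryview()` whether it can wrap them. The following is a rough sketch under that assumption; `supports_buffer` and `can_concat_to_bytes` are hypothetical helper names, and the candidate list is just a sample, not a complete enumeration.

```python
def supports_buffer(obj):
    """Return True if memoryview() can wrap obj, i.e. obj exports a buffer."""
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


def can_concat_to_bytes(obj):
    """Return True if b'a' + obj succeeds."""
    try:
        b"a" + obj
    except TypeError:
        return False
    return True


# A sample of candidate right-hand operands (not exhaustive).
candidates = [b"b", bytearray(b"b"), memoryview(b"b"), "b", 1, [98]]

# Each row: (type name, exports a buffer?, can be concatenated to bytes?)
report = [
    (type(obj).__name__, supports_buffer(obj), can_concat_to_bytes(obj))
    for obj in candidates
]

for row in report:
    print(row)
```

For this sample the two columns agree: the objects that export a buffer are the ones that can be concatenated to bytes.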

The current error message for bytes looks awkward too, because it says "bytes to othertype" instead of "othertype to bytes".
msg285213 - (view) Author: Jim Fasarakis-Hilliard (Jim Fasarakis-Hilliard) * Date: 2017-01-11 11:02
As I currently see this: 

 - The error message for str can be changed to the one used for other sequences 'can only concatenate str (not "type") to str'

 - The error message for arrays can be changed to use concatenate instead of append, too.

For bytes I see a conundrum. On one hand, the message could mention the buffer protocol, which might confuse new users more than it helps them. On the other hand, it could list the objects that currently support the protocol, but such a list might be exhaustive and long.
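
To make the proposed wording concrete, here is a sketch of how such a message could be built from the operand types. The function `concat_error_message` is purely hypothetical and written in Python for illustration; it is not CPython code, where these messages are produced in C.

```python
def concat_error_message(left, right):
    """Build a message in the list/tuple style (hypothetical helper)."""
    left_name = type(left).__name__
    right_name = type(right).__name__
    return (
        f'can only concatenate {left_name} '
        f'(not "{right_name}") to {left_name}'
    )


str_message = concat_error_message("", b"")
list_message = concat_error_message([], 1)

print(str_message)
print(list_message)
```

The same template would not be accurate for bytes, since more than one type is accepted on the right-hand side.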
msg288922 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2017-03-03 20:24
I'm removing myself as assignee as this doesn't seem to need a 3.6 RM decision at this point.  If necessary, we can discuss a 3.6 backport after the issue has been resolved for 3.7.
msg289803 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-18 10:25
PR 709 fixes the order of types in the error message for concatenating bytes and bytearray. b'' + '' will raise "can't concat str to bytes" rather than "can't concat bytes to str".

PR 710 makes the error message for str concatenation more informative and similar to the error messages for list, tuple, deque, and array. '' + b'' will raise "can only concatenate str (not "bytes") to str" rather than "must be str, not bytes".
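
To see which wording a given interpreter produces, the messages can be captured directly. This is a small sketch; `concat_message` is a hypothetical helper, and the exact text printed depends on whether the interpreter includes the changes described above.

```python
def concat_message(left, right):
    """Return the TypeError message raised by left + right, or None on success."""
    try:
        left + right
    except TypeError as exc:
        return str(exc)
    return None


bytes_message = concat_message(b"", "")
str_message = concat_message("", b"")

print("b'' + '' ->", bytes_message)
print("'' + b'' ->", str_message)
```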
msg289866 - (view) Author: Ivan Levkivskyi (levkivskyi) * Date: 2017-03-19 23:10
Something is strange: PRs 709, 723, 724 are shown as open in the "Pull Requests" section on this page. However, all four PRs are already merged.

Do others see the same? Shouldn't the status be updated automatically?
msg289877 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-20 07:52
I don't know why the statuses were not updated automatically. I updated them manually.

In a comment on GitHub, Ivan suggested using "concatenate" instead of "concat" in "can't concat <type> to bytes". Maybe we should make it more similar to the messages for list, tuple, deque, and array, e.g. "can only concatenate objects supporting the buffer protocol (not "<type>") to bytes"?
msg290151 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:05
New changeset 2c5b2c3832d4d2af7b60333a5a8f73dd51ef6245 by Serhiy Storchaka in branch '3.5':
bpo-29116: Fix error messages for concatenating bytes and bytearray with unsupported type. (#709) (#724)
https://github.com/python/cpython/commit/2c5b2c3832d4d2af7b60333a5a8f73dd51ef6245
msg290152 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:05
New changeset 3d258b1eb453bcbc412d6b252f5bdceae0303f07 by Serhiy Storchaka in branch '3.6':
bpo-29116: Fix error messages for concatenating bytes and bytearray with unsupported type. (#709) (#723)
https://github.com/python/cpython/commit/3d258b1eb453bcbc412d6b252f5bdceae0303f07
msg290158 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:06
New changeset 6b5a9ec4788770c652bac3bf5d5a0a3b710b82ae by Serhiy Storchaka in branch 'master':
bpo-29116: Fix error messages for concatenating bytes and bytearray with unsupported type. (#709)
https://github.com/python/cpython/commit/6b5a9ec4788770c652bac3bf5d5a0a3b710b82ae
msg290159 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:07
New changeset 004e03fb0c2febe2ec8afbd28ffcb3e980c63228 by Serhiy Storchaka in branch 'master':
bpo-29116: Improve error message for concatenating str with non-str. (#710)
https://github.com/python/cpython/commit/004e03fb0c2febe2ec8afbd28ffcb3e980c63228
History
Date User Action Args
2017-03-24 22:07:04 serhiy.storchaka set messages: + msg290159
2017-03-24 22:06:51 serhiy.storchaka set messages: + msg290158
2017-03-24 22:05:31 serhiy.storchaka set messages: + msg290152
2017-03-24 22:05:24 serhiy.storchaka set messages: + msg290151
2017-03-20 07:52:23 serhiy.storchaka set messages: + msg289877
2017-03-19 23:10:25 levkivskyi set messages: + msg289866
2017-03-19 18:18:51 serhiy.storchaka set pull_requests: + pull_request639
2017-03-19 18:18:00 serhiy.storchaka set pull_requests: + pull_request638
2017-03-18 10:25:36 serhiy.storchaka set messages: + msg289803
2017-03-18 10:19:45 serhiy.storchaka set pull_requests: + pull_request630
2017-03-18 10:18:32 serhiy.storchaka set pull_requests: + pull_request629
2017-03-03 20:24:29 ned.deily set assignee: ned.deily ->
    messages: + msg288922
    nosy: rhettinger, terry.reedy, ned.deily, serhiy.storchaka, levkivskyi, xiang.zhang, Jim Fasarakis-Hilliard
2017-01-11 11:02:14 Jim Fasarakis-Hilliard set messages: + msg285213
2017-01-06 22:08:35 serhiy.storchaka set messages: + msg284861
2017-01-06 21:32:18 terry.reedy set versions: - Python 3.6
    nosy: + terry.reedy
    messages: + msg284855
    type: behavior -> enhancement
2017-01-06 19:40:14 rhettinger set nosy: + rhettinger
2017-01-06 17:19:15 levkivskyi set nosy: + levkivskyi
2017-01-03 09:15:09 xiang.zhang set nosy: + xiang.zhang
    messages: + msg284545
2016-12-31 07:15:34 serhiy.storchaka set messages: + msg284372
2016-12-31 06:54:11 serhiy.storchaka set messages: + msg284370
2016-12-30 21:53:37 Jim Fasarakis-Hilliard set messages: + msg284348
2016-12-30 20:12:01 serhiy.storchaka set assignee: ned.deily
    messages: + msg284341
    nosy: + serhiy.storchaka, ned.deily
2016-12-30 18:56:51 Jim Fasarakis-Hilliard create