Classification
Title: Strange behavior of bytearray slice assignment
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.4, Python 3.2, Python 3.3, Python 2.7
Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: ezio.melotti
Nosy List: abacabadabacaba, ezio.melotti, georg.brandl, loewis, mark.dickinson, pitrou, python-dev
Priority: normal
Keywords: needs review, patch

Created on 2010-04-14 15:41 by abacabadabacaba, last changed 2012-11-03 19:54 by ezio.melotti. This issue is now closed.

Files
issue8401.diff (uploaded by ezio.melotti, 2012-10-27 22:00)
issue8401-2.diff (uploaded by ezio.melotti, 2012-10-29 16:28): Patch with better tests against 3.2.
issue8401-3.diff (uploaded by ezio.melotti, 2012-10-29 17:01): Patch with better tests and error messages against 3.2.
Messages (18)
msg103133 - (view) Author: Evgeny Kapun (abacabadabacaba) Date: 2010-04-14 15:41
>>> a = bytearray()
>>> a[:] = 0 # Is it a feature?
>>> a
bytearray(b'')
>>> a[:] = 10 # If so, why not document it?
>>> a
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> a[:] = -1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: negative count
>>> a[:] = -1000000000000000000000 # This should raise ValueError, not TypeError.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> a[:] = 1000000000000000000
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
>>> a[:] = 1000000000000000000000 # This should raise OverflowError, not TypeError.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> a[:] = [] # Are some empty sequences better than others?
>>> a[:] = ()
>>> a[:] = list("")
>>> a[:] = ""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding
msg103135 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-14 15:47
It looks rather like a bug to me, and should be forbidden.
msg103138 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-14 18:39
pitrou: I agree, it should be a TypeError.
msg103292 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2010-04-16 05:56
This happens because bytearray_ass_subscript() (Objects/bytearrayobject.c:588) calls PyByteArray_FromObject() (:641), which in turn calls bytearray_init() (:746), so the results are similar to those returned by calling bytearray(...) directly.
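
For illustration, a minimal sketch (assuming a current CPython 3 interpreter) of how the bytearray() constructor treats the arguments from the original report. These are the outcomes the unpatched slice assignment inherited by going through the same initialisation path. The helper error_name() is a hypothetical name introduced only for this example.

```python
# Constructor behaviour that the unpatched slice assignment inherited.
zeros = bytearray(10)          # an int means "that many null bytes"
print(zeros)


def error_name(arg):
    """Return the name of the exception bytearray(arg) raises, or None."""
    try:
        bytearray(arg)
    except Exception as exc:
        return type(exc).__name__
    return None


print(error_name(-1))          # ValueError (negative count)
print(error_name(10**21))      # OverflowError (does not fit an index-sized int)
print(error_name(""))          # TypeError (str without an encoding)
```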
msg103296 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2010-04-16 06:45
Here is a proof of concept that fixes the problem.

The doc of bytearray() says about its first arg:
 * If it is a string, you must also give the encoding [...].
 * If it is an integer, the array will have that size and will be initialized with null bytes.
 * If it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.
 * If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.

All these things except the string[1] and the integer seem OK to me while assigning to a slice, so in the patch I've special-cased ints to raise a TypeError (it fails already for unicode strings).

[1]: note that string here means unicode string (the doc should probably be more specific about it). Byte strings work fine, but for unicode strings there's no way to specify the encoding while doing ba[:] = u'ustring'.
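
As a rough sketch of the distinction drawn above, assuming a Python 3 interpreter that includes the fix for this issue: bytes, buffer objects and iterables of ints can be assigned to a slice, while an int or an unencoded str is rejected with a TypeError. The variable names are made up for the example.

```python
b = bytearray(b"fooooooo")

b[3:4] = b"AB"                     # bytes
b[0:1] = memoryview(b"F")          # object conforming to the buffer interface
b[-1:] = [33, 33]                  # iterable of ints in range(0, 256)
b[1:1] = "xy".encode("ascii")      # a str has to be encoded explicitly first

# An int or a plain str on the right-hand side is refused.
rejected = []
for value in (5, "foo"):
    try:
        b[3:4] = value
    except TypeError:
        rejected.append(value)

print(b)
print(rejected)
```

Since a slice assignment has no place to pass an encoding, encoding the string before assigning it is the only way to state how the text should become bytes.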
msg103298 - (view) Author: Evgeny Kapun (abacabadabacaba) Date: 2010-04-16 07:27
An empty string is an iterable of integers in the range 0 <= x < 256, so it should be allowed.

>>> all(isinstance(x, int) and 0 <= x < 256 for x in "")
True
>>> bytearray()[:] = ""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding
msg103299 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2010-04-16 07:35
Not really: chars are not ints, and anyway the empty string falls under the first case.
msg103302 - (view) Author: Evgeny Kapun (abacabadabacaba) Date: 2010-04-16 08:13
> Not really, chars are not ints
Yes; however, an empty string contains exactly zero chars.
> and anyway the empty string falls under the first case.
Strings aren't mentioned in the documentation of bytearray slice assignment. However, I think that the bytearray constructor should accept an empty string too, without an encoding, for consistency.
msg103306 - (view) Author: Evgeny Kapun (abacabadabacaba) Date: 2010-04-16 08:51
__doc__ of bytearray says:
> bytearray(iterable_of_ints) -> bytearray
> bytearray(string, encoding[, errors]) -> bytearray
> bytearray(bytes_or_bytearray) -> mutable copy of bytes_or_bytearray
> bytearray(memory_view) -> bytearray
So, unless an encoding is given, empty string should be interpreted as an iterable of ints. BTW, documentation and docstring should be made consistent with each other.
msg103345 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-16 17:54
-1 on assigning strings to slices of bytearrays. As Ezio mentions, this operation conceptually requires an encoding, and no encoding is readily available in the slice assignment.

-1 on special-casing empty strings.
msg103402 - (view) Author: Evgeny Kapun (abacabadabacaba) Date: 2010-04-17 14:15
-1 on special-casing string without an encoding. Current code does (almost) this:
...
if argument_is_a_string:
	if not encoding_is_given: # Special case
		raise TypeError("string argument without an encoding")
	encode_argument()
	return
if encoding_is_given:
	raise TypeError("encoding or errors without a string argument")
...
IMO, it should do this instead:
...
if encoding_is_given:
	if not argument_is_a_string:
		raise TypeError("encoding or errors without a string argument")
	encode_argument()
	return
...
This way, bytearray("") would work without any special cases.
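
The two orderings above can be sketched as runnable Python. This is only an illustration of the argument, not CPython's actual implementation: the function names init_current() and init_proposed() are hypothetical, the "errors" parameter is left out, and everything except the string and iterable-of-ints paths is omitted.

```python
def init_current(arg, encoding=None):
    """Order described as current: the string check comes first."""
    if isinstance(arg, str):
        if encoding is None:  # the special case under discussion
            raise TypeError("string argument without an encoding")
        return bytearray(arg.encode(encoding))
    if encoding is not None:
        raise TypeError("encoding or errors without a string argument")
    return bytearray(iter(arg))  # simplified: iterable-of-ints path only


def init_proposed(arg, encoding=None):
    """Proposed order: the encoding check comes first."""
    if encoding is not None:
        if not isinstance(arg, str):
            raise TypeError("encoding or errors without a string argument")
        return bytearray(arg.encode(encoding))
    # A str without an encoding is treated like any other iterable here.
    return bytearray(iter(arg))


print(init_proposed(""))              # empty iterable, so an empty bytearray
print(init_proposed("abc", "ascii"))  # encoded as usual
```

Under the proposed order a non-empty string without an encoding still fails, because its items are characters rather than ints; only the empty string slips through.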
msg103474 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-04-18 09:27
Python is not (e.g.) Haskell; Python strings are not lists whose contents happen to be characters.  Allowing an empty string here is a step backwards in the direction of "why not allow any string whose contents have an unambiguous meaning as bytes", i.e. the default encoding ASCII in Python 2.x.  Passing a string where bytes are expected is a programming error, and it should be rewarded with an exception, no matter if the string happens to be empty or not.
msg171350 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-09-26 17:22
> >>> a[:] = -1000000000000000000000 # This should raise ValueError, not TypeError.
> >>> a[:] = 1000000000000000000000 # This should raise OverflowError, not TypeError.

FTR, these two now raise OverflowError.
msg173985 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-10-27 22:00
Updated patch against default.
msg174132 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-10-29 17:01
The new patch further improves the tests and the error message, checking for both numbers and strings:

>>> b = bytearray(b'fooooooo')
>>> b[3:4] = 'foo'
TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
>>> b[3:4] = 5
TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
>>> b[3:4] = 5.2
TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
>>> b[3:4] = None
TypeError: 'NoneType' object is not iterable


Before the patch these errors were reported instead:

>>> b = bytearray(b'fooooooo')
>>> b[3:4] = 'foo'  # can't provide encoding here
TypeError: string argument without an encoding
>>> b[3:4] = 5  # this "worked"
>>> b[3:4] = 5.2
TypeError: 'float' object is not iterable
>>> b[3:4] = None
TypeError: 'NoneType' object is not iterable
msg174670 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-11-03 18:51
The patch looks fine to me.
msg174676 - (view) Author: Roundup Robot (python-dev) Date: 2012-11-03 19:25
New changeset 1bd2b272c568 by Ezio Melotti in branch '2.7':
#8401: assigning an int to a bytearray slice (e.g. b[3:4] = 5) now raises an error.
http://hg.python.org/cpython/rev/1bd2b272c568

New changeset 8f00af8abaf9 by Ezio Melotti in branch '3.2':
#8401: assigning an int to a bytearray slice (e.g. b[3:4] = 5) now raises an error.
http://hg.python.org/cpython/rev/8f00af8abaf9

New changeset 06577f6b1c99 by Ezio Melotti in branch '3.3':
#8401: merge with 3.2.
http://hg.python.org/cpython/rev/06577f6b1c99

New changeset db40752c6cc7 by Ezio Melotti in branch 'default':
#8401: merge with 3.3.
http://hg.python.org/cpython/rev/db40752c6cc7
msg174681 - (view) Author: Roundup Robot (python-dev) Date: 2012-11-03 19:36
New changeset a7ebc0db5c18 by Ezio Melotti in branch 'default':
Merge typo fixes (and the fix for #8401 that I wrongly merged) with 3.3.
http://hg.python.org/cpython/rev/a7ebc0db5c18
History
Date  User  Action  Args
2012-11-03 19:54:44  ezio.melotti  set  status: open -> closed; resolution: fixed; stage: patch review -> resolved
2012-11-03 19:36:46  python-dev  set  messages: + msg174681
2012-11-03 19:25:03  python-dev  set  nosy: + python-dev; messages: + msg174676
2012-11-03 18:51:38  mark.dickinson  set  nosy: + mark.dickinson; messages: + msg174670
2012-10-29 17:01:25  ezio.melotti  set  files: + issue8401-3.diff; messages: + msg174132
2012-10-29 16:28:35  ezio.melotti  set  files: + issue8401-2.diff
2012-10-27 22:00:54  ezio.melotti  set  keywords: + needs review; files: + issue8401.diff; messages: + msg173985
2012-10-27 21:57:06  ezio.melotti  set  files: - issue8401.diff
2012-10-06 11:54:05  georg.brandl  set  assignee: georg.brandl -> ezio.melotti
2012-09-26 17:22:10  ezio.melotti  set  messages: + msg171350; versions: + Python 3.3, Python 3.4, - Python 3.1
2010-09-04 08:02:38  georg.brandl  set  assignee: georg.brandl
2010-09-04 00:13:27  pitrou  set  stage: test needed -> patch review; versions: - Python 2.6
2010-04-18 09:27:01  georg.brandl  set  nosy: + georg.brandl; messages: + msg103474
2010-04-17 14:15:42  abacabadabacaba  set  messages: + msg103402
2010-04-16 17:54:25  loewis  set  messages: + msg103345
2010-04-16 08:51:54  abacabadabacaba  set  messages: + msg103306
2010-04-16 08:13:45  abacabadabacaba  set  messages: + msg103302
2010-04-16 07:35:36  ezio.melotti  set  messages: + msg103299
2010-04-16 07:27:12  abacabadabacaba  set  messages: + msg103298
2010-04-16 06:45:05  ezio.melotti  set  files: + issue8401.diff; keywords: + patch; messages: + msg103296; stage: needs patch -> test needed
2010-04-16 05:56:55  ezio.melotti  set  nosy: + ezio.melotti; messages: + msg103292
2010-04-14 18:39:04  loewis  set  nosy: + loewis; messages: + msg103138
2010-04-14 15:47:04  pitrou  set  priority: normal; versions: + Python 2.6, Python 2.7, Python 3.2; nosy: + pitrou; messages: + msg103135; stage: needs patch
2010-04-14 15:41:48  abacabadabacaba  create