Strange behavior of bytearray slice assignment #52648
>>> a = bytearray()
>>> a[:] = 0 # Is it a feature?
>>> a
bytearray(b'')
>>> a[:] = 10 # If so, why not document it?
>>> a
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
>>> a[:] = -1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: negative count
>>> a[:] = -1000000000000000000000 # This should raise ValueError, not TypeError.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> a[:] = 1000000000000000000
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
>>> a[:] = 1000000000000000000000 # This should raise OverflowError, not TypeError.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> a[:] = [] # Are some empty sequences better than others?
>>> a[:] = ()
>>> a[:] = list("")
>>> a[:] = ""
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding |
It looks rather like a bug to me, and should be forbidden. |
pitrou: I agree, it should be a TypeError. |
This happens because bytearray_ass_subscript() (Objects/bytearrayobject.c:588) calls PyByteArray_FromObject() (:641) that in turn calls bytearray_init() (:746), so the results are similar to the ones returned by calling bytearray(...) directly. |
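Since the slice assignment was delegating to the constructor, the odd results in the original report can be reproduced by calling bytearray(...) directly. The following is a minimal sketch of that equivalence, not the C code path itself; the helper name constructor_error is made up for illustration.

```python
# An int argument to the constructor means "this many zero bytes",
# which is what `a[:] = 10` ended up doing before the fix.
zeros = bytearray(10)


def constructor_error(arg):
    """Return the name of the exception bytearray(arg) raises, or None."""
    try:
        bytearray(arg)
    except Exception as exc:
        return type(exc).__name__
    return None


print(zeros)                  # ten zero bytes
print(constructor_error(-1))  # ValueError (negative count)
print(constructor_error(""))  # TypeError (string argument without an encoding)
print(constructor_error([]))  # None: an empty list is accepted
```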
Here is a proof of concept that fixes the problem. The doc of bytearray() says about its first arg:
All these things except the string[1] and the integer seem OK to me while assigning to a slice, so in the patch I've special-cased ints to raise a TypeError (it fails already for unicode strings).
[1]: note that string here means unicode string (the doc should probably be more specific about it). Byte strings work fine, but for unicode strings there's no way to specify the encoding while doing ba[:] = u'ustring'. |
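For comparison, here is a short sketch of right-hand sides that remain valid for slice assignment: bytes, buffer objects, and iterables of ints in range(0, 256). The variable names are illustrative only.

```python
ba = bytearray(b"fooooooo")

ba[3:4] = b"XY"               # bytes: replaces one byte with two
after_bytes = bytes(ba)       # b"fooXYoooo"

ba[0:3] = memoryview(b"bar")  # any object exposing the buffer protocol
ba[-1:] = [33, 33]            # a list of ints in range(0, 256)
ba[:0] = range(48, 51)        # any iterable of ints, here the digits "012"

print(ba)  # bytearray(b'012barXYooo!!')
```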
An empty string is an iterable of integers in the range 0 <= x < 256, so it should be allowed.
>>> all(isinstance(x, int) and 0 <= x < 256 for x in "")
True
>>> bytearray()[:] = ""
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding |
Not really: chars are not ints, and anyway the empty string falls under the first case. |
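To illustrate the objection: the all(...) check above is only vacuously true, because all() over an empty iterable returns True regardless of the predicate. A small sketch on Python 3, where the difference between text and bytes is visible when iterating:

```python
# Iterating over a str yields one-character strings, not ints.
text_items = list("ab")

# Iterating over bytes yields ints in range(0, 256).
byte_items = list(b"ab")

# Vacuous truth: the predicate is never evaluated for an empty iterable,
# so the same check "passes" even with a condition that can never hold.
vacuous = all(isinstance(x, int) and 0 <= x < 256 for x in "")
impossible = all(False for x in "")

print(text_items, byte_items, vacuous, impossible)
```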
|
__doc__ of bytearray says:
|
-1 on assigning strings to slices of bytearrays. As Ezio mentions, this operation conceptually requires an encoding, and no encoding is readily available in the slice assignment. -1 on special-casing empty strings. |
-1 on special-casing string without an encoding. Current code does (almost) this: |
Python is not (e.g.) Haskell; Python strings are not lists whose contents happen to be characters. Allowing an empty string here is a step backwards in the direction of "why not allow any string whose contents have an unambiguous meaning as bytes", i.e. the default encoding ASCII in Python 2.x. Passing a string where bytes are expected is a programming error, and it should be rewarded with an exception, no matter if the string happens to be empty or not. |
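The usual way out, sketched below on Python 3, is to encode the string explicitly before assigning, so the encoding is chosen by the programmer and not guessed. The sample text and variable names are assumptions for illustration.

```python
utf8 = bytearray(b"name: ")
utf8[len(utf8):] = "Zoë".encode("utf-8")      # ë becomes two bytes

latin1 = bytearray(b"name: ")
latin1[len(latin1):] = "Zoë".encode("latin-1")  # ë becomes one byte

# The same text produces different bytes depending on the encoding,
# which is why a bare string cannot be assigned unambiguously.
print(utf8)
print(latin1)
```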
FTR, these two now raise OverflowError. |
Updated patch against default. |
The new patch further improves the tests and the error message, checking for both numbers and strings:
>>> b = bytearray(b'fooooooo')
>>> b[3:4] = 'foo'
TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
>>> b[3:4] = 5
TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
>>> b[3:4] = 5.2
TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
>>> b[3:4] = None
TypeError: 'NoneType' object is not iterable
Before the patch these errors were reported instead:
>>> b = bytearray(b'fooooooo')
>>> b[3:4] = 'foo' # can't provide encoding here
TypeError: string argument without an encoding
>>> b[3:4] = 5 # this "worked"
>>> b[3:4] = 5.2
TypeError: 'float' object is not iterable
>>> b[3:4] = None
TypeError: 'NoneType' object is not iterable |
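The post-patch behavior can be checked with a small regression-style sketch. It only assumes that a TypeError is raised; the exact message text may differ between Python versions, and the helper name slice_assign_error is hypothetical.

```python
def slice_assign_error(value):
    """Try b[3:4] = value; return the TypeError message, or None on success."""
    b = bytearray(b"fooooooo")
    try:
        b[3:4] = value
    except TypeError as exc:
        return str(exc)
    return None


# Strings, numbers, and None are rejected; bytes and int iterables are not.
for candidate in ("foo", 5, 5.2, None, b"x", [120]):
    print(repr(candidate), "->", slice_assign_error(candidate))
```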
The patch looks fine to me. |
New changeset 1bd2b272c568 by Ezio Melotti in branch '2.7':
New changeset 8f00af8abaf9 by Ezio Melotti in branch '3.2':
New changeset 06577f6b1c99 by Ezio Melotti in branch '3.3':
New changeset db40752c6cc7 by Ezio Melotti in branch 'default': |
New changeset a7ebc0db5c18 by Ezio Melotti in branch 'default': |