Strange behavior of bytearray slice assignment #52648

Closed
abacabadabacaba mannequin opened this issue Apr 14, 2010 · 18 comments

@abacabadabacaba
Mannequin

abacabadabacaba mannequin commented Apr 14, 2010

BPO 8401
Nosy @loewis, @birkenfeld, @mdickinson, @pitrou, @ezio-melotti
Files
  • issue8401.diff
  • issue8401-2.diff: Patch with better tests against 3.2.
  • issue8401-3.diff: Patch with better tests and error messages against 3.2.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/ezio-melotti'
    closed_at = <Date 2012-11-03.19:54:44.091>
    created_at = <Date 2010-04-14.15:41:48.878>
    labels = ['interpreter-core', 'type-bug']
    title = 'Strange behavior of bytearray slice assignment'
    updated_at = <Date 2012-11-03.19:54:44.090>
    user = 'https://bugs.python.org/abacabadabacaba'

    bugs.python.org fields:

    activity = <Date 2012-11-03.19:54:44.090>
    actor = 'ezio.melotti'
    assignee = 'ezio.melotti'
    closed = True
    closed_date = <Date 2012-11-03.19:54:44.091>
    closer = 'ezio.melotti'
    components = ['Interpreter Core']
    creation = <Date 2010-04-14.15:41:48.878>
    creator = 'abacabadabacaba'
    dependencies = []
    files = ['27753', '27777', '27779']
    hgrepos = []
    issue_num = 8401
    keywords = ['patch', 'needs review']
    message_count = 18.0
    messages = ['103133', '103135', '103138', '103292', '103296', '103298', '103299', '103302', '103306', '103345', '103402', '103474', '171350', '173985', '174132', '174670', '174676', '174681']
    nosy_count = 7.0
    nosy_names = ['loewis', 'georg.brandl', 'mark.dickinson', 'pitrou', 'ezio.melotti', 'abacabadabacaba', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue8401'
    versions = ['Python 2.7', 'Python 3.2', 'Python 3.3', 'Python 3.4']

    @abacabadabacaba
    Mannequin Author

    abacabadabacaba mannequin commented Apr 14, 2010

    >>> a = bytearray()
    >>> a[:] = 0 # Is it a feature?
    >>> a
    bytearray(b'')
    >>> a[:] = 10 # If so, why not document it?
    >>> a
    bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
    >>> a[:] = -1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: negative count
    >>> a[:] = -1000000000000000000000 # This should raise ValueError, not TypeError.
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> a[:] = 1000000000000000000
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError
    >>> a[:] = 1000000000000000000000 # This should raise OverflowError, not TypeError.
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> a[:] = [] # Are some empty sequences better than others?
    >>> a[:] = ()
    >>> a[:] = list("")
    >>> a[:] = ""
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: string argument without an encoding

    @abacabadabacaba abacabadabacaba mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error labels Apr 14, 2010
    @pitrou
    Member

    pitrou commented Apr 14, 2010

    It looks rather like a bug to me, and should be forbidden.

    @loewis
    Mannequin

    loewis mannequin commented Apr 14, 2010

    pitrou: I agree, it should be a TypeError.

    @ezio-melotti
    Member

    This happens because bytearray_ass_subscript() (Objects/bytearrayobject.c:588) calls PyByteArray_FromObject() (:641), which in turn calls bytearray_init() (:746), so the results are similar to those of calling bytearray(...) directly.
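
A minimal sketch of the constructor behaviour that the slice assignment inherited, assuming Python 3. The helper name `constructor_result` is invented for this illustration and is not part of CPython.

```python
def constructor_result(value):
    """Return bytearray(value), or the exception class name if it fails."""
    try:
        return bytearray(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


# An int argument is treated as a size: ten null bytes.
zero_filled = constructor_result(10)

# A negative size is rejected with ValueError ("negative count").
negative = constructor_result(-1)

# A str without an encoding is rejected with TypeError.
no_encoding = constructor_result("")
```

These are the same outcomes the original report shows for `a[:] = 10`, `a[:] = -1` and `a[:] = ""`.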

    @ezio-melotti
    Member

    Here is a proof of concept that fixes the problem.

    The doc of bytearray() says about its first arg:

    • If it is a string, you must also give the encoding [...].
    • If it is an integer, the array will have that size and will be initialized with null bytes.
    • If it is an object conforming to the buffer interface, a read-only buffer of the object will be used to initialize the bytes array.
    • If it is an iterable, it must be an iterable of integers in the range 0 <= x < 256, which are used as the initial contents of the array.

    All these things except the string[1] and the integer seem OK to me while assigning to a slice, so in the patch I've special-cased ints to raise a TypeError (it fails already for unicode strings).

    [1]: note that string here means unicode string (the doc should probably be more specific about it). Byte strings work fine, but for unicode strings there's no way to specify the encoding while doing ba[:] = u'ustring'.
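
For reference, the four documented forms of the first argument can be sketched as follows, assuming Python 3 (the variable names are only illustrative):

```python
# String: an encoding must be given.
from_text = bytearray("abc", "ascii")

# Integer: an array of that size, initialized with null bytes.
from_size = bytearray(3)

# Object conforming to the buffer interface.
from_buffer = bytearray(memoryview(b"abc"))

# Iterable of integers in the range 0 <= x < 256.
from_ints = bytearray([97, 98, 99])
```

The string, buffer and iterable forms above all produce `bytearray(b'abc')`; only the integer form produces null bytes.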

    @abacabadabacaba
    Mannequin Author

    abacabadabacaba mannequin commented Apr 16, 2010

    Empty string is an iterable of integers in the range 0 <= x < 256, so it should be allowed.

    >>> all(isinstance(x, int) and 0 <= x < 256 for x in "")
    True
    >>> bytearray()[:] = ""
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: string argument without an encoding

    @ezio-melotti
    Member

    Not really, chars are not ints and anyway the empty string falls in the first case.
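
A short illustration of the "chars are not ints" point, assuming Python 3, where iterating a str yields one-character strings and iterating bytes yields ints:

```python
# Items of a non-empty str are themselves str objects, not ints.
text_items_are_ints = all(isinstance(x, int) for x in "abc")

# Items of a bytes object are ints in range(0, 256).
bytes_items_are_ints = all(
    isinstance(x, int) and 0 <= x < 256 for x in b"abc"
)

# For the empty string, all() is vacuously true because there are no items.
empty_is_vacuous = all(isinstance(x, int) for x in "")
```

So the check in the previous comment only passes for `""` because there is nothing to check.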

    @abacabadabacaba
    Copy link
    Mannequin Author

    abacabadabacaba mannequin commented Apr 16, 2010

    > Not really, chars are not ints
    Yes, however, an empty string contains exactly zero chars.
    > and anyway the empty string falls in the first case.
    Strings aren't mentioned in the documentation of bytearray slice assignment. However, I think that the bytearray constructor should accept an empty string too, without an encoding, for consistency.

    @abacabadabacaba
    Mannequin Author

    abacabadabacaba mannequin commented Apr 16, 2010

    __doc__ of bytearray says:

    bytearray(iterable_of_ints) -> bytearray
    bytearray(string, encoding[, errors]) -> bytearray
    bytearray(bytes_or_bytearray) -> mutable copy of bytes_or_bytearray
    bytearray(memory_view) -> bytearray
    So, unless an encoding is given, an empty string should be interpreted as an iterable of ints. BTW, the documentation and the docstring should be made consistent with each other.

    @loewis
    Mannequin

    loewis mannequin commented Apr 16, 2010

    -1 on assigning strings to slices of bytearrays. As Ezio mentions, this operation conceptually requires an encoding, and no encoding is readily available in the slice assignment.

    -1 on special-casing empty strings.

    @abacabadabacaba
    Mannequin Author

    abacabadabacaba mannequin commented Apr 17, 2010

    -1 on special-casing string without an encoding. Current code does (almost) this:

        ...
        if argument_is_a_string:
            if not encoding_is_given:  # Special case
                raise TypeError("string argument without an encoding")
            encode_argument()
            return
        if encoding_is_given:
            raise TypeError("encoding or errors without a string argument")
        ...

    IMO, it should do this instead:

        ...
        if encoding_is_given:
            if not argument_is_a_string:
                raise TypeError("encoding or errors without a string argument")
            encode_argument()
            return
        ...

    This way, bytearray("") would work without any special cases.
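
The two orderings can be modelled in pure Python to see where they differ. This is only a sketch: `current_dispatch`, `proposed_dispatch` and `_remaining_cases` are hypothetical names, the real logic lives in C, and the `errors` argument is left out for brevity.

```python
def _remaining_cases(arg):
    """Stand-in for the rest of the constructor (int, buffer, iterable)."""
    if isinstance(arg, str):
        # Assumption: a str reaching this point is iterated like any iterable.
        return bytearray(list(arg))
    return bytearray(arg)


def current_dispatch(arg, encoding=None):
    """Strings are checked first, so a str without an encoding is an error."""
    if isinstance(arg, str):
        if encoding is None:  # the special case under discussion
            raise TypeError("string argument without an encoding")
        return bytearray(arg.encode(encoding))
    if encoding is not None:
        raise TypeError("encoding or errors without a string argument")
    return _remaining_cases(arg)


def proposed_dispatch(arg, encoding=None):
    """The encoding is checked first; a bare str falls through as an iterable."""
    if encoding is not None:
        if not isinstance(arg, str):
            raise TypeError("encoding or errors without a string argument")
        return bytearray(arg.encode(encoding))
    return _remaining_cases(arg)
```

Under this model `proposed_dispatch("")` returns an empty bytearray, while `proposed_dispatch("abc")` still fails because its items are not ints.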

    @birkenfeld
    Member

    Python is not (e.g.) Haskell; Python strings are not lists whose contents happen to be characters. Allowing an empty string here is a step backwards in the direction of "why not allow any string whose contents have an unambiguous meaning as bytes", i.e. the default encoding ASCII in Python 2.x. Passing a string where bytes are expected is a programming error, and it should be rewarded with an exception, no matter if the string happens to be empty or not.
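
In practice this means encoding the text explicitly before assigning it. A minimal sketch, assuming Python 3 and an ASCII-only string:

```python
b = bytearray(b"fooooooo")
text = "foo"

# Choose the encoding explicitly; the slice assignment then receives bytes.
b[3:4] = text.encode("ascii")
```

The one-byte slice is replaced by three bytes, so `b` grows from 8 to 10 bytes and becomes `bytearray(b'foofoooooo')`.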

    @birkenfeld birkenfeld self-assigned this Sep 4, 2010
    @ezio-melotti
    Member

    >>> a[:] = -1000000000000000000000 # This should raise ValueError, not TypeError.
    >>> a[:] = 1000000000000000000000 # This should raise OverflowError, not TypeError.

    FTR, these two now raise OverflowError.

    @birkenfeld birkenfeld assigned ezio-melotti and unassigned birkenfeld Oct 6, 2012
    @ezio-melotti
    Member

    Updated patch against default.

    @ezio-melotti
    Member

    The new patch further improves the tests and error messages, checking for both numbers and strings:

    >>> b = bytearray(b'fooooooo')
    >>> b[3:4] = 'foo'
    TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
    >>> b[3:4] = 5
    TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
    >>> b[3:4] = 5.2
    TypeError: can assign only bytes, buffers, or iterables of ints in range(0, 256)
    >>> b[3:4] = None
    TypeError: 'NoneType' object is not iterable

    Before the patch these errors were reported instead:

    >>> b = bytearray(b'fooooooo')
    >>> b[3:4] = 'foo'  # can't provide encoding here
    TypeError: string argument without an encoding
    >>> b[3:4] = 5  # this "worked"
    >>> b[3:4] = 5.2
    TypeError: 'float' object is not iterable
    >>> b[3:4] = None
    TypeError: 'NoneType' object is not iterable
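
The patched behaviour can be checked with a small helper. This is a sketch that assumes an interpreter containing the fix; `try_slice_assign` is a name made up for this example.

```python
def try_slice_assign(value):
    """Assign value to b[3:4] of a fresh bytearray.

    Returns the resulting bytearray, or the exception class name
    if the assignment raises TypeError.
    """
    b = bytearray(b"fooooooo")
    try:
        b[3:4] = value
    except TypeError as exc:
        return type(exc).__name__
    return b


# Strings, numbers and None are all rejected with TypeError.
rejected = [try_slice_assign(v) for v in ("foo", 5, 5.2, None)]

# Bytes, iterables of ints and buffers are accepted.
accepted = [
    try_slice_assign(v) for v in (b"XY", [88, 89], memoryview(b"XY"))
]
```

Each accepted assignment yields `bytearray(b'fooXYoooo')`.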

    @mdickinson
    Member

    The patch looks fine to me.

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 3, 2012

    New changeset 1bd2b272c568 by Ezio Melotti in branch '2.7':
    bpo-8401: assigning an int to a bytearray slice (e.g. b[3:4] = 5) now raises an error.
    http://hg.python.org/cpython/rev/1bd2b272c568

    New changeset 8f00af8abaf9 by Ezio Melotti in branch '3.2':
    bpo-8401: assigning an int to a bytearray slice (e.g. b[3:4] = 5) now raises an error.
    http://hg.python.org/cpython/rev/8f00af8abaf9

    New changeset 06577f6b1c99 by Ezio Melotti in branch '3.3':
    bpo-8401: merge with 3.2.
    http://hg.python.org/cpython/rev/06577f6b1c99

    New changeset db40752c6cc7 by Ezio Melotti in branch 'default':
    bpo-8401: merge with 3.3.
    http://hg.python.org/cpython/rev/db40752c6cc7

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 3, 2012

    New changeset a7ebc0db5c18 by Ezio Melotti in branch 'default':
    Merge typo fixes (and the fix for bpo-8401 that I wrongly merged) with 3.3.
    http://hg.python.org/cpython/rev/a7ebc0db5c18

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022