bytearray methods center, ljust, rjust don't accept a bytearray as the fill character #56589

Closed
py-user mannequin opened this issue Jun 21, 2011 · 16 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (a feature request or enhancement)

Comments

@py-user
Mannequin

py-user mannequin commented Jun 21, 2011

BPO 12380
Nosy @rhettinger, @terryjreedy, @ncoghlan, @pitrou, @vstinner, @bitdancer, @py-user, @akheron
Files
  • c_format_bytearray.patch: Allow bytearray for 'c' format
  • c_format_bytearray_plus_additional_tests.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2011-07-30.04:13:44.979>
    created_at = <Date 2011-06-21.03:50:10.369>
    labels = ['interpreter-core', 'type-feature']
    title = "bytearray methods center, ljust, rjust don't accept a bytearray as the fill character"
    updated_at = <Date 2011-07-30.04:13:44.978>
    user = 'https://github.com/py-user'

    bugs.python.org fields:

    activity = <Date 2011-07-30.04:13:44.978>
    actor = 'eli.bendersky'
    assignee = 'none'
    closed = True
    closed_date = <Date 2011-07-30.04:13:44.979>
    closer = 'eli.bendersky'
    components = ['Interpreter Core']
    creation = <Date 2011-06-21.03:50:10.369>
    creator = 'py.user'
    dependencies = []
    files = ['22764', '22779']
    hgrepos = []
    issue_num = 12380
    keywords = ['patch', 'needs review']
    message_count = 16.0
    messages = ['138769', '138784', '138805', '138807', '138809', '138811', '138841', '139373', '140495', '141118', '141128', '141145', '141253', '141264', '141326', '141327']
    nosy_count = 11.0
    nosy_names = ['rhettinger', 'terry.reedy', 'ncoghlan', 'pitrou', 'vstinner', 'r.david.murray', 'eli.bendersky', 'docs@python', 'py.user', 'python-dev', 'petri.lehtinen']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12380'
    versions = ['Python 3.3']

    @py-user
    Mannequin Author

    py-user mannequin commented Jun 21, 2011

    >>> bytearray(b'abc').rjust(10, b'*')
    bytearray(b'*******abc')
    >>> bytearray(b'abc').rjust(10, bytearray(b'*'))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: must be a byte string of length 1, not bytearray
    >>>

    @py-user py-user mannequin added the type-bug and interpreter-core labels Jun 21, 2011
    @py-user py-user mannequin changed the title from "bytearray center, ljust, rjust don't accept a bytearray as the fill character" to "bytearray methods center, ljust, rjust don't accept a bytearray as the fill character" Jun 21, 2011
    @bitdancer
    Member

    What's the use case? I'm inclined to reject this as not needed.

    @bitdancer bitdancer added the type-feature label and removed the type-bug label Jun 21, 2011
    @py-user
    Mannequin Author

    py-user mannequin commented Jun 21, 2011

    All other methods support it, and that is the right behavior.

    >>> barr = bytearray(b'abcd*')
    >>> barr.center(len(barr) * 4, barr[-1:])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: must be a byte string of length 1, not bytearray
    >>> b = b'abcd*'
    >>> b.center(len(b) * 4, b[-1:])
    b'*******abcd*********'
    >>>

    @bitdancer
    Member

    A bytearray is for working with mutable data. We don't support using it in all places that the non-mutable data types can be used. You can code your example like this:

    barr.center(len(barr) * 4, bytes([barr[-1]]))

    I realize that isn't particularly pretty, but that has more to do with the fact that indexing bytes gives you ints in Python 3 than it does with whether or not bytearray is accepted.
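
The difference between indexing and slicing that the workaround relies on can be sketched as follows. This is a minimal illustration reusing the `barr` value from the example above; the other variable names are made up for the sketch.

```python
barr = bytearray(b'abcd*')

# Indexing a bytes-like object in Python 3 yields an int, not a 1-byte object.
last_int = barr[-1]

# Slicing keeps the container type, so this is a bytearray of length 1.
last_slice = barr[-1:]

# Wrapping the int in a list and passing it to bytes() gives a 1-byte bytes
# object, which the fill argument accepts on every Python 3 version.
fill = bytes([last_int])

padded = barr.center(len(barr) * 4, fill)
print(type(last_int).__name__, type(last_slice).__name__, fill, padded)
```

The result keeps the type of the object the method was called on (a bytearray here), regardless of the type used for the fill argument.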

    The data type of the arguments to the method have no necessary relationship with the datatype of the object.

    You may have more success arguing that the fill character for both bytearray and bytes should be allowed to be an int.

    I think this whole topic is better addressed in a forum such as python-ideas. I agree that the bytes interface is a bit wonky in places, but I think that if changes are going to be made a consensus needs to be developed on what changes to make. I believe some conversations about this have already taken place, and so far I don't think there are any consensus proposals.

    So, I'm going to close this issue. But please join (or start, if necessary) the discussion on this wider topic in the appropriate forum.

    @py-user
    Mannequin Author

    py-user mannequin commented Jun 21, 2011

    > A bytearray is for working with mutable data. We don't support using it in all places that the non-mutable data types can be used.

    >>> bytearray(b'abcd').strip(bytearray(b'da'))
    bytearray(b'bc')
    >>>

    .translate, .find, .partition, ...

    >>> bytearray(b'.').join((bytearray(b'a'), bytearray(b'b')))
    bytearray(b'a.b')
    >>> bytearray(b'.').join((b'a', b'b'))
    bytearray(b'a.b')
    >>>

    all these methods could use only bytes objects
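
A short sketch of the other methods named above (`translate`, `find`, `partition`) taking bytearray arguments. The sample data and variable names are assumptions chosen for illustration.

```python
data = bytearray(b'key=value')
sep = bytearray(b'=')

# find and partition accept a bytearray argument just as they accept bytes.
position = data.find(sep)
head, middle, tail = data.partition(sep)

# translate takes a 256-byte table; maketrans returns bytes, and a
# bytearray copy of that table is accepted as well.
table = bytearray(bytes.maketrans(b'abc', b'xyz'))
translated = bytearray(b'abc').translate(table)

print(position, head, middle, tail, translated)
```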

    @bitdancer
    Member

    All right, let's get some other opinions from people who have actually worked with the bytearray and bytes code (and Terry because he cares about APIs).

    @bitdancer bitdancer reopened this Jun 22, 2011
    @terryjreedy
    Member

    After thinking about this awhile, I see the key sentence of David's reply as "The data type of the arguments to the method have no necessary relationship with the datatype of the object." While true in general, it is not true with respect to corresponding text (string) and byte(array) methods. String parameters of string methods become byte parameters of byte(array) methods. On the other hand, I think I agree with David's application to byte versus bytearray methods. I might change my mind after further examination of the methods in question. But for the present, I would not change the code.

    Or would I? Here is a reason not to change. Example:

    def right_justified():
        for byt in (b'abc', bytearray(b'cdef'), b'xye'):
            yield byt.rjust(10, b'-')

    Making the type of constant args depend on the type of the base object would make generic byte/bytearray functions more difficult. We already have this problem with writing functions that work with bytes and text in 3.x. It is a big nuisance that is only justified by the benefits of not mixing bytes and text. I do not think we should extend the nuisance to byte and bytearray functions, especially without a strong use case.
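
The concern about generic functions can be made concrete with a small sketch. The function name `right_justify_all` and its parameters are hypothetical. The point is that one bytes constant serves as the fill for both bytes and bytearray inputs.

```python
def right_justify_all(chunks, width=10, fill=b'-'):
    """Right-justify each chunk with the same bytes fill constant."""
    for chunk in chunks:
        # The fill is always bytes; the result keeps the type of each chunk.
        yield chunk.rjust(width, fill)

results = list(right_justify_all([b'abc', bytearray(b'cdef'), b'xye']))
for item in results:
    print(type(item).__name__, item)
```

If the fill had to match the type of each chunk, the function would need to build a separate constant per input type.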

    I marked this for 'documentation' because I think the doc for some of the str methods might be improved and that the reference to them in the bytes/bytearray section definitely needs more. Doc changes would apply to 3.2 also.

    "Bytes and bytearray objects, being “strings of bytes”, have all methods found on strings, with the exception ... "

    should be followed by something like:

    "If the string method has a string parameter, the corresponding byte/bytearray method has a corresponding byte parameter."

    (to match the reported current behavior).

    I have not yet looked at doc strings. I did not unmark 'Interpreter core' because I have not looked at all of p.u's examples to be sure that I like *all* of the current behaviors.

    @terryjreedy terryjreedy added the docs label Jun 23, 2011
    @pitrou
    Member

    pitrou commented Jun 28, 2011

    I do agree it is a nuisance that it doesn't work with bytearray instances. After all, these methods are supposed to be homogeneous, and they are when called on a str or bytes object.

    @pitrou pitrou removed the docs label Jun 28, 2011
    @elibendersky
    Mannequin

    elibendersky mannequin commented Jul 16, 2011

    On one hand, I agree that the situation isn't intuitive. Why should some methods of bytearray accept bytearrays while others do not?

    On the other hand, this actually has rather deep implementation reasons.

    Methods like 'translate' are implemented in Objects/bytearrayobject.c.

    By contrast, ljust, rjust and center are taken from stringlib. Now, stringlib is generic code, and has some strict argument checking. For example, in stringlib_ljust:

        if (!PyArg_ParseTuple(args, "n|c:ljust", &width, &fillchar))
            return NULL;

    The 'c' format to PyArg_ParseTuple expects an object that passes PyBytes_Check, in other words a bytes object or a subclass thereof. bytearray is not a subclass of bytes, hence the problem.
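
The type relationship behind that check can be observed from Python. This is only an illustration of the class hierarchy, not of the C code itself; `MyBytes` is a hypothetical subclass introduced for the sketch.

```python
class MyBytes(bytes):
    """Hypothetical bytes subclass; instances pass an isinstance check for bytes."""

# bytearray and bytes are sibling types: neither derives from the other.
bytearray_is_bytes_subclass = issubclass(bytearray, bytes)
bytearray_instance_is_bytes = isinstance(bytearray(b'*'), bytes)

# A real subclass of bytes counts as bytes, so it works as a fill character.
subclass_instance_is_bytes = isinstance(MyBytes(b'*'), bytes)
padded = b'abc'.rjust(10, MyBytes(b'*'))

print(bytearray_is_bytes_subclass, bytearray_instance_is_bytes,
      subclass_instance_is_bytes, padded)
```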

    The solution could be global, to allow bytearray to fit the 'c' format of PyArg_ParseTuple. Then one would also be able to pass a bytearray into other stringlib methods requiring the 'c' format.

    One way or the other, this is of course doable. A decision has to be made though - is the nuisance annoying enough to warrant such an API change?

    @akheron
    Member

    akheron commented Jul 25, 2011

    > The solution could be global, to allow bytearray to fit the 'c' format of
    > PyArg_ParseTuple. Then one would also be able to pass a bytearray into
    > other stringlib methods requiring the 'c' format.

    Another possibility would be to change the 'c' format so that it accepts any object that supports the buffer protocol and whose buffer length is 1.

    Attaching two patches: The first allows bytes and bytearray, the second allows any object that supports the buffer protocol.
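
The two acceptance rules can be modelled in Python to show how they differ. This is a rough model only, not the attached C patches; `accepts_bytes_or_bytearray` and `accepts_any_buffer` are hypothetical names.

```python
import array

def accepts_bytes_or_bytearray(obj):
    """Model of the narrower rule: a bytes or bytearray object of length 1."""
    return isinstance(obj, (bytes, bytearray)) and len(obj) == 1

def accepts_any_buffer(obj):
    """Model of the broader rule: any buffer-protocol object holding 1 byte."""
    try:
        view = memoryview(obj)
    except TypeError:
        # The object does not support the buffer protocol at all.
        return False
    return view.nbytes == 1

candidates = [b'*', bytearray(b'*'), memoryview(b'*'), array.array('B', [42]), '*']
for obj in candidates:
    print(repr(obj), accepts_bytes_or_bytearray(obj), accepts_any_buffer(obj))
```

Under the broader rule, objects such as a memoryview or a one-element `array.array('B')` would also qualify as a fill character.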

    @pitrou
    Member

    pitrou commented Jul 25, 2011

    c_format_bytearray.patch looks ok to me. The other proposal is too broad, and may lead to confusing behaviour.
    In any case, some tests are needed.

    @akheron
    Member

    akheron commented Jul 26, 2011

    Updated the bytearray patch to change documentation and add tests.

    @elibendersky
    Mannequin

    elibendersky mannequin commented Jul 27, 2011

    Looks good. How about also adding some tests for the original request of supporting bytearrays in ljust/rjust/center?

    @akheron
    Member

    akheron commented Jul 27, 2011

    Updated the patch to add tests for {bytes,bytearray}.{center,ljust,rjust}. The tests check that both bytes and bytearray are always accepted as the fill character.
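
A check along these lines might look like the sketch below. It is not the test code from the patch; `check_fill_types` is a hypothetical helper, and it assumes Python 3.3 or later, where the change applies.

```python
import itertools

def check_fill_types(width=7):
    """Call center/ljust/rjust for every bytes/bytearray pairing of object and fill."""
    results = {}
    for obj_type, fill_type in itertools.product((bytes, bytearray), repeat=2):
        obj = obj_type(b'abc')
        fill = fill_type(b'-')
        for name in ('center', 'ljust', 'rjust'):
            out = getattr(obj, name)(width, fill)
            # The result type follows the object, never the fill argument.
            assert type(out) is obj_type
            results[(obj_type.__name__, fill_type.__name__, name)] = bytes(out)
    return results

results = check_fill_types()
print(len(results), "combinations accepted")
```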

    @python-dev
    Mannequin

    python-dev mannequin commented Jul 29, 2011

    New changeset 536fccc75f5a by Eli Bendersky in branch 'default':
    Issue bpo-12380: PyArg_ParseTuple now accepts a bytearray for the 'c' format.
    http://hg.python.org/cpython/rev/536fccc75f5a

    @elibendersky
    Mannequin

    elibendersky mannequin commented Jul 29, 2011

    Petri, thanks for the patch. I've updated Misc/NEWS and committed it.

    Unless there are objections or problems, I will close this issue in a day or two.

    @elibendersky elibendersky mannequin closed this as completed Jul 30, 2011
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022