%b format for bytes does not support objects that follow the buffer protocol #73042

Closed
abalkin opened this issue Dec 2, 2016 · 12 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@abalkin
Member

abalkin commented Dec 2, 2016

BPO 28856
Nosy @abalkin, @skrah, @ethanfurman, @serhiy-storchaka, @zhangyangyu
PRs
  • bpo-28856: Let %b format for bytes support objects that follow the buffer protocol #546
  • [3.6] bpo-28856: Let %b format for bytes support objects that follow the buffer protocol  #664
  • [Do Not Merge] Sample of CPython life with blurb. #703
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-03-14.07:36:34.945>
    created_at = <Date 2016-12-02.02:30:44.266>
    labels = ['interpreter-core', 'type-bug', '3.7']
    title = '%b format for bytes does not support objects that follow the buffer protocol'
    updated_at = <Date 2017-03-24.22:20:23.349>
    user = 'https://github.com/abalkin'

    bugs.python.org fields:

    activity = <Date 2017-03-24.22:20:23.349>
    actor = 'xiang.zhang'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-03-14.07:36:34.945>
    closer = 'xiang.zhang'
    components = ['Interpreter Core']
    creation = <Date 2016-12-02.02:30:44.266>
    creator = 'belopolsky'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 28856
    keywords = []
    message_count = 12.0
    messages = ['282215', '289159', '289172', '289173', '289520', '289533', '289540', '289542', '289570', '289571', '290187', '290188']
    nosy_count = 5.0
    nosy_names = ['belopolsky', 'skrah', 'ethan.furman', 'serhiy.storchaka', 'xiang.zhang']
    pr_nums = ['546', '664', '703']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue28856'
    versions = ['Python 3.6', 'Python 3.7']

    @abalkin
    Member Author

    abalkin commented Dec 2, 2016

    Python 3.7.0a0 (default:be70d64bbf88, Dec  1 2016, 21:21:25)
    [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from array import array
    >>> a = array('B', [1, 2])
    >>> b'%b' % a
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: %b requires bytes, or an object that implements __bytes__, not 'array.array'
    >>> m = memoryview(a)
    >>> b'%b' % m
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: %b requires bytes, or an object that implements __bytes__, not 'memoryview'

    According to the documentation [1], objects that follow the buffer protocol should be supported. Both array.array and memoryview follow the buffer protocol.

    See also bpo-20284 and PEP-461.
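For comparison, here is a minimal sketch of the behavior the report asks for. It assumes an interpreter that already includes the fix from this issue; the variable names are illustrative only.

```python
from array import array

# Assumption: this runs on a Python build that contains the fix for this issue.
a = array('B', [1, 2])

# Both operands expose the buffer protocol, so %b should accept them.
from_array = b'%b' % a
from_view = b'%b' % memoryview(a)

print(from_array, from_view)
```

On an unfixed interpreter, both formatting operations raise the TypeError shown in the session above.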

    @abalkin abalkin added the type-bug An unexpected behavior, bug, or error label Dec 2, 2016
    @zhangyangyu zhangyangyu added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Mar 7, 2017
    @serhiy-storchaka
    Member

    printf-style bytes formatting was added mainly to improve compatibility with Python 2, and it was restricted mostly to features that already existed in Python 2.

    '%s' formatting in Python 2 supports bytes-like objects partially:

    >>> b'%s' % array('B', [1, 2])
    "array('B', [1, 2])"
    >>> b'%s' % buffer(array('B', [1, 2]))
    '\x01\x02'
    >>> b'%s' % memoryview(array('B', [1, 2]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: cannot make memory view because object does not have the buffer interface
    >>> b'%s' % bytearray(b'abc')
    'abc'
    >>> b'%s' % buffer(bytearray(b'abc'))
    'abc'
    >>> b'%s' % memoryview(bytearray(b'abc'))
    '<memory at 0xb70902ac>'

    I don't know whether printf-style bytes formatting needs to support the buffer protocol. bytearray is already supported, buffer() doesn't exist in Python 3, and memoryview() is not supported in Python 2, so this doesn't seem to add anything for compatibility.
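The session above relies on buffer(), so it can only run under Python 2. As a rough Python 3 counterpart, the sketch below shows that %s in bytes formatting behaves as an alias for %b (per PEP 461), that bytearray is accepted, and that a str operand is rejected. The variable names are made up for illustration.

```python
# Python 3 only: %s in bytes formatting is treated like %b.
payload = bytearray(b'abc')
with_s = b'%s' % payload
with_b = b'%b' % payload

# A str is neither bytes-like nor does it define __bytes__, so it is rejected.
try:
    b'%s' % 'abc'
    str_rejected = False
except TypeError:
    str_rejected = True

print(with_s, with_b, str_rejected)
```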

    @zhangyangyu
    Member

    Isn't this a discussed behaviour that is explicitly documented in PEP-461?

    @skrah
    Mannequin

    skrah mannequin commented Mar 7, 2017

    For '%b', it looks like the PEP supports it. I didn't follow the PEP discussions, so I think Ethan will know more.

    @zhangyangyu
    Member

    What are your opinions, Alexander and Ethan?

    @serhiy-storchaka
    Member

    Sometimes the implementation can expose drawbacks of the initial design. I don't know whether there was a good reason for omitting support for the buffer protocol (in that case the PEP should be updated) or whether this is just an oversight. We should ask Ethan about this.

    The change proposed by Xiang looks correct, but it is not very efficient: it makes one redundant copy of the data. A more efficient implementation would complicate the code, and that could hurt the performance of other cases.
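To make the "redundant copy" concrete, here is a rough Python model of how %b might obtain its operand. This is not the CPython code, which is written in C. The helper name percent_b_operand and the order of the checks are assumptions made for illustration.

```python
from array import array


def percent_b_operand(obj):
    """Hypothetical model: return the bytes that %b would insert for obj."""
    if isinstance(obj, bytes):
        return obj  # already bytes, nothing to copy
    to_bytes = getattr(type(obj), '__bytes__', None)
    if to_bytes is not None:
        return to_bytes(obj)  # the object converts itself
    try:
        view = memoryview(obj)  # succeeds only for buffer-protocol objects
    except TypeError:
        raise TypeError(
            "%b requires a bytes-like object, or an object that "
            "implements __bytes__, not " + repr(type(obj).__name__)
        ) from None
    with view:
        # This intermediate bytes object is the extra copy: its contents
        # are copied once more into the final formatted result.
        return view.tobytes()


print(percent_b_operand(array('B', [1, 2])))
```

Avoiding the intermediate object would mean writing straight from the exporter's buffer into the output, which is the extra complexity mentioned above.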

    @ethanfurman
    Member

    I suspect it was a simple oversight, and should be added now. Since it's been missing for so long I think we should put it in 3.7, maybe put it in 3.6 (maybe not, since it has two point releases out now), but definitely not in 3.5.

    @abalkin
    Member Author

    abalkin commented Mar 13, 2017

    @xiang.zhang - I am the OP for this issue, so naturally I expect this to be fixed. I have a work-around in place for my own code, so I have no opinion on the particular versions. I guess the normal policy on bug fixes should apply.

    @serhiy-storchaka
    Member

    The following example copies the entire buffer object even though only a small part needs to be copied:

        m = memoryview(b'x'*10**6)
        b'%.100b' % m

    I don't know whether this is an important use case that is worth optimizing. The workaround is to slice the buffer rather than truncate in the format:

    b'%b' % m[:100]
    

    Or, in the case of a general buffer object:

    b'%b' % memoryview(m).cast('B')[:100]
    

    But in that case it is not hard to add an explicit conversion to bytes.

    b'%b' % bytes(memoryview(m).cast('B')[:100])
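The workarounds above can be checked together in one short script. This is only a sketch: it assumes an interpreter that includes the fix, and the array of unsigned ints stands in for "a general buffer object".

```python
from array import array

big = memoryview(b'x' * 10**6)

# Truncating in the format: the whole buffer may be copied before truncation.
truncated_in_format = b'%.100b' % big
# Slicing first: only the 100-byte slice is handed to the formatter.
sliced_first = b'%b' % big[:100]

# A buffer whose items are not single bytes: cast to 'B' before slicing.
words = array('I', [1, 2, 3])
head = memoryview(words).cast('B')[:4]
explicit = b'%b' % bytes(head)  # explicit conversion, works without the fix too

print(len(truncated_in_format), len(sliced_first), len(explicit))
```

Both forms produce the same 100 bytes. The difference is only in how much data is copied along the way.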
    

    @zhangyangyu
    Member

    I have committed the suboptimal patch. I am closing this issue now; if anyone has a further enhancement, let's open another issue for it. Thank you all.

    @zhangyangyu
    Member

    New changeset faa2cc6 by Xiang Zhang in branch '3.6':
    bpo-28856: Let %b format for bytes support objects that follow the buffer protocol (GH-664)
    faa2cc6

    @zhangyangyu
    Member

    New changeset 7e2a54c by Xiang Zhang in branch 'master':
    bpo-28856: Let %b format for bytes support objects that follow the buffer protocol (GH-546)
    7e2a54c

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022