classification
Title: add "buffer protocol" to glossary
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: chris.jerdonek, docs@python, eric.araujo, ezio.melotti, flox, georg.brandl, pitrou, python-dev, r.david.murray, rhettinger, serhiy.storchaka, skrah, terry.reedy
Priority: normal Keywords: patch

Created on 2012-11-21 06:03 by chris.jerdonek, last changed 2014-10-31 19:08 by ezio.melotti. This issue is now closed.

Files
File name Uploaded Description Edit
issue16518.diff ezio.melotti, 2013-04-29 17:32 review
issue16518-2.diff ezio.melotti, 2013-05-01 11:40 Patch to use "bytes-like object" throughout the docs review
issue16518-3.diff ezio.melotti, 2013-05-04 15:27
issue16518-4.diff ezio.melotti, 2013-05-05 19:23
Messages (34)
msg176042 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-11-21 06:03
This issue is to add "buffer protocol" (or perhaps "buffer object") to the glossary.  The concept is currently described here:

http://docs.python.org/dev/c-api/buffer.html#buffer-protocol

Éric initially suggested doing this in the comments to issue 13538.

Such a glossary entry would be useful because the buffer protocol (or buffer object) should likely be cited, for example, wherever a function accepts a bytes object, bytearray object, or any object that supports the buffer protocol.  The str() constructor is one example where this is done:

http://hg.python.org/cpython/file/59acd5cac8b5/Doc/library/functions.rst#l1275

"Buffer object" might be the more useful term to add to the glossary because it would help to have a briefer way of saying "any object that supports the buffer protocol."  (I'm assuming this is what "buffer object" actually means.)

The patch for this issue should also do a comprehensive review of occurrences of buffer object/protocol throughout the docs and add or update links and index entries where appropriate.
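
For illustration, a minimal sketch of the str() case mentioned above, assuming Python 3: besides bytes and bytearray, the constructor also decodes other objects that export a buffer, such as memoryview or array.array. The names `sources` and `decoded` exist only for this example.

```python
import array

# Four objects that expose the same two bytes through the buffer protocol.
sources = [
    b"hi",
    bytearray(b"hi"),
    memoryview(b"hi"),
    array.array("B", [104, 105]),
]

# str(object, encoding) decodes each of them the same way.
decoded = [str(src, "ascii") for src in sources]
print(decoded)
```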
msg176238 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-11-23 20:16
I would use the term that is currently used in some error messages.
msg176242 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-23 20:26
"Buffer protocol" is the right term. "Buffer object" doesn't mean anything in Python 3 and, furthermore, it might be mixed up with the Python 2 `buffer` type.

As for the error messages, they are generally very bad on this topic, so I would vote to change them :-)
msg176244 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-11-23 20:28
Do we have a recommended (and preferably briefer) way of saying, "any object that supports the buffer protocol"?
msg176245 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-11-23 20:29
s/any//
msg176247 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-11-23 20:30
> "Buffer object" doesn't mean anything in Python 3 and, furthermore,
> it might be mixed up with the Python 2 `buffer` type.

Agreed.

> As for the error messages, they are generally very bad on this topic,
> so I would vote to change them :-)

I would say that they are verbose maybe, but not necessarily bad.
Using "any object that supports the buffer protocol" without explicitly mentioning bytes (and bytearray) might end up being even more confusing (if that's what is being proposed).
msg176248 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-23 20:32
> Do we have a recommended (and preferably briefer) way of saying, "any
> object that supports the buffer protocol"?

It depends where. There's no recommended way yet, but I would vote for
"bytes-like object" in error messages that are targeted at the average
developer.

The docs (glossary?) could explain that "bytes-like object" is the same
as "buffer-providing object" or "object implementing the buffer
protocol".
msg176249 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-11-23 20:33
> I would vote for "bytes-like object"

Sounds like a good compromise between brevity and clarity to me.
msg176251 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-11-23 20:49
I wouldn't use "bytes-like object". One can certainly argue that *memoryview*
should be bytes-like as a matter of preference, but the buffer protocol
specifies strongly (or even statically) typed multi-dimensional arrays.

PEP-3118 Py_buffer structs are essentially how NumPy works internally.
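
A small sketch of what "typed multi-dimensional arrays" looks like from Python, assuming Python 3.3 or later for memoryview.cast() with a shape argument. The variable names are only illustrative.

```python
import array

# An array of C doubles exports typed items, not plain bytes.
doubles = memoryview(array.array("d", [1.0, 2.0, 3.0]))
print(doubles.format, doubles.itemsize, doubles.ndim, doubles.shape)

# The same protocol can describe a multi-dimensional layout:
# here 12 bytes are viewed as a 3x4 grid of unsigned bytes.
grid = memoryview(bytes(range(12))).cast("B", shape=[3, 4])
print(grid.ndim, grid.shape, grid.tolist())
```

Both views are buffers in the PEP 3118 sense, yet neither is something most people would call "bytes-like".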
msg176252 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-23 20:58
> I wouldn't use "bytes-like object". One can certainly argue that *memoryview*
> should be bytes-like as a matter of preference, but the buffer protocol
> specifies strongly (or even statically) typed multi-dimensional arrays.

Ach :-(

> PEP-3118 Py_buffer structs are essentially how NumPy works internally.

Well, we should still write a Python documentation, not a NumPy
documentation (on this tracker anyway). Outside of NumPy, there's little
use for multi-dimensional objects.
msg176253 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-11-23 21:18
> I wouldn't use "bytes-like object".

What about "buffer-like object"?
msg176254 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-23 21:21
> > I wouldn't use "bytes-like object".
> 
> What about "buffer-like object"?

"buffer-like" means "like a buffer" which is wrong on two points:
- "buffer" is not defined at this point, so the user doesn't understand
what it means
- we are not talking about an object which is "like a buffer", but which
"provides a buffer"
msg176256 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-11-23 21:54
Antoine Pitrou <report@bugs.python.org> wrote:
> > PEP-3118 Py_buffer structs are essentially how NumPy works internally.
> 
> Well, we should still write a Python documentation, not a NumPy
> documentation (on this tracker anyway). Outside of NumPy, there's little
> use for multi-dimensional objects.

Ok, but people should not be surprised if their (Python) array.array() of
double or their array of ctypes structs is silently accepted by some byte
consuming function.

How about "object does not provide a byte buffer" for error messages
and "(byte) buffer provider" as a shorthand for "any buffer provider
that exposes its memory as a sequence of unsigned bytes in response
to a PyBUF_SIMPLE request"?
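
To make the "silently accepted" case concrete, here is a rough sketch, assuming Python 3 and using bytes.join() as the byte-consuming function. The comparison with struct.pack() is only there to show that the raw memory of the double is what gets copied.

```python
import array
import struct

# One C double, stored in native byte order.
values = array.array("d", [1.0])

# bytes.join() only asks for a simple byte buffer, so the array is
# accepted and its raw memory is copied as-is.
joined = b"".join([values])
print(len(joined), joined == struct.pack("d", 1.0))
```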
msg176257 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-11-23 21:57
> > Well, we should still write a Python documentation, not a NumPy
> > documentation (on this tracker anyway). Outside of NumPy, there's little
> > use for multi-dimensional objects.
> 
> Ok, but people should not be surprised if their (Python) array.array() of
> double or their array of ctypes structs is silently accepted by some byte
> consuming function.

Probably. My own (humble :-)) opinion is that array.array() is a
historical artifact, and its use doesn't seem to be warranted in modern
Python code. ctypes is obviously a very special library, and not for the
faint of heart.

> How about "object does not provide a byte buffer" for error messages
> and "(byte) buffer provider" as a shorthand for "any buffer provider
> that exposes its memory as a sequence of unsigned bytes in response
> to a PyBUF_SIMPLE request"?

It's not too bad, I think. However, what I think is important is that
the average (non-expert) Python developer understand that the function
really accepts a bytes object, and other similar types (because, really,
bytes is the only bytes-like type most developers will ever face).
That's why I'm proposing "bytes-like object".
msg176262 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012-11-23 23:18
Antoine Pitrou <report@bugs.python.org> wrote:
> > How about "object does not provide a byte buffer" for error messages
> > and "(byte) buffer provider" as a shorthand for "any buffer provider
> > that exposes its memory as a sequence of unsigned bytes in response
> > to a PyBUF_SIMPLE request"?
> 
> It's not too bad, I think. However, what I think is important is that
> the average (non-expert) Python developer understand that the function
> really accepts a bytes object, and other similar types (because, really,
> bytes is the only bytes-like type most developers will ever face).
> That's why I'm proposing "bytes-like object".

If it is somehow possible to establish the term as a shorthand for the real
meaning, then I guess it's the most economical option for documenting Python
methods (I don't think it should be used in the C-API docs though).

help (b''.join) for example would sound better with "bytes-like object"
than with "(byte) buffer provider".
msg176264 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-11-24 01:33
> > That's why I'm proposing "bytes-like object".
>
> If it is somehow possible to establish the term as a shorthand for the real
> meaning,

This can be established via the glossary.  We can still use "buffer provider" for the general case, if we find that it is useful in certain circumstances.
msg177801 - (view) Author: Chris Jerdonek (chris.jerdonek) * (Python committer) Date: 2012-12-20 07:39
After this issue is resolved, the binascii docs can be updated as suggested in issue 16724.
msg188078 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-04-29 17:32
Here's a patch that adds "bytes-like object" to the glossary, links to the buffer protocol docs[0] and provides bytes and bytearray as examples.

[0]: http://docs.python.org/dev/c-api/buffer.html#buffer-protocol
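
As a rough illustration of the link between the glossary term and the buffer protocol, the sketch below checks whether memoryview() can wrap an object. The helper name `supports_buffer_protocol` is hypothetical and not part of the patch or of the standard library; note that it tests for buffer support in general, which is broader than what many functions mean by "bytes-like".

```python
import array

def supports_buffer_protocol(obj):
    """Hypothetical helper: True if memoryview() can wrap obj."""
    try:
        # The with-block releases the view again right away.
        with memoryview(obj):
            return True
    except TypeError:
        return False

for candidate in (b"abc", bytearray(b"abc"), array.array("B", [1, 2]), "abc", 42):
    print(type(candidate).__name__, supports_buffer_protocol(candidate))
```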
msg188183 - (view) Author: Roundup Robot (python-dev) Date: 2013-04-30 20:35
New changeset 474f28bf67b3 by Ezio Melotti in branch '3.3':
#16518: add "bytes-like object" to the glossary.
http://hg.python.org/cpython/rev/474f28bf67b3

New changeset 747cede24367 by Ezio Melotti in branch 'default':
#16518: merge with 3.3.
http://hg.python.org/cpython/rev/747cede24367

New changeset 1b92a0112f5d by Ezio Melotti in branch '2.7':
#16518: add "bytes-like object" to the glossary.
http://hg.python.org/cpython/rev/1b92a0112f5d
msg188207 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-01 11:13
New changeset d1aa8a9eba44 by Ezio Melotti in branch '2.7':
#16518: fix links in glossary entry.
http://hg.python.org/cpython/rev/d1aa8a9eba44
msg188208 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-05-01 11:40
The attached patch replaces things like "objects that support the buffer protocol/interface/API" with "bytes-like objects" throughout the docs.
The patch doesn't change error messages/docstrings.

I also noticed that on 2.7[0], the section about the buffer protocol in Doc/c-api/buffer.rst is called "Buffers and Memoryview Objects" and it's not as clear as the one on 3.x[1].  Should this section be backported?

[0]: http://docs.python.org/2.7/c-api/buffer.html#bufferobjects
[1]: http://docs.python.org/dev/c-api/buffer.html#bufferobjects
msg188209 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-01 11:42
> I also noticed that on 2.7[0], the section about the buffer protocol
> in Doc/c-api/buffer.rst is called "Buffers and Memoryview Objects" and
> it's not as clear as the one on 3.x[1].  Should this section be
> backported?

The "buffer protocol" situation is different on 2.x, please let's
concentrate on 3.x :-)
msg188368 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-04 15:07
New changeset 003e4eb92683 by Ezio Melotti in branch '3.3':
#16518: use "bytes-like object" throughout the docs.
http://hg.python.org/cpython/rev/003e4eb92683

New changeset d4912244cce6 by Ezio Melotti in branch 'default':
#16518: merge with 3.3.
http://hg.python.org/cpython/rev/d4912244cce6
msg188369 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-05-04 15:27
The attached patch uses "bytes-like objects" in the error messages.
msg188404 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-04 20:23
> The attached patch uses "bytes-like objects" in the error messages.

I'm surprised your patch doesn't touch Python/getargs.c.
msg188406 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-05-04 20:48
FWIW I was grepping for buffer protocol/interface/api, and then double-checking for "buffer" in the resulting files.  Python/getargs.c doesn't seem to mention the buffer protocol/interface/api at all.
msg188453 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-05-05 19:23
Updated patch to include getargs.c too.
msg188484 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013-05-06 03:00
At first reading, it looks like matters were made more confusing with "bytes-like object" as a defined term.
msg188485 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2013-05-06 03:03
Can you elaborate?
msg228585 - (view) Author: Roundup Robot (python-dev) Date: 2014-10-05 15:48
New changeset e7e8a218737a by R David Murray in branch 'default':
#16518: Bring error messages in harmony with docs ("bytes-like object")
https://hg.python.org/cpython/rev/e7e8a218737a
msg228587 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-10-05 16:15
Committed the message changes to 3.5 only, since it will probably cause tests to fail in various projects, despite messages not being a formal part of the Python API.

Per IRC conversation with Ezio and Antoine, I posted a note to python-dev to let people know we now have a consistent terminology in the docs and error messages, and to provide a last opportunity for objections (it is easy enough to back the patch out if there is an outcry, but I don't expect one).
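
For projects worried about such failures, one possible approach is to assert on the exception type rather than on the message text. This is only a sketch, assuming Python 3 and unittest; the test class name is made up for the example.

```python
import unittest

class JoinRejectsStr(unittest.TestCase):
    def test_join_rejects_str(self):
        # Assert on the exception type only; the message text
        # differs between Python versions.
        with self.assertRaises(TypeError):
            b"".join([""])

suite = unittest.defaultTestLoader.loadTestsFromTestCase(JoinRejectsStr)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```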
msg228596 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-10-05 17:40
There are other unfixed messages (possibly introduced after 3.3):

>>> b''.join([''])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected bytes, bytearray, or an object with the buffer interface, str found
>>> str(42, 'utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: coercing to str: need bytes, bytearray or buffer-like object, int found
>>> import array; array.array('B').frombytes(array.array('I'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string/buffer of bytes required.
>>> import socket; print(socket.socket.sendmsg.__doc__)
sendmsg(buffers[, ancdata[, flags[, address]]]) -> count

Send normal and ancillary data to the socket, gathering the
non-ancillary data from a series of buffers and concatenating it into
a single message.  The buffers argument specifies the non-ancillary
data as an iterable of buffer-compatible objects (e.g. bytes objects).
The ancdata argument specifies the ancillary data (control messages)
as an iterable of zero or more tuples (cmsg_level, cmsg_type,
cmsg_data), where cmsg_level and cmsg_type are integers specifying the
protocol level and protocol-specific type respectively, and cmsg_data
is a buffer-compatible object holding the associated data.  The flags
argument defaults to 0 and has the same meaning as for send().  If
address is supplied and not None, it sets a destination address for
the message.  The return value is the number of bytes of non-ancillary
data sent.

And there are several mentions of "buffer-like" or "buffer-compatible" in the documentation.
msg228694 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2014-10-06 14:25
Please open a new issue for those.
msg230380 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2014-10-31 19:08
See #22581.
History
Date User Action Args
2016-04-16 16:14:57 serhiy.storchaka link issue17859 superseder
2014-10-31 19:08:28 ezio.melotti set messages: + msg230380
2014-10-06 14:25:25 georg.brandl set nosy: + georg.brandl; messages: + msg228694
2014-10-05 17:40:56 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg228596
2014-10-05 16:15:14 r.david.murray set status: open -> closed; nosy: + r.david.murray; messages: + msg228587; resolution: fixed; stage: commit review -> resolved
2014-10-05 15:48:01 python-dev set messages: + msg228585
2013-05-06 03:03:19 ezio.melotti set messages: + msg188485
2013-05-06 03:00:28 rhettinger set nosy: + rhettinger; messages: + msg188484
2013-05-05 19:23:38 ezio.melotti set files: + issue16518-4.diff; messages: + msg188453; stage: patch review -> commit review
2013-05-04 20:48:23 ezio.melotti set messages: + msg188406
2013-05-04 20:23:50 pitrou set messages: + msg188404
2013-05-04 15:27:32 ezio.melotti set files: + issue16518-3.diff; messages: + msg188369
2013-05-04 15:07:30 python-dev set messages: + msg188368
2013-05-01 11:42:15 pitrou set messages: + msg188209
2013-05-01 11:40:59 ezio.melotti set files: + issue16518-2.diff; messages: + msg188208
2013-05-01 11:13:20 python-dev set messages: + msg188207
2013-04-30 20:35:17 python-dev set nosy: + python-dev; messages: + msg188183
2013-04-29 17:32:03 ezio.melotti set files: + issue16518.diff; versions: + Python 2.7, - Python 3.2; messages: + msg188078; keywords: + patch; stage: needs patch -> patch review
2013-04-28 08:46:05 flox set nosy: + flox
2012-12-20 07:39:34 chris.jerdonek set messages: + msg177801
2012-11-24 01:33:26 chris.jerdonek set messages: + msg176264
2012-11-23 23:18:03 skrah set messages: + msg176262
2012-11-23 21:57:42 pitrou set messages: + msg176257
2012-11-23 21:54:13 skrah set messages: + msg176256
2012-11-23 21:21:06 pitrou set messages: + msg176254
2012-11-23 21:18:31 chris.jerdonek set messages: + msg176253
2012-11-23 20:58:44 pitrou set messages: + msg176252
2012-11-23 20:49:25 skrah set nosy: + skrah; messages: + msg176251
2012-11-23 20:33:42 ezio.melotti set messages: + msg176249
2012-11-23 20:32:34 pitrou set messages: + msg176248
2012-11-23 20:30:44 ezio.melotti set messages: + msg176247
2012-11-23 20:29:03 chris.jerdonek set messages: + msg176245
2012-11-23 20:28:29 chris.jerdonek set messages: + msg176244
2012-11-23 20:26:14 pitrou set messages: + msg176242
2012-11-23 20:16:02 terry.reedy set nosy: + terry.reedy; messages: + msg176238
2012-11-23 17:19:13 ezio.melotti set stage: needs patch
2012-11-21 06:03:27 chris.jerdonek create