classification
Title: getargs.c in Python3 contains some TODO and the documentation is outdated
Type: Stage:
Components: Interpreter Core Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: duplicate
Dependencies: Superseder:
Assigned To: Nosy List: barry, benjamin.peterson, brett.cannon, georg.brandl, vstinner
Priority: normal Keywords: patch

Created on 2010-03-23 22:29 by vstinner, last changed 2010-06-13 20:45 by vstinner. This issue is now closed.

Files
File name Uploaded Description
getarg_cleanup_buffer.patch vstinner, 2010-05-29 00:07
doc_capi_arg.patch vstinner, 2010-05-29 02:34
Messages (7)
msg101606 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-23 22:28
http://docs.python.org/py3k/c-api/arg.html contains some ambiguous (string or Unicode object) definitions: what is a string? What is a Unicode object? Is it a string or not? The problem is that the documentation was written for Python 2: the code was changed, but not the documentation. I think that it can be replaced by (unicode object), with a lowercase u, to be consistent with (bytes object).

---

There are two functions: getbuffer() and convertbuffer().

getbuffer(): pb=arg->ob_type->tp_as_buffer
 - if pb->bf_getbuffer is not NULL: call PyObject_GetBuffer(arg, view, PyBUF_SIMPLE) and PyBuffer_IsContiguous(view, 'C')
 - if pb->bf_getbuffer is NULL: call convertbuffer()

convertbuffer() calls PyObject_GetBuffer(arg, &view, PyBUF_SIMPLE).
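
The PyBuffer_IsContiguous(view, 'C') check mentioned above can be observed from Python through memoryview, which exposes the same PEP 3118 buffer metadata. The following is a minimal sketch for illustration, not code taken from getargs.c; the variable names are made up.

```python
# Illustration only: memoryview exposes the buffer metadata that
# PyBuffer_IsContiguous() inspects on the C side.
data = bytes(range(8))

whole = memoryview(data)   # one unbroken block of 8 bytes
strided = whole[::2]       # every second byte, so items are 2 bytes apart

print(whole.c_contiguous, whole.strides)
print(strided.c_contiguous, strided.strides)

# tobytes() copies the selected items into a new, contiguous bytes object.
packed = strided.tobytes()
print(packed == bytes([0, 2, 4, 6]))
```

A strided view like this is the kind of "discontiguous" buffer that a format requiring C contiguity would have to reject or copy.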

---

"s#", "y", "z" formats use convertbuffer().

"s", "y*", "z*" formats use getbuffer().

"t" format reimplements convertbuffer().

"w*" format calls PyObject_GetBuffer(arg, (Py_buffer*)p, PyBUF_WRITABLE) and PyBuffer_IsContiguous((Py_buffer*)p, 'C').

"w" and "w#" formats call PyObject_GetBuffer(arg, &view, PyBUF_SIMPLE).

I think that all these cases should be factored into a single function.
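
The difference between a plain (PyBUF_SIMPLE) request and a writable (PyBUF_WRITABLE) request can be illustrated from Python: a bytes object only exports read-only memory, while a bytearray exports writable memory. This is a sketch of the concept only, under the assumption that memoryview is an acceptable stand-in for a Py_buffer; it does not reproduce the getargs.c code paths.

```python
# Illustration only: read-only versus writable buffer exporters.
readonly_view = memoryview(b"abc")             # bytes is immutable
writable_view = memoryview(bytearray(b"abc"))  # bytearray is mutable

print(readonly_view.readonly)
print(writable_view.readonly)

# Writing through a writable view changes the exported data in place.
writable_view[0] = ord("x")
print(writable_view.tobytes())

# Writing through a read-only view is refused.
try:
    readonly_view[0] = ord("x")
    write_refused = False
except TypeError:
    write_refused = True
print(write_refused)
```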

Is it a bug, or do functions using the "s#", "y", "z" and "t" formats really support discontiguous buffers?

Related PEP: http://www.python.org/dev/peps/pep-3118/
msg101607 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-23 22:40
See also issue #2322.
msg102266 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-04-03 14:53
PEP 3118 describes some points about discontiguous buffers, but there is no module or third-party library supporting them.

PIL 1.1.7 (the latest version) doesn't support the buffer API: an image cannot be "exported" as a buffer, although PIL accepts a buffer as input in some functions.
msg104993 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-05 01:09
See also #8592.
msg106699 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-29 00:07
Patch to clean up getbuffer() and convertbuffer():
 - getbuffer() no longer calls convertbuffer() if pb->bf_getbuffer==NULL. In that case PyObject_GetBuffer() fails, so the call to convertbuffer() is useless.
 - convertbuffer() calls getbuffer() to check that the buffer is "C" contiguous (and to factor out the common code)
 - release the buffer if the buffer is not contiguous => fixes a bug
 - rename "errmsg" and "buf" to "expected" to reuse the term used by converterror()
 - Remove /* XXX Really? */: I don't understand the comment and the code looks ok
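
Why a missing release matters can be seen with bytearray, which refuses to be resized while a buffer export is outstanding. The sketch below uses memoryview to stand in for a Py_buffer acquired in C; it illustrates the general consequence of a leaked export and is not a reproduction of the bug fixed by the patch.

```python
# Illustration only: an unreleased buffer export keeps a bytearray locked.
target = bytearray(b"abc")
view = memoryview(target)      # acquires an export, like PyObject_GetBuffer()

try:
    target.append(0x64)        # resizing is refused while the export exists
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()                 # counterpart of PyBuffer_Release()
target.append(0x64)            # now the resize succeeds
print(resized_while_exported, target)
```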

The main change is that convertbuffer() now requires a "C" contiguous buffer. That change concerns "s#", "y", "z" and "t#" formats.

If a function would like to support non-contiguous buffers, it should use the "O" format and then call PyObject_GetBuffer() itself. I don't think that any built-in Python functions support non-contiguous buffers.
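
The behavior described in this message (require a C-contiguous buffer, and release it before reporting an error) can be modeled in Python as follows. The names get_contiguous_view and accepts, as well as the error messages, are hypothetical and chosen for illustration; the real logic is written in C in getargs.c.

```python
def get_contiguous_view(obj):
    """Return a memoryview of obj, or raise TypeError.

    Hypothetical Python model of the stricter check; not the C code.
    """
    try:
        view = memoryview(obj)          # fails if obj has no buffer interface
    except TypeError:
        raise TypeError("expected a buffer compatible object, not %s"
                        % type(obj).__name__) from None
    if not view.c_contiguous:
        view.release()                  # release before reporting the error
        raise TypeError("expected a contiguous buffer")
    return view


def accepts(obj):
    """Report whether get_contiguous_view() accepts obj."""
    try:
        get_contiguous_view(obj).release()
    except TypeError:
        return False
    return True


print(accepts(b"abcd"))
print(accepts(memoryview(b"abcd")[::2]))
print(accepts("abcd"))
```

A caller that wants to handle strided data would skip such a helper and inspect the view's shape and strides directly, which mirrors the suggestion to use the "O" format.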
msg106703 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-29 02:34
Patch to update the documentation, especially input types for PyArg_ParseTuple() and output types for Py_BuildValue():

 - add bytes and/or bytearray when buffer compatible object is accepted to be more explicit
 - "es", "et", "es#", "et#" don't accept buffer compatible objects: fix the doc
 - specify utf-8 encoding
 - mark "U" and "U#" as deprecated: see issue #8848
 - fix ".. note::" syntax
 - replace "Unicode" by "str"
 - replace "bytes object" by "bytes"
 - replace "any buffer compatible object" by "buffer compatible object"
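
The distinction this patch makes explicit (str versus bytes, bytearray and other buffer compatible objects) can be checked from Python. In the sketch below, is_buffer_compatible is a hypothetical helper written for this illustration; the last lines show why naming the encoding matters, since a str and its UTF-8 encoding can differ in length.

```python
import array


def is_buffer_compatible(obj):
    """Hypothetical helper: True if obj exports the buffer interface."""
    try:
        memoryview(obj).release()
    except TypeError:
        return False
    return True


samples = (b"abc", bytearray(b"abc"), array.array("B", [1, 2, 3]), "abc")
for sample in samples:
    print(type(sample).__name__, is_buffer_compatible(sample))

# A str is not a buffer; it has to be encoded first, here as UTF-8.
text = "caf\u00e9"
encoded = text.encode("utf-8")
print(len(text), len(encoded))
```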

I hesitate to add explicit quotes, e.g. "s" instead of s.
msg107750 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-06-13 20:45
r81811 and r81923 improve Doc/c-api/arg.rst. I opened more specific issues to improve getargs code and documentation:
 - #8991: PyArg_Parse*() functions: reject discontiguous buffers
 - #8926: getargs.c: release the buffer on error
 - #8952: Doc/c-api/arg.rst: fix documentation of number formats
 - #8949: PyArg_Parse*(): "z" should not accept bytes
 - #8951: PyArg_Parse*(): factorize code of 's' and 'z' formats, and 'u' and 'Z' formats
 - #8850: Remove "w" format of PyArg_ParseTuple()

I prefer to close this issue because it is too generic. All points of this issue can be found in the other issues or are already fixed by some commits.
History
Date User Action Args
2010-06-13 20:45:21 vstinner set status: open -> closed; resolution: duplicate; messages: + msg107750
2010-05-29 02:34:37 vstinner set files: + doc_capi_arg.patch; messages: + msg106703
2010-05-29 00:07:18 vstinner set files: + getarg_cleanup_buffer.patch; keywords: + patch; messages: + msg106699
2010-05-05 01:09:36 vstinner set messages: + msg104993
2010-04-22 11:39:00 vstinner set nosy: + barry, brett.cannon, georg.brandl, benjamin.peterson
2010-04-03 14:53:25 vstinner set messages: + msg102266
2010-03-23 22:40:14 vstinner set messages: + msg101607
2010-03-23 22:29:00 vstinner create