
Author lemburg
Recipients benjamin.peterson, georg.brandl, larry, lemburg
Date 2015-04-17.00:35:26
Message-id <55305546.9030001@egenix.com>
In-reply-to <1429227526.01.0.733275706487.issue23980@psf.upfronthosting.co.za>
Content
On 17.04.2015 01:38, Larry Hastings wrote:
> 
> Documentation is here:
> 
>     https://docs.python.org/3/c-api/arg.html#arg-parsing
> 
> 
> The first line of documentation for each format unit follows this convention:
>     formatunit (pythontype) [arguments, to, pyarg_parsetuple]
> 
> These represent the format unit itself, followed by the Python type it consumes in parentheses, followed by the C types it outputs in square brackets.  Thus
>     i (int) [int]
> means the format unit is 'i', it consumes a Python 'int', and it produces a C 'int'.  Similarly,
>     s (str) [const char *]
> means the format unit is 's', it consumes a Python 'str', and it produces a C 'const char *'.
> 
> When you call PyArg_ParseTuple (AndKeywords), you pass in a pointer to the thing you expect.  If it gives you an int, you pass in &my_int. So the type of the expression you pass in for 'i' is actually "int *".  And the type you pass in for 's' is actually "char **".
> 
> The format units that deal with encodings are a bit weirder.  You actually pass in a const char * string first, followed by the buffer you want to write data to.  Technically the types of the values you pass in for "es" are "const char *, char **".  But the documentation for es says
>     es (str) [const char *encoding, char **buffer]

You need to pass in the address of a variable, which will then be set up
to point to the buffer that gets written to :-)

The "e" variants (typically) allocate a buffer for you, since it's pretty
much unknown how long the encoded data will be.
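
The calling convention discussed here can be illustrated with a small plain-C
analogy. This is only a sketch and not CPython's implementation: the names
toy_parse_int, toy_parse_encoded and toy_demo are hypothetical, the "encoding"
argument is accepted but ignored, and malloc/free stand in for the real
allocator. With the actual API the equivalent call would be
PyArg_ParseTuple(args, "es", "utf-8", &buffer), and the buffer would be
released with PyMem_Free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the 'i' unit: writes a C int through `out`. */
int toy_parse_int(const char *src, int *out)
{
    char *end = NULL;
    long value = strtol(src, &end, 10);
    if (end == src || *end != '\0')
        return 0;               /* not a plain number: report failure */
    *out = (int)value;
    return 1;
}

/* Hypothetical stand-in for the 'es' unit: allocates a buffer, copies the
 * data into it and stores the buffer's address in *buffer.
 * `encoding` is ignored here; no real transcoding takes place. */
int toy_parse_encoded(const char *src, const char *encoding, char **buffer)
{
    (void)encoding;
    size_t size = strlen(src) + 1;
    char *copy = malloc(size);
    if (copy == NULL)
        return 0;
    memcpy(copy, src, size);
    *buffer = copy;             /* the caller's char * now points at it */
    return 1;
}

/* Caller side: declare the plain variable, pass its address. */
int toy_demo(void)
{
    int my_int = 0;             /* variable is an int, argument is an int * */
    char *buffer = NULL;        /* variable is a char *, argument is a char ** */

    if (!toy_parse_int("42", &my_int))
        return 0;
    if (!toy_parse_encoded("spam", "utf-8", &buffer))
        return 0;

    assert(buffer != NULL);
    int ok = (my_int == 42 && strcmp(buffer, "spam") == 0);
    free(buffer);               /* the caller owns the allocated buffer */
    return ok;
}
```

The point of the sketch is that the variable declared for the encoded data is
a char *, and the expression passed to the parser is its address, a char **.
Declaring a char ** variable and passing its address (a char ***) would make
the callee store the buffer pointer in the wrong place.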

> This led me to believe that I actually had to pass in a "char ***" for buffer!  Which is wrong and doing so makes your programs explode-y.

Indeed :-)

> The documentation should
> 
> * explain this first-line convention precisely, and
> 
> * use the types consistently.
> 
> My suspicion is that the things in brackets have to be the precise C type, e.g. "int *" for i, "char **" for s, "const char *, char **" for es.

The paragraph under "Parsing arguments" says:

"""
In the following description, the quoted form is the format unit; the entry in (round) parentheses
is the Python object type that matches the format unit; and the entry in [square] brackets is the
type of the C variable(s) whose address should be passed.
"""

So I guess the "e" descriptions need to have the additional * removed
or the paragraph has to be updated and all other listings need
to be converted to precise types (that would be my preference).

I wonder why no one has noticed in all these years. I had apparently
understood the listings to be precise C types back in the days when
I added the documentation for the "e" codes
(https://hg.python.org/cpython-fullhistory/rev/3ae06c57d09e).

The descriptions for the codes do clarify what is going on, though.
History
Date User Action Args
2015-04-17 00:35:29 lemburg set recipients: + lemburg, georg.brandl, larry, benjamin.peterson
2015-04-17 00:35:29 lemburg link issue23980 messages
2015-04-17 00:35:26 lemburg create