Document documentation conventions for optional args #57595

Closed
ezio-melotti opened this issue Nov 12, 2011 · 28 comments
Labels
docs Documentation in the Doc dir type-feature A feature request or enhancement

Comments

@ezio-melotti
Member

BPO 13386
Nosy @birkenfeld, @rhettinger, @terryjreedy, @ericvsmith, @ezio-melotti, @merwok, @cjerdonek, @ericsnowcurrently, @akheron, @vadmium

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2011-11-12.06:08:40.576>
labels = ['type-feature', 'docs']
title = 'Document documentation conventions for optional args'
updated_at = <Date 2015-04-20.04:09:16.998>
user = 'https://github.com/ezio-melotti'

bugs.python.org fields:

activity = <Date 2015-04-20.04:09:16.998>
actor = 'rhettinger'
assignee = 'docs@python'
closed = False
closed_date = None
closer = None
components = ['Documentation']
creation = <Date 2011-11-12.06:08:40.576>
creator = 'ezio.melotti'
dependencies = []
files = []
hgrepos = []
issue_num = 13386
keywords = []
message_count = 26.0
messages = ['147469', '147471', '147484', '147521', '147577', '147584', '147585', '147586', '147587', '147594', '147604', '147630', '147654', '147667', '147668', '147670', '147671', '147704', '147705', '147827', '147831', '147832', '147833', '147835', '241590', '241600']
nosy_count = 12.0
nosy_names = ['georg.brandl', 'rhettinger', 'terry.reedy', 'eric.smith', 'ezio.melotti', 'eric.araujo', 'chris.jerdonek', 'docs@python', 'eric.snow', 'baptiste.carvello', 'petri.lehtinen', 'martin.panter']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue13386'
versions = ['Python 2.7', 'Python 3.2', 'Python 3.3']

@ezio-melotti
Member Author

AFAIU the conventions for optional arguments in the docs are as follows:

If a function has optional arguments and it accepts keyword arguments, the "func(arg=default)" notation should be used, for example:
str.splitlines(keepends=False)

If a function has optional arguments but it doesn't accept keyword arguments, the "func([arg1])" notation is used instead. This should apply only to some C functions, for example:
str.strip([chars])

The notation "func([arg=default])" should never be used, and "func([arg])" should be used only when keyword args are not accepted.
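The difference between the two notations can be checked from an interpreter. The following is a minimal sketch, assuming Python 3 and CPython's behaviour for these two methods; the helper `call_succeeds` is a hypothetical name used only for illustration.

```python
def call_succeeds(func, *args, **kwargs):
    """Return True if the call works, False if it raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return False
    return True


text = "one\ntwo\n"

# Documented as str.splitlines(keepends=False): the keyword form is accepted.
lines = text.splitlines(keepends=True)

# Documented as str.strip([chars]): in CPython the argument is positional-only.
strip_by_position = call_succeeds("  x  ".strip, " ")
strip_by_keyword = call_succeeds("  x  ".strip, chars=" ")
```

Under these assumptions the positional call to `strip` works and the keyword call is rejected, which is exactly the distinction the bracket notation is meant to signal.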

These rules apply to both Python 2 and Python 3.

A thing that is still not clear is what to do in case the default value is a placeholder (like object(), None, -1) and the actual value is then computed in the function.

@ezio-melotti ezio-melotti added the docs Documentation in the Doc dir label Nov 12, 2011
@ericvsmith
Member

To your last point, I think it's important to specify the default value placeholder (basically a sentinel) in the documentation. For example, if a function takes -1 to mean "all occurrences", then the caller needs to know what value to pass in order to let the function compute the value. This is especially true if it's cheaper for the function to compute the value than for the caller.

I've run into this problem before, where I wanted to pass in some sentinel value and I had to read the source to figure out what it was. I think the function was in the standard library, but now I can't recall what it was.

@ezio-melotti
Member Author

The problem is when the default placeholder is some unique object() or some _internal value (we had something similar with a socket timeout once).
Also for something like str.strip(), would you document chars=None or chars=" \n\r\t\v\f"?
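The "unique object()" case can be sketched as follows. This is an assumed, simplified pattern rather than code from the standard library: `_UNSET` and `connect` are hypothetical names, chosen to show why such a default is awkward to write in a signature when `None` already has its own meaning.

```python
_UNSET = object()  # hypothetical private sentinel, distinct from any user value


def connect(host, timeout=_UNSET):
    """Hypothetical function; returns a description of the effective timeout."""
    if timeout is _UNSET:
        # The argument was omitted entirely.
        return f"{host}: use the global default timeout"
    if timeout is None:
        # None is a meaningful value here, so it cannot double as the default.
        return f"{host}: block forever"
    return f"{host}: time out after {timeout} s"
```

A signature such as `connect(host, timeout=<object object at 0x...>)` tells the reader nothing useful, which is the documentation problem described above.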

@elibendersky
Mannequin

elibendersky mannequin commented Nov 12, 2011

You should also explicitly specify what happens if several optional but non-keyword args are needed. AFAIU the convention is:

   func(arg1, arg2[, opt1, opt2])

@elibendersky
Mannequin

elibendersky mannequin commented Nov 13, 2011

Ezio, regarding your latest message:

"The problem is when the default placeholder is some unique object() or some _internal value (we had something similar with a socket timeout once)."

I hope this should be rare enough not to present a significant problem with the _convention_. Such cases can be reviewed specifically and the best way to document will be discussed per case.

"Also for something like str.strip(), would you document chars=None or chars=" \n\r\t\v\f"?"

I think it would be better to document chars=None, because this is a simple value the user can pass (if he wants to do it explicitly), without thinking (and forgetting) about the specific delimiters. That None actually means " \n\r\t\v\f" should be explicitly documented below the function signature, of course.
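A short check of the `chars=None` reading, assuming Python 3: passing `None` by position behaves like omitting the argument, while an explicit string restricts what is stripped.

```python
sample = " \t padded text \n"

implicit = sample.strip()       # no argument given
explicit = sample.strip(None)   # None passed explicitly, by position
custom = sample.strip(" ")      # only plain spaces are stripped
```

Here `implicit` and `explicit` are the same string, while `custom` keeps the tab and the newline.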

@ezio-melotti
Member Author

> You should also explicitly specify what happens if several optional but
> non-keyword args are needed. AFAIU the convention is:
> func(arg1, arg2[, opt1, opt2])

IIUC that would mean that either you pass only arg1 and arg2, or you also pass both opt1 and opt2.
I think the correct notation for that is e.g.:
str.startswith(prefix[, start[, end]])
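The nested brackets say that `end` can only be given once `start` has been given. A minimal sketch of the three call shapes this allows, assuming Python 3:

```python
s = "hello world"

only_prefix = s.startswith("hello")          # prefix
with_start = s.startswith("world", 6)        # prefix, start
with_start_end = s.startswith("wor", 6, 9)   # prefix, start, end
```

By contrast, a flat `[, start, end]` would read as "both or neither", which does not describe this method.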

I also saw "func(foo[, bar][, baz])" for cases where either bar or baz can be passed, but since this requires keyword arguments, the "func(foo, bar=x, baz=y)" notation should be used instead, and the documentation should then explain that either one can be passed.

I also agree with what you said in your last message. What can't be expressed with a notation can always be described with words.

@elibendersky
Mannequin

elibendersky mannequin commented Nov 14, 2011

What you say makes sense, now I just have to dig up where I saw instances of [, opt1, opt2]

If anything, this is another proof that such conventions must be agreed upon and meticulously documented.

@baptistecarvello
Mannequin

baptistecarvello mannequin commented Nov 14, 2011

Hi all, here is a relevant user story. I'm afraid it won't help you much, but it highlights the importance of consistent conventions in doc.

My girlfriend is learning Python with no prior programming experience. She quite naturally got used to calling help(function), and noted the following:

  1. she naturally understood the meaning of the [opt] notation

  2. she did not understand the opt=default notation, as she didn't have a sufficient experience with Python to recognize the syntax

  3. even after learning what it meant, she still found that notation obscure and unappealing

  4. she got annoyed that two completely different notations were used for two very close concepts

  5. she got annoyed that there was no user-discoverable and user-understandable document introducing those notations (if there is one, my mistake :-)

I have no obvious solutions to the annoyances. Regarding 4), maybe the [opt=default] notation has something good after all: it is reminiscent of the [opt] one. And regarding 5), if there is a canonical document about documentation conventions, I could try to summarize it in a language aimed at beginners.

@ezio-melotti
Member Author

Thanks for the feedback!

  1. she naturally understood the meaning of the [opt] notation

I guess this depends on her background, I've seen people trying to use [] in function calls because they saw them in the doc or confusing them for lists, so I guess that each notation has its pros and cons.

  2. she did not understand the opt=default notation, as she didn't
    have a sufficient experience with Python to recognize the syntax

I agree that at the beginning it could be a bit confusing, but keyword arguments are an important part of Python and it's among the first things that one should learn. After that it should be even more natural than [].

  3. even after learning what it meant, she still found that notation
    obscure and unappealing

...or maybe not. Can she say what in particular is obscure and unappealing?

  4. she got annoyed that two completely different notations were used
    for two very close concepts

This is a good point, and we are trying to move to the arg=default notation. Unfortunately there are still places that use the old notation. C functions that have optional arguments but don't accept keyword arguments are a bit unusual, and IIUC in most of the cases that's an implementation detail that could be removed.

  5. she got annoyed that there was no user-discoverable and
    user-understandable document introducing those notations (if there is one, my mistake :-)

This brings up another interesting point. These conventions will probably end up in the "documenting" section, which is aimed at doc writers. Do we need an introductory page aimed at readers that explains the conventions used in the doc?

@merwok
Member

merwok commented Nov 14, 2011

Do we need an introductory page aimed to the readers that explains
the conventions used in the doc?

Explaining notational conventions at the start of a technical reference sounds like a best practice to me.

@baptistecarvello
Mannequin

baptistecarvello mannequin commented Nov 14, 2011

On 14/11/2011 13:40, Ezio Melotti wrote:

> 1) she naturally understood the meaning of the [opt] notation

I guess this depends on her background, I've seen people trying to use [] in function calls because they saw them in the doc or confusing them for lists, so I guess that each notation has its pros and cons.

agreed, the [] notation also has its dangers. But the current situation
doesn't avoid them, because users will meet both notations.

> 2) she did not understand the opt=default notation, as she didn't
> have a sufficient experience with Python to recognize the syntax

I agree that at the beginning it could be a bit confusing, but keyword arguments are an important part of Python and it's among the first things that one should learn. After that it should be even more natural than [].

the thing is, beginners need to use other people's functions before they
really get into writing their own. You need some practice with a syntax
before you are able to recognize it in another context.

> 3) even after learning what it meant, she still found that notation
> obscure and unappealing

...or maybe not. Can she say what in particular is obscure and unappealing?

I'd say the fact that the main information (that the argument is
optional) is not highlighted and only appears as a side-effect of having
a default. Inversely, a lot of importance is given to the value of the
default, which most users can ignore at first.

> 4) she got annoyed that two completely different notations were used
> for two very close concepts

This is a good point, and we are trying to move to the arg=default notation. Unfortunately there are still places that use the old notation. C functions that have optional arguments but don't accept keyword arguments are a bit unusual, and IIUC in most of the cases that's an implementation detail that could be removed.

That would solve the problem for the stdlib, but other C libraries
also have optional arguments which don't accept keyword arguments (for
example NumPy ufuncs). Will converting to a keyword argument work for
all of them?

> 5) she got annoyed that there was no user-discoverable and
> user-understandable document introducing those notations (if there is one, my mistake :-)

This brings up another interesting point. These conventions will probably end up in the "documenting" section, which is aimed at doc writers. Do we need an introductory page aimed at readers that explains the conventions used in the doc?

I would say we need one. It should probably also be part of the "help()"
tool, as the function prototype is the first information that
help(function) displays.

Cheers,
Baptiste

@ericsnowcurrently
Member

> 4) she got annoyed that two completely different notations were used
> for two very close concepts

This is a good point, and we are trying to move to the arg=default
notation. Unfortunately there are still places that use the old
notation. C functions that have optional arguments but don't accept
keyword arguments are a bit unusual, and IIUC in most of the cases
that's an implementation detail that could be removed.

So would it be worth the effort to identify each such place in the built-ins/stdlib and eventually change them all? I've seen support for doing so in other tracker issues and think it's a good idea personally.

@elibendersky
Mannequin

elibendersky mannequin commented Nov 15, 2011

""So would it be worth the effort to identify each such place in the built-ins/stdlib and eventually change them all? I've seen support for doing so in other tracker issues and think it's a good idea personally.""

Probably, if this will bring some added value in addition to being easier to document.

@baptistecarvello
Mannequin

baptistecarvello mannequin commented Nov 15, 2011

On 14/11/2011 20:51, Eric Snow wrote:

So would it be worth the effort to identify each such place in the built-ins/stdlib and eventually change them all? I've seen support for doing so in other tracker issues and think it's a good idea personally.

I ran a few grep searches from the root of a recent hg tip:

  1. grep -n -r --include="*.py" --include="*.c" --exclude="topics.py" -E
    '.+\(.*\[[[:space:]]*,.*\].*\)' .

This looks for variants of "function(args [, opt])". There were 231
hits, I caught no false positives.

  2. grep -n -r --include="*.py" --include="*.c" --exclude="topics.py" -E
    '.+\(.*\[.*,[[:space:]]*\].*\)' .

As this pattern is valid Python syntax, I got mostly false positives,
but also a few interesting cases such as "range([start,] stop[, step])"
or "islice(seq, [start,] stop [, step])"

I'm afraid those last examples cannot be described with valid Python syntax.
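The two searches above can be tried on individual lines with Python's `re` module. This is only a sketch: `re` has no POSIX `[[:space:]]` class, so `\s` is substituted for it, and the pattern names and the helper `matched_patterns` are hypothetical.

```python
import re

# Variants of "function(args[, opt])": an opening bracket followed by a comma.
BRACKET_THEN_COMMA = re.compile(r".+\(.*\[\s*,.*\].*\)")
# Variants of "function([opt,] args)": a comma followed by a closing bracket.
COMMA_THEN_BRACKET = re.compile(r".+\(.*\[.*,\s*\].*\)")


def matched_patterns(line):
    """Return the names of the patterns that match the given line."""
    names = []
    if BRACKET_THEN_COMMA.search(line):
        names.append("bracket-then-comma")
    if COMMA_THEN_BRACKET.search(line):
        names.append("comma-then-bracket")
    return names


doc_signature = matched_patterns("str.startswith(prefix[, start[, end]])")
leading_optional = matched_patterns("islice(seq, [start,] stop)")
valid_python = matched_patterns("x = foo([1, 2,])")
```

The last sample shows why the second search produces false positives: a list literal with a trailing comma is ordinary Python source.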

@merwok
Member

merwok commented Nov 15, 2011

C functions that have optional arguments but don't accept keyword arguments are a bit unusual,
and IIUC in most of the cases that's an implementation detail that could be removed.

So would it be worth the effort to identify each such place in the built-ins/stdlib and
eventually change them all? I've seen support for doing so in other tracker issues and think
it's a good idea personally.

Me too. (Can you give the #ids of these other issues?)

Probably, if this will bring some added value in addition to being easier to document.

I think we should fix C functions to accept kwargs for the sake of Python programmers, not merely to ease documentation (that would just be a nice side-effect :)

a few interesting cases such as "range([start,] stop[, step])" or "islice(seq, [start,] stop [, step])"
I'm afraid those last examples cannot be described with valid Python syntax.

Sphinx lets us give multiple signatures. I’ve just checked that this markup is valid and does not create duplicate index entries

.. function:: range(stop)
              range(start, stop)
              range(start, stop, step)

:)
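The three signatures correspond to a dispatch on the number of positional arguments. A rough pure-Python sketch of that dispatch, with the hypothetical helper name `parse_range_args` (this is not how CPython implements `range`):

```python
def parse_range_args(*args):
    """Hypothetical helper: normalise range-style arguments to (start, stop, step)."""
    if len(args) == 1:
        (stop,) = args
        return (0, stop, 1)          # range(stop)
    if len(args) == 2:
        start, stop = args
        return (start, stop, 1)      # range(start, stop)
    if len(args) == 3:
        return tuple(args)           # range(start, stop, step)
    raise TypeError(f"expected 1 to 3 arguments, got {len(args)}")


one = parse_range_args(5)
two = parse_range_args(2, 5)
three = parse_range_args(2, 10, 3)
```

Since the first parameter changes meaning with the argument count, no single `def` signature with defaults describes it, which is why listing each form separately reads well here.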

@ezio-melotti
Member Author

Me too. (Can you give the #ids of these other issues?)

See for example bpo-13012.

I think we should fix C functions to accept kwargs for the sake of
Python programmers, not merely to ease documentation (that would just
be a nice side-effect :)

And also for compatibility with other implementations like PyPy. I'm still not sure that it is a good idea to do a mass conversion of all the functions though.

Sphinx lets us give multiple signatures. I’ve just checked that this
markup is valid and does not create duplicate index entries

This is something I was considering, but I'm afraid it might get too verbose (and introduce yet another convention). Sometimes this feature is also (mis?)used to group similar functions.

@merwok
Member

merwok commented Nov 15, 2011

I think we should fix C functions to accept kwargs for the sake of Python programmers
And also for compatibility for other implementations like PyPy.

Good point.

I'm still not sure that it is a good idea to do a mass conversion of all the functions though.

If there were only a handful of them it may be okay, but otherwise one issue per class or module sounds good.

Sphinx lets us give multiple signatures
This is something I was considering, but I'm afraid it might get too verbose

I find my example for range much more readable than the current markup with brackets.

(and introduce yet another convention).

I can live with this special case for the two or three functions that need it. It becomes moot if range gets fixed to support kwargs :)

Sometimes this feature is also (mis?)used to group similar functions.

IIUC it is the intended use case for the syntax, not a misuse: You tell Sphinx that you want link targets for these functions to end up here, and then you write doc. See for example the os docs: this syntax allows for nice grouping.

@ericsnowcurrently
Member

Me too. (Can you give the #ids of these other issues?)

bpo-13012 is the one that I was thinking of (msg144328 specifically). However, I'm sure there was one more recently (which I can't find now).

@ericsnowcurrently
Member

@msg147671

+1

@ericvsmith
Member

I just ran across the other reason that having the actual default values documented is important. Sometimes I want to do this:

some_func(param if some_condition else <use the default value>)

If some_condition is False, I want the default behavior, if not, I want to pass in a parameter. If I don't know the real default value, I have to write:

if some_condition:
   some_func(param)
else:
   some_func()
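When the real default is documented, the two branches collapse into one call. A small sketch using `str.split`, whose separator defaults to `None`; the function `split_fields` is a hypothetical example.

```python
def split_fields(line, use_commas):
    """Hypothetical helper: split on commas if requested, else on whitespace."""
    # None is the documented default separator, so it can be passed
    # explicitly from a conditional expression.
    return line.split("," if use_commas else None)


by_comma = split_fields("a,b c", True)
by_whitespace = split_fields("a,b c", False)
```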

@elibendersky
Mannequin

elibendersky mannequin commented Nov 18, 2011

Eric,

Spot on :-)
This is *exactly* the reason that led me to open bpo-12875, which eventually led to this one.

@terryjreedy
Member

From Ezio's original post: '''
If a function has optional arguments but it doesn't accept keyword arguments, the "func([arg1])" notation is used instead. ... The notation "func([arg=default])" should never be used, and "func([arg])" should be used only when keyword args are not accepted.
'''

In the following, I give objections to this PO (position only) rule and suggest an alternative ND (no default) rule: use 'name=default' when there is a default and '[name]' when there is not.

The issue of whether an argument is required or optional is orthogonal to whether it can be passed by both position and name, only by name, or only by position. All combinations are possible. Optional arguments may or may not have a definition-time (or even run-time) default value, regardless of how passed. (In Python, use of *args and **kwds allows args to be optional without default.)
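An argument that is optional yet has no default can be sketched with `**kwds`. The function `describe` and the keyword `label` are hypothetical names for illustration.

```python
def describe(*args, **kwds):
    """Hypothetical function with an optional argument that has no default."""
    parts = [f"{len(args)} positional"]
    # 'label' is optional and can only be passed by keyword. There is no
    # default value: the function simply does less when it is absent.
    if "label" in kwds:
        parts.append(f"label={kwds['label']!r}")
    return ", ".join(parts)


without_label = describe()
with_label = describe(1, 2, label=None)
```

Here even `label=None` differs from omitting `label`, so neither `label=None` nor any other `name=default` spelling in a signature would be accurate.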

In the CPython stdlib, I think position-only arguments occur only with some C functions. One can emulate such C functions in Python by doing the equivalent of what they do: use a collective *varargs in the definition while naming the required and optional components of varargs in the doc as if they were the actual parameters. But I think we agree that emulating a limitation of C in Python is a bad idea.

So by using [] to mean both 'argument is optional' and 'function only takes parameters by position' (or at least 'this parameter can only be passed to this function by position'), we are simultaneously documenting an intended and permanent feature of the Python function and a possibly temporary and unwanted side-effect of the current CPython implementation of that function. I think a separate PO indication might be better.

1: The PO rule goes against the effort to separate the Python language from the CPython implementation. With it, the doc for a function does not apply to other implementations that do not have the PO limitation for that function.

2: The PO rule is incomplete. It only marks an arg as position-only if it is optional, but not if it is required. And even if marking one arg as PO means that other args of the function might be, so 'watch out', there is still no special marking for a function with only required PO args.

A separate sentence like "For CPython, all args must be passed by position." would solve both of the above problems.

3: The PO rule does not account for the possibility that an argument can be passed by keyword, perhaps only by keyword, but have no default. This is possible in Python with **kwds in the def and recognized optional names in the doc. With 'name=default' and '[name]' not allowed, how should such an argument be documented in a signature?

4: The PO rule omits useful information on defaults from the place of prominence - the signature header for the entry. Sometimes the information, needed by some users and all implementers, gets omitted altogether. For example, the doc string and manual entry give the signature for str.startswith as
str.startswith(prefix[, start[, end]])
The unmentioned defaults are None, None.
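That `None` is accepted for both positions can be confirmed directly; a quick sketch, assuming Python 3:

```python
s = "documentation"

omitted = s.startswith("doc")                    # start and end left out
explicit_none = s.startswith("doc", None, None)  # the unmentioned defaults
end_as_none = s.startswith("men", 4, None)       # None only for end
```

All three calls succeed, so a caller who knows the defaults can pass them through unconditionally.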

In summary, the PO rule primarily indicates, but only for optional args, whether the arg can be passed by keyword or not. It secondarily indicates, but only if it can be passed by keyword, what its default is. But it fails if the arg can be passed by keyword but does not have a default. It also fails, in its primary role, for required args.

To me, this is all mixed up. Method of passing is not related to optionality. What is special about optional args, regardless of how passed, is the default value, if it has one. The ND rule is to give exactly this information. With an implementation-independent signature and a separate note on passing method, when needed, it solves all the problems listed above. For .startswith, I would like to see something like
str.startswith(prefix, start=None, end=None) ...
CPython: pass args by position only.

---
bpo-13355 illustrates Eric's point with a twist.
"random.triangular(low, high, mode)
Return a random floating point number N such that low <= N <= high and with the specified mode between those bounds. The low and high bounds default to zero and one. The mode argument defaults to the midpoint between the bounds, giving a symmetric distribution."

The *actual* default for mode is None. The function *usually* acts as if the default were as described. Twist 1 is that it does not actually calculate the midpoint, as it is not actually needed. Twist 2 is that there is currently a bug (easily fixed) such that triangular does not work if low>=high and mode is not specified, whereas it does work if the true default None is passed ;-). So one needs to know the real default to avoid the bug.
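A brief sketch of passing the real default explicitly, assuming a Python 3 `random` module in which `mode=None` is the default. The seed is arbitrary and only makes the run repeatable; no particular output values are claimed.

```python
import random

rng = random.Random(13386)  # arbitrary seed for repeatability

implicit_mode = rng.triangular(0.0, 1.0)         # mode omitted
explicit_mode = rng.triangular(0.0, 1.0, None)   # the real default, passed explicitly
no_arguments = rng.triangular()                  # low and high omitted as well
```

Each result lies between the bounds, whichever way the default is supplied.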

Of course, as I said on the issue, all defaults should be given in the signature (by either PO or ND rule):
"random.triangular(low=0.0, high=1.0, mode=None) ..."

And yes, +1 to documenting visible document conventions both in the documenting howto *and* in the docs themselves.

@baptistecarvello
Mannequin

baptistecarvello mannequin commented Nov 18, 2011

On 18/11/2011 05:29, Terry J. Reedy wrote:

In the following, I give objections to this PO (position only) rule and suggest an alternative ND (no default) rule: use 'name=default' when there is a default and '[name]' when there is not.

The issue of whether an argument is required or optional is orthogonal to whether it can be passed by both position and name, only by name, or only by position.

With this logic, you would need to use '[name=default]' when an argument
is optional *and* can be passed by name.

Sure, this notation is inherently redundant, but it has the advantage of
conveying both pieces of information immediately to the user. It is also more
coherent with '[name]'.

But this is a big change from the current philosophy...

@ezio-melotti
Member Author

From Ezio's original post: '''
If a function has optional arguments but it doesn't accept keyword
arguments, the "func([arg1])" notation is used instead. ... The
notation "func([arg=default])" should never be used, and "func([arg])"
should be used only when keyword args are not accepted.
'''

In the following, I give objections to this PO (position only) rule and suggest an alternative ND (no default) rule: use 'name=default' when
there is a default and '[name]' when there is not.

Maybe we should try to keep it simple and just document the signature of the function.
Everything that can not be described in the signature can be explained by words.

I tried to write down all the combinations of optional/non optional, with/without default, works/doesn't work with keywords to see how to represent them, but it started being a bit messy. The "problematic" combinations (for example a function that accepts an optional argument with no default but that doesn't work with keywords) seem quite rare, and for them we could just write down what's special about them.

There are two more cases that could be solved with a specific notation though:

  1. optional arg, with default, doesn't work with keywords (e.g. range, startswith):
    func(arg1)
    func(arg1, arg2)
    *arg2* defaults to <default>.
  2. optional arg, with no default, that works only with keywords:
    func(arg1, *, arg2)

The keyword-only *, and the multiple signatures "tricks" can also be used for other similar cases.

func(arg1, **kwargs) can be used for functions that accept kwargs without expecting any specific value; if the values are known and have defaults they could be included in the signature (even if the default is like foo = kwargs.get('foo', default)).

This should cover most of the cases, it only uses valid Python syntax and avoids potentially confusing [].
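Two of the notations mentioned here map directly onto Python 3 syntax. The sketch below uses hypothetical functions `resize` and `render`; it shows a keyword-only parameter after a bare `*`, and a `**kwargs` function whose known name and default could still be written in the documented signature as `render(template, indent=4)`.

```python
def resize(image, *, keep_ratio=True):
    """Hypothetical function: keep_ratio can only be passed by keyword."""
    return (image, keep_ratio)


def render(template, **kwargs):
    """Hypothetical function documented as render(template, indent=4)."""
    indent = kwargs.get("indent", 4)  # known name with a known default
    return " " * indent + template


def positional_call_rejected():
    """Return True if passing keep_ratio by position raises TypeError."""
    try:
        resize("img", False)
    except TypeError:
        return True
    return False
```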

@ezio-melotti ezio-melotti added the type-feature A feature request or enhancement label Sep 26, 2012
@vadmium
Member

vadmium commented Apr 20, 2015

When a parameter is optional but does not have a simple default value, I suggest using some obviously invalid pseudocode, such as

function(arg1, arg2=<automatic, see text>)

See bpo-8706 about adding more support for keyword arguments. See also bpo-23738 for signatures that incorrectly appear to accept keywords due to including default values, and PEP-457’s slash (/) indicator for documenting positional-only parameters.
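For reference, the slash later became real syntax: from Python 3.8 onward a `/` in a `def` marks the parameters before it as positional-only. The sketch below assumes Python 3.8 or later; `strip_like` is a hypothetical function that mimics the calling convention of `str.strip`.

```python
def strip_like(text, chars=None, /):
    """Hypothetical pure-Python function with positional-only parameters."""
    return text.strip(chars)


def keyword_call_outcome():
    """Report whether passing chars by keyword is accepted or rejected."""
    try:
        strip_like("  x  ", chars=" ")
    except TypeError:
        return "rejected"
    return "accepted"


default_strip = strip_like("  x  ")
custom_strip = strip_like("xxaxx", "x")
```

With this notation a signature can show the default value and the positional-only restriction at the same time, without brackets.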

@rhettinger
Contributor

Please don't add a new notation that makes the docs less readable than they are now. For the most part, the existing docs have done a great job communicating how to use our functions. Please don't undo 20 years of tradition because it bugs you.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@AA-Turner
Member

I think we ought to move this to the devguide repo; any change here would be against the style guide.

A

@AA-Turner AA-Turner added the pending The issue will be closed if no feedback is provided label Sep 22, 2023
@terryjreedy
Member

This is out of date here.

@terryjreedy terryjreedy closed this as not planned Won't fix, can't repro, duplicate, stale Dec 13, 2023
@terryjreedy terryjreedy removed the pending The issue will be closed if no feedback is provided label Dec 13, 2023

8 participants