New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Argument Clinic should use a non-error-prone syntax to mark text signatures #64525
Comments
Sorry this is so long--but I wanted to make my point. Here's the tl;dr summary. The problem: The syntax used for Argument-Clinic-generated text I think we should change the syntax to something people would "*(" -- A quick recap on how signature information for builtins works. The builtin's docstring contains the signature, encoded as text using inspect.signature then retrieves __text_signature__, and if ast.parse() In bpo-19674 there was some discussion on what this syntax should be. functionname(args, etc)\n He felt it was a good choice, and pointed out that Sphinx autodoc __doc__ and __text_signature_ aren't very smart about detecting -- First, consider: this signature syntax is the convention docstrings Great!, you might think, we get signatures for free, even on functions The problem is, many of these pseudo-signatures aren't proper Python. "lstat(path, *, dir_fd=None) -> stat result\n" This line passes the "is it a text signature test?", so __doc__ That's bad enough. Now consider the first *two* lines of the "open(file, mode='r', buffering=-1, encoding=None,\n" __doc__ clips the first line but retains the second. pydoc prints Both of these problems go away if I add one more check to the "_accept() -> (integer, address info)\n" Okay, so __doc__ and __text_signature__ could count parentheses And there would *still* be handwritten docstrings that would pass "insort_right(a, x[, lo[, hi]])\n" We could only be *certain* if we gave up on having two parsers. Okay, so we could attack the problem from the other end. Clean Therefore I strongly suggest we switch to a syntax that nobody will Have I convinced you? |
You have convinced me. |
Larry Hastings <report@bugs.python.org> wrote:
I like the original "def (...)\n" approach from bpo-19674. If that is not |
What if the __text_signature__ and __doc__ getter will call ast.parse() (actually compile()) on signature candidate? If it fails, then builtin has no signature, the __text_signature__ getter returns '', and the __doc__ getter returns all original docstring. |
Serhiy: I'm going to add PEP-457 features to the text signature, because I wasn't seriously proposing that __doc__ and __text_signature__ parse the string. That would obviously be a waste of CPU. I was simply taking the argument to its logical extreme. |
Right, at the very least we want to handle positional only arguments (since PEP-362 also handles those). My one concern with using "def " as the prefix is the fact it's not actually Python syntax. How do you feel about using "sig: " as the prefix? That would still read sensibly even if you somehow got hold of the unaltered C level docstring rather than going through the properties. |
+1 for "sig: ". |
The fact that "def " isn't Python syntax is, if anything, a mild point in its favor. I don't want anyone hitting on this by accident. "sig:" isn't Python syntax either, and yet you yourself propose it ;-) How about "sig="? |
That wasn't quite what I meant. "def (a, b, c)" *looks* like Python syntax (aside from the missing function name), but "def (a, b, c, /)" does not. So I consider "def " a misleading prefix. By contrast, neither of these looks like it is trying to be a valid function header, while still hinting strongly that it is signature related: "sig: (a, b, c)" I would also be fine with "sig=" (since humans shouldn't be reading this regardless): "sig=(a, b, c)" |
Here's a first cut at a patch. All signatures now start with "sig=(". I also added a special marker: if the first parameter starts with "$", we know for certain it's a "self" (or "module" or "type") parameter. This means we can lose the heuristics for "do we have a self parameter?", making inspect.Signature a little more bullet-proof. "$" was chosen as it isn't a legal token in Python. |
I like the "sig=" and the "$". There seems to be a small glitch in __rdivmod__: help(int.__rdivmod__)
__rdivmod__(<self>, value)
sig=($self, value)
Returns divmod(value, self). The sig line is shown (and the preferred form is the imperative "Return divmod"). |
Stefan actually picked up on an existing bug there, that this patch just changes the spelling of: >>> int.__rdivmod__.__doc__
'__rdivmod__(self, value)\nReturns divmod(value, self).'
>>> int.__rdivmod__.__text_signature__
'(self, value)' When reviewing Larry's typeobject.c patch, both Guido and I missed the fact that some of the "slot" macros have implicit lines in their docstrings if the signature is common across all instances of that slot: #define UNSLOT(NAME, SLOT, FUNCTION, WRAPPER, DOC) \
ETSLOT(NAME, as_number.SLOT, FUNCTION, WRAPPER, \
NAME "(self)\n" DOC)
#define IBSLOT(NAME, SLOT, FUNCTION, WRAPPER, DOC) \
ETSLOT(NAME, as_number.SLOT, FUNCTION, WRAPPER, \
NAME "(self, value)\nReturns self" DOC "value.")
#define BINSLOT(NAME, SLOT, FUNCTION, DOC) \
ETSLOT(NAME, as_number.SLOT, FUNCTION, wrap_binaryfunc_l, \
NAME "(self, value)\nReturns self" DOC "value.")
#define RBINSLOT(NAME, SLOT, FUNCTION, DOC) \
ETSLOT(NAME, as_number.SLOT, FUNCTION, wrap_binaryfunc_r, \
NAME "(self, value)\nReturns value" DOC "self.")
#define BINSLOTNOTINFIX(NAME, SLOT, FUNCTION, DOC) \
ETSLOT(NAME, as_number.SLOT, FUNCTION, wrap_binaryfunc_l, \
NAME "(self, value)\n" DOC)
#define RBINSLOTNOTINFIX(NAME, SLOT, FUNCTION, DOC) \
ETSLOT(NAME, as_number.SLOT, FUNCTION, wrap_binaryfunc_r, \
NAME "(self, value)\n" DOC) For those, we need to change the macro and then remove the redundant info from the individual slot definitions. |
Aside from that glitch, the new scheme looks good to me. I also like the fact it allows the text signature to be included in the docstring normally if inspect.Signature doesn't support it, so help(f) can still be generated automatically. |
Nick: you lost me. Change the macro how? What is the redundant info? I already changed those macros so they generate signatures, both currently in trunk and in my patch. |
Stefan: I made the change. I left "Implements <something>" alone as "Implement <something>" sounded wrong, but all the rest dropped their s's. |
{"__rdivmod__", __builtin_offsetof (PyHeapTypeObject, as_number.nb_divmod), (void *)(slot_nb_divmod), wrap_binaryfunc_r, "sig=($self, value)\n" "sig=($self, value)\nReturns divmod(value, self)."} There are two "sig" instances. What Nick said is that the situation was pretty {"__rdivmod__", __builtin_offsetof (PyHeapTypeObject, as_number.nb_divmod), (void *)(slot_nb_divmod), wrap_binaryfunc_r, "__rdivmod__" "(self, value)\n" "__rdivmod__(self, value)\nReturns divmod(value, self)."} |
Right, the macros I quoted all include the signature automatically |
Attached is a second patch.
|
I'm surprised it made a review link. It didn't apply cleanly for me here. While merging I noticed that the imperative declension fix had snuck out of the diff somehow. So I redid that. Attached is an updated patch. Also I should mention: clinic.py currently accepts both the old and new comment format. I'll leave support for the old one in until just before the last release candidate. |
Looks good to me :) I also like the fact it simplifies the internal APIs by making it really trivial to detect the presence of a clinic signature from C. |
New changeset d6311829da15 by Larry Hastings in branch 'default': |
Yeah. I did a pretty terrible job of articulating why the "<fn_name>(" signature was a bad idea in the first place ;-) |
I stumble over this because I had already adapted our doctests in Cython to the changed Py3.4 behaviour, so they started failing now because the automatic signature extraction was essentially reverted in CPython. I had started to consider it a feature that CPython 3.4 was finally smart enough to pick up signatures from docstrings, at least for documentation purposes, much the same way that tools like epydoc or Sphinx do it. Cython has a feature to embed signatures for that reason, and so far, they happily ended up in "__text_signature__" in Py3.4. I understand the problem that not all of these signatures are valid Python signatures. What Cython currently generates certainly isn't. The new "sig=" is not supported by any of the existing documentation tools. Having Cython follow here would mean that they would no longer be able to read the signatures, and it's clearly more important for the time being to keep *them* working nicely. This change actually facilitates that, because it leaves the embedded signatures untouched, so that these tools can normally pick them up again. So I agree that the reverted behaviour is in fact better than what Py3.4 previously did. FWIW, I think the best way to go forward would be to try to find a way to map Cython's C signatures directly to a reasonable version of a "__signature__" object. This hasn't appeared entirely trivial so far, but my guess is that the recent requirements on and improvements to the argument clinic should also have made this mapping a little less hard, because there are at least a bit of an infrastructure and some precedents around. Still, Cython's coverage of C/C++ types (also in signatures) is hugely broader than what ac would ever want to support, so we'll have to see what stumbling blocks remain on that road. |
I tried this in Cython and ISTM that the C level parser is a bit too forgiving: def sig(a, b):
"""sig=(a*b)"""
return a * b |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: