classification
Title: add type signatures to library function docs
Type: enhancement
Stage: resolved
Components: Documentation
Versions: Python 3.9
process
Status: closed
Resolution: postponed
Dependencies:
Superseder:
Assigned To:
Nosy List: docs@python, gvanrossum, phr, rhettinger, veky
Priority: normal
Keywords:

Created on 2019-10-01 04:21 by phr, last changed 2019-10-03 14:40 by gvanrossum. This issue is now closed.

Messages (15)
msg353634 - (view) Author: paul rubin (phr) Date: 2019-10-01 04:21
It would be nice if the library reference manual had type signatures for all the stdlib functions at some point.  It might be possible to extract a lot of them automatically from typeshed and semi-automatically paste them into the doc files.  It might also be ok to do this gradually.  I can help with this but wouldn't want to take on the entire task.
msg353635 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-10-01 04:55
I believe there was an explicit decision not to do this.  For the time being, optional typing is still optional.
msg353636 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-10-01 05:00
Even if this is not desirable for the source code, it still makes sense to add the types to the docs, at least when they are simple and compact. The proposed notation from PEP 604 ('|' for unions) might help with compactness. Many docs are unnecessarily vague about types.
msg353638 - (view) Author: Vedran Čačić (veky) * Date: 2019-10-01 06:42
> Many docs are ... vague about types.

... and I consider that a feature. At least if you do that, make an explicit decision not to introduce TypeErrors for "disagreeing with the documented signature".

For example, I'd be ok with sum being documented as taking an iterable of numbers and returning a number, but it would be a catastrophe if as a consequence of that, sum of a list of lists would stop working.
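
To make the sum example concrete, here is a minimal sketch of the behavior in question. The variable names are made up for illustration; only the built-in sum() is real.

```python
# sum() is described in terms of numbers, but at runtime it only needs "+" support.
nested = [[1, 2], [3], [4, 5]]

# Passing an empty list as the start value makes "+" mean list concatenation.
flattened = sum(nested, [])
print(flattened)

# Without an explicit start value the default start of 0 is used,
# so adding an int to a list raises TypeError.
raised_type_error = False
try:
    sum(nested)
except TypeError:
    raised_type_error = True
print(raised_type_error)
```

A signature such as "iterable of numbers -> number" would describe the common case but not the first call above, which is the tension being discussed.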
msg353706 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-10-01 17:55
I'm sorry, but that's unacceptable.
msg353776 - (view) Author: paul rubin (phr) Date: 2019-10-02 21:42
Yes, the suggestion was just for the docs, and since those are intended for human rather than machine consumption, it's fine if there are some blurry cases where there is no signature.  Ideally in those cases, the issue should be explained in the doc text.

I actually don't see what's wrong with including signatures in the source code as well, as long as doing so doesn't break anyone's existing code.  I agree with Veky that one should be very hesitant about breaking existing working code, even if that code relies on undocumented behavior.
msg353778 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-10-02 22:13
@phr

To be clear, I agree that there's nothing wrong with adding signatures to docs. We just need to find a way to do it. There will definitely be some cases where it's better not to have a type rather than trying to spell out the actual type in the docs.

My "unacceptable" comment was meant in response to Vedran's suggestion that it would be okay to lie in the docs about the signature for sum(). If the truth is too subtle to use a specific type signature we should keep the words. (The words for sum() are actually pretty clear.)

FWIW: My objection against vague docs was specifically about situations where the word "string" is used without clarifying if this allows bytes. I've also seen docs that were even more vague, e.g. "a name" or "a filename".

Signatures in the code won't "break" the code (they are ignored at runtime) but if present they should nevertheless be precise since they will be used by type checkers. Signatures in code are *not* just documentation. Only in very limited situations would I be okay with lies in signatures -- this would have to be done on a case by case basis.
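
A small sketch of the "ignored at runtime" point, assuming a hypothetical function named double. The interpreter stores the annotations but does not check arguments against them; only external tools such as type checkers do.

```python
def double(x: int) -> int:
    """Annotated for int, but the body works for anything that supports '*'."""
    return x * 2


# The annotation is not enforced at runtime, so a str argument is accepted.
result = double("ab")
print(result)

# The annotations are still recorded on the function object for tools to read.
print(double.__annotations__)
```

A type checker would flag the double("ab") call, which is why signatures in code need to be precise even though they cannot break running programs.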
msg353807 - (view) Author: Vedran Čačić (veky) * Date: 2019-10-03 03:02
In that case, I'm pretty sure you'd never be able to document almost _any_ function signature. Python is simply not a statically typed language, and we love it because of that.

Ok, go to the list of builtins, and start alphabetically. First is abs. What type does it take, and what type does it return? Again, I'd be completely ok with saying it takes an int, a float, or a complex, and returns either an int or a float. The same as the words in the docs already say. But according to Guido, that's "unacceptable", since abs can also take (and return) a datetime.timedelta, for example.

I am very afraid that if we start doing this, we will lose _many_ useful features that make Python the language it is. It's really not worth it.
msg353811 - (view) Author: paul rubin (phr) Date: 2019-10-03 03:38
abs takes any value that understands the __abs__ method and returns something of the same type.  In fact there is already a type protocol for it:

https://mypy.readthedocs.io/en/stable/protocols.html#supportsabs-t

So abs's signature would be (x : Abs[T]) -> T where T is a type parameter.  

I'm sure there are some examples where no good signature is possible, but lots of others are fine.  Someone did a Smalltalk study long ago and found that most functions were monomorphic in practice even though Smalltalk is dynamically typed like Python.  As a matter of style, Python code tends to be typed even when it doesn't have to be.  Not all the time of course.

I'm still getting used to types and mypy (I was a py2 holdout til quite recently, and mypy has been a more attractive reason to change than any of the other stuff) and I do keep noticing cases that don't work as I hoped, but it's still a good move in general.
msg353815 - (view) Author: Vedran Čačić (veky) * Date: 2019-10-03 06:20
Well, yes, if you're going to invent a special typeclass for every protocol, then you can document any signature. But what purpose does it serve? Abs to me seems like a hack, not something we really wanted to capture with the type system.

Do you find (x : Abs[T]) -> T in any way clearer than what's currently written in the docs? Do we really want to move in that direction? And not to mention that "... returns something of the same type" is _still_ incorrect -- for example, for complex it returns float.
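
For reference, a sketch of how the protocol in the typing module behaves. The wrapper name absolute is hypothetical. In typing.SupportsAbs[T], the parameter T stands for whatever __abs__ returns, so it does not have to match the argument type.

```python
from datetime import timedelta
from typing import SupportsAbs, TypeVar

T = TypeVar("T")


def absolute(x: SupportsAbs[T]) -> T:
    """Hypothetical wrapper: T is the return type of x.__abs__()."""
    return abs(x)


int_result = absolute(-3)                        # int in, int out
delta_result = absolute(timedelta(days=-1))      # timedelta in, timedelta out
complex_result = absolute(3 + 4j)                # complex in, float out

print(type(int_result).__name__)
print(type(delta_result).__name__)
print(type(complex_result).__name__)

# SupportsAbs is runtime-checkable, so isinstance() only tests for __abs__.
print(isinstance(3 + 4j, SupportsAbs))
```

Read this way, the signature does cover the complex case, though whether it is clearer than the prose in the docs is a separate question.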
msg353817 - (view) Author: paul rubin (phr) Date: 2019-10-03 06:41
At first glance, having a typeclass for each protocol (at least the widely used ones) seems fine.  It's inherent in Haskell and a lot of libraries are organized around a common set of typeclasses--look up "Typeclassopedia" for descriptions of them.  Certainly in the case of abs (which you asked about by name), the typeclass is already there in the typing module.

You're right about abs(complex) returning float.  Haskell's type signature for abs is "Num a => a -> a" which means the input type is the same as the output.  That is a little bit peculiar since the abs of a complex number is complex, but we usually think of abs as a mathematical norm, which is conventionally a real.

Anyway, "abs(x: Abs[T]) -> Any"  is a better-than-nothing signature for abs, and the docs can comment a little further, and maybe someday it could even do a type-level lookup at typechecking time, i.e. have something like Haskell's "type families".

I like to think it's possible to supply reasonable signatures for most functions.  I just fixed a bug in something today because bs4 (beautiful soup 4) has no typeshed stub so mypy uses Any for functions like soup.find, instead of Optional[tag].  So the program worked fine as long as find kept returning a tag, but then crashed because it hit a document without a tag and my code didn't check for None.  That's something more precise types would have caught at mypy time.
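
A rough sketch of that kind of bug, without using bs4 itself. The functions find_tag and title_length are hypothetical stand-ins; the dict lookup only imitates a search that may find nothing.

```python
from typing import Dict, Optional


def find_tag(document: Dict[str, str], name: str) -> Optional[str]:
    """Hypothetical lookup: returns the tag text, or None if it is missing."""
    return document.get(name)


def title_length(document: Dict[str, str]) -> int:
    tag = find_tag(document, "title")
    # With an Optional[str] return type, a checker such as mypy can require
    # this None check before len() is called on the result.
    if tag is None:
        return 0
    return len(tag)


print(title_length({"title": "abc"}))
print(title_length({}))
```

If find_tag were typed as returning Any, dropping the None check would pass type checking and only fail at runtime on a document without the tag.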
msg353818 - (view) Author: Vedran Čačić (veky) * Date: 2019-10-03 06:48
Your arguments in my view boil down to "Haskell is a nice language" and "Static typing is useful". It still doesn't change my argument that Python is genetically neither of these. And we will cripple it greatly if we try to push it in that direction.

[Your problem with bs4 is just a facet of a billion dollar mistake (treating null as an object), that's present in many statically typed languages even today. Optional is great for explaining to humans, but very hard to get right when talking to computers.]
msg353819 - (view) Author: paul rubin (phr) Date: 2019-10-03 06:58
I don't think we're going to accomplish anything continuing the eternal static-vs-dynamic debate, which in Python's case has already been resolved by adding optional static typing.  It's a done deal, and the issue here is just how to document it.  Erlang (via the Erlang Dialyzer), Clojure, and Racket have all been down a similar road and gained some value from it.

Haskell's Maybe type (its version of Option) works fine.  In Python the convention of returning None for a missing value is not great, but we are stuck with it and Optional[whatever] is helpful.
msg353847 - (view) Author: Vedran Čačić (veky) * Date: 2019-10-03 13:20
https://www.python.org/dev/peps/pep-0484/#non-goals

I really have nothing more to say.
msg353851 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2019-10-03 14:40
Since this discussion went down in flames quickly, I'm closing it. Another issue can be opened once someone has a concrete design of what this should look like in the formatted docs, an example that does this for a few libraries, and a prototype of a tool that extracts signatures from typeshed.
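
As a very rough idea of what such an extraction prototype could start from, here is a sketch that pulls function signatures out of stub-style source text with the ast module. The stub text is written inline for illustration and is not copied from typeshed; extract_signatures is a hypothetical helper, and ast.unparse requires Python 3.9 or later.

```python
import ast

# Hypothetical stub text, written inline for illustration only.
STUB_SOURCE = '''
def abs(x: SupportsAbs[T]) -> T: ...
def len(obj: Sized) -> int: ...
'''


def extract_signatures(stub_source: str) -> dict:
    """Map each top-level function name to a one-line signature string."""
    signatures = {}
    for node in ast.parse(stub_source).body:
        if isinstance(node, ast.FunctionDef):
            # Fall back to "Any" when the stub gives no return annotation.
            returns = ast.unparse(node.returns) if node.returns else "Any"
            params = ast.unparse(node.args)
            signatures[node.name] = f"{node.name}({params}) -> {returns}"
    return signatures


for signature in extract_signatures(STUB_SOURCE).values():
    print(signature)
```

A real tool would also have to handle overloads, classes, version checks in stubs, and the mapping from signatures to the formatted docs, none of which this sketch attempts.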
History
Date User Action Args
2019-10-03 14:40:44  gvanrossum  set  status: open -> closed; resolution: postponed; stage: resolved
2019-10-03 14:40:28  gvanrossum  set  messages: + msg353851
2019-10-03 13:20:11  veky  set  messages: + msg353847
2019-10-03 06:58:07  phr  set  messages: + msg353819
2019-10-03 06:48:42  veky  set  messages: + msg353818
2019-10-03 06:41:52  phr  set  messages: + msg353817
2019-10-03 06:20:42  veky  set  messages: + msg353815
2019-10-03 03:38:34  phr  set  messages: + msg353811
2019-10-03 03:02:46  veky  set  messages: + msg353807
2019-10-02 22:13:00  gvanrossum  set  messages: + msg353778
2019-10-02 21:42:35  phr  set  messages: + msg353776
2019-10-01 17:55:25  gvanrossum  set  messages: + msg353706
2019-10-01 06:42:22  veky  set  nosy: + veky; messages: + msg353638
2019-10-01 05:00:06  gvanrossum  set  assignee: gvanrossum -> ; messages: + msg353636
2019-10-01 04:55:46  rhettinger  set  assignee: docs@python -> gvanrossum; nosy: + gvanrossum
2019-10-01 04:55:18  rhettinger  set  nosy: + rhettinger; messages: + msg353635
2019-10-01 04:21:40  phr  create