classification
Title: Confirm the types of parameters of traceback.format_list and traceback.StackSummary.from_list post-3.5
Type: behavior Stage:
Components: Documentation Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Jelle Zijlstra, Nathaniel Manista, berker.peksag, docs@python, srittau
Priority: normal Keywords:

Created on 2018-09-12 20:03 by Nathaniel Manista, last changed 2018-09-24 15:41 by Jelle Zijlstra.

Messages (7)
msg325175 - (view) Author: Nathaniel Manista (Nathaniel Manista) Date: 2018-09-12 20:03
So I'm fixing a bug in typeshed's accounting of the traceback module (https://github.com/python/typeshed/pull/2436) and the documented semantics of traceback.format_list don't quite smell to me what I think they might be intended to be:

1) I know it has the name "format_list", but is it really intended to require a list? Why not a sequence, or a collection, or an iterable? I would think it would be fine to pass an iterable to traceback.format_list. Is it fine?

2) What is the desired component type for the aggregate passed to format_list? In 3.4-and-earlier it was Tuple[str, int, str, Optional[str]], and that still works in 3.5-through-3.8, but is that just backwards compatibility or is that something that users of traceback.format_list should feel encouraged to continue passing into the future?

Sorry for filing a bug just to ask "huh; really?" but... please confirm?
msg325176 - (view) Author: Nathaniel Manista (Nathaniel Manista) Date: 2018-09-12 20:29
... and while we're here, how about StackSummary.from_list's "a_list" parameter as well:

3) Can it be Iterable? It looks like it can be Iterable? Is it fine for it to be Iterable?

4) Should the component type of "a_list" be FrameSummary? Is the support for Tuple[str, int, str, Optional[str]] merely backward-looking, or is new code encouraged to pass Tuple[str, int, str, Optional[str]] objects?
msg325313 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-14 00:14
> 1) I know it has the name "format_list", but is it really intended to require a
> list? Why not a sequence, or a collection, or an iterable? I would think it would
> be fine to pass an iterable to traceback.format_list. Is it fine?

In 3.4, format_list() was documented to accept return values of extract_tb() and extract_stack() functions and they both were returned lists:

    def extract_tb(tb, limit=None):
        return list(_extract_tb_iter(tb, limit=limit))

    def extract_stack(f=None, limit=None):
        stack = list(_extract_stack_iter(_get_stack(f), limit=limit))
        stack.reverse()
        return stack

I don't think we support the "pass manually created iterables" case here.

> 2) What is the desired component type for the aggregate passed to
> format_list? In 3.4-and-earlier it was Tuple[str, int, str,
> Optional[str]], and that still works in 3.5-through-3.8, but is that
> just backwards compatibility [...]

Yes, the old API is still supported for backwards compatibility reasons.
msg325895 - (view) Author: Nathaniel Manista (Nathaniel Manista) Date: 2018-09-20 16:09
> In 3.4, format_list() was documented to accept return values of extract_tb() > and extract_stack() functions and they both were returned lists:
> 
>     def extract_tb(tb, limit=None):
>         return list(_extract_tb_iter(tb, limit=limit))
> 
>     def extract_stack(f=None, limit=None):
>         stack = list(_extract_stack_iter(_get_stack(f), limit=limit))
>         stack.reverse()
>         return stack

I think that’s an unnecessarily and extraordinarily narrow reading - you could use the same reading to make a judgement of “format_list may only be passed values returned by extract_tb and extract_stack rather than other values of the same type”, and that would be absurd.

I don’t think it’s a wise choice to say “because the motivation for offering the format_list function was to work with values returned by extract_tb and extract_stack, the allowed inputs to format_list should be restricted to just those types”.

Why not support Iterable[FrameSummary]? A choice to support Iterable[FrameSummary] *is also a choice to support List[FrameSummary]*. It's just a "wider" choice; it’s not a breaking choice to drop support for List[FrameSummary].

> I don't think we support the "pass manually created iterables" case here.

What’s a “manually created iterable”? What kinds of iterables aren’t “manually created iterables”? Can their differences be captured in the type system?

> Yes, the old API is still supported for backwards compatibility reasons.

This is a very flat statement that invites questions by what it doesn’t say and by the way it describes something that is true today without saying for how long into the future that thing will remain true. For how long will the old API be supported for backwards compatibility reasons? Is new code encouraged to use the new API? If the new API is better, shouldn’t new code use it? If the new API isn’t better, why was it introduced?
msg325911 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2018-09-20 17:16
> Why not support Iterable[FrameSummary]?

The question that needs to be answered here is "why we should support Iterable[FrameSummary]?" and you're one the one who needs to answer it instead of saying our design decisions were "absurd" and "not wise", without giving any concrete use cases for your "wider" design choice.

Initial docstring for format_list() were added in 2000: https://github.com/python/cpython/commit/e7b146fb3bdca62a0d5ecc06dbf3348e5a4fe757#diff-e57ff53a569d0ebbe201ad0c102ee27e

    Given a list of tuples as returned by extract_tb() or
    extract_stack(), return a list of strings ready for printing.

There are no tests for iterables other than list at https://github.com/python/cpython/blob/master/Lib/test/test_traceback.py 

So we won't change documentation and docstrings, add more tests without seeing concrete use cases.

> What’s a “manually created iterable”?

I meant an iterable that is extracted from a traceback object manually without using extract_tb() or extract_stack() functions (by using a custom function, an external dependency, or HTTP API) How people can get Iterable[FrameSummary] as an input and pass it to format_list()?

> For how long will the old API be supported for backwards
> compatibility reasons?

I don't remember any plans to remove the support for the old API. And I'm pretty sure we won't do anything until 2.7 is out of maintenance.

> Is new code encouraged to use the new API? If the new API is better,
> shouldn’t new code use it? If the new API isn’t better, why was it
> introduced?

Are you serious? No, we've introduced the new API and spent weeks designing it just to play with people. They should stick with the old one for no reason.
msg326241 - (view) Author: Nathaniel Manista (Nathaniel Manista) Date: 2018-09-24 14:32
> The question that needs to be answered here is "why we should support
> Iterable[FrameSummary]?" and you're one the one who needs to answer
> it

Okay, here are a few reasons:
1) A function that requires that an input be a List invites the question “why is an Iterable, or a Collection, or a Sequence, not sufficient?”. Of course the best way to handle that question is to widen the API to accept at least a Sequence so that the question isn’t even asked, but the next-best way to handle the question is with an answer like “because the function mutates the given List”. But that’s not the answer to the question here (which is a big part of why I opened the bug, asking the question more directly). So far the answer seems to be “because we wanted to be conservative in our API design” and “because we didn’t want to support what we weren’t testing”. Those are good answers, but I’d like to persuade you that they are no longer good enough.
2) Because List is invariant in element type whereas Sequence, Collection, and Iterable are covariant (I'm nearly certain) in element type, format_list is hard to use with type-checked code (see the discussion in https://github.com/python/typeshed/pull/2436 if you’re not following it already). Should the typeshed-canonicalized type of format_list’s “extracted_list” parameter be “List[FrameSummary]”? If so, then given a subclass MyFrameSummarySubclass of FrameSummary, List[MyFrameSummarySubclass] would *not* be valid to pass to format_list. That seems like a poor developer experience, right? If the type of “extracted_list” were widened to Sequence[FrameSummary], Collection[FrameSummary], or (ideally) Iterable[FrameSummary], then an expression of type Iterable[MyFrameSummarySubclass] would be perfectly fine to pass for “extracted_list”.
3) I tried to pass an Iterable[FrameSummary] to traceback.format_list in my own code. My situation is not itself open source but is a bit like this:

    def _FormatFrames(frame) -> str:
        """Generate a traceback of the current position of the thread in the code."""
        frame_tuples = []
        while frame:
            filename = <censored>
            lineno = frame.f_lineno
            line = <censored>
            frame_tuples.append((filename, lineno, frame.f_code.co_name, line,))
            frame = frame.f_back
        formatted_frame_strings = traceback.format_list(
            traceback.FrameSummary(file_name, line_number, frame_name, line=line)
            for file_name, line_number, frame_name, line in frame_tuples)
        return ''.join(formatted_frame_strings)

.

> instead of saying our design decisions were "absurd" and "not wise",
> without giving any concrete use cases for your "wider" design choice.

According to what we’ve learned over the course of the life and use of Python, and according to what we understand as best practices in the year 2018: I believe it is absolutely unwise to have offer an API that requires that a given parameter be a List without documenting the reasons sustaining that extremely limited design. Why won’t a Sequence suffice? Heck, why won’t some other implementation of MutableSequence suffice? Notice that I opened this bug with the question “wait, really?” rather than the statement “this is wrong and must be changed” - it might be the case that List is the best choice for the type of the parameter (though I now doubt it even more) but certainly it’s clear that *the documentation does not establish or communicate why a List is required*.

> Initial docstring for format_list() were added in 2000:
> https://github.com/python/cpython/commit/e7b146fb3bdc#diff-e57ff53a569d0ebbe201ad0c102ee27e
>
>     Given a list of tuples as returned by extract_tb() or
>     extract_stack(), return a list of strings ready for printing.

I have no doubt that that’s what was written in 2000, when we didn’t know nearly as much as we do today and when what were considered best practices were radically different. Now we know more - particularly for this case “never ask for a List where asking for a MutableSequence will do, and never ask for a MutableSequence where asking for a Sequence will do, and never ask for a Sequence where asking for a Collection will do, and never ask for a Collection where asking for an Iterable will do”. The big question for format_list is “will an Iterable do?”. If not, that’s strange and unique by today’s standards and practices and the function’s documentation should at least give some enlightenment as to why it is the case.

> There are no tests for iterables other than list at
> https://github.com/python/cpython/blob/master/Lib/test/test_traceback.py 

I’d be happy to add some… except test_traceback.py doesn’t appear to contain any direct testing of format_list, and I can’t find “the format_list tests” to which it alludes (https://github.com/python/cpython/search?q=format_list&unscoped_q=format_list). Is there any direct call to traceback.format_list under Lib/test? Where should tests of passing an Iterable to format_list be added?

> So we won't change documentation and docstrings, add more tests without seeing concrete
> use cases.

I wouldn’t be here if I were personally affected in my own use case, and as more and more users start statically verifying their code with mypy and pytype, more and more of them will be affected. This problem is worth solving.

> I meant an iterable that is extracted from a traceback object manually without
> using extract_tb() or extract_stack() functions (by using a custom function, an
> external dependency, or HTTP API) How people can get
> Iterable[FrameSummary] as an input and pass it to format_list()?

FrameSummary is a public class with a public constructor. Anyone can construct instances of it for any reason and aggregate them in any Iterable they please; the question before us is whether or not format_list should accept any such Iterable or should insist that callers re-aggregate their FrameSummarys in a List. I’ve only described here the only use case that I happen personally to have encountered, but I think it’s worth bearing in mind that *public code elements invite public use*, so we shouldn’t be surprised when FrameSummary and format_list get used for purposes for which they appear well-suited, but which weren’t necessarily envisioned.

> Are you serious?

Yes.

> No, we've introduced the new API and spent weeks designing it just
> to play with people. They should stick with the old one for no reason.

Thank you for this very strong affirmation, which is more constructive than I suspect you meant it to be. Note that the documentation of format_list doesn’t mention a preference for passing FrameSummarys over passing Tuples, but that may simply be a consequence of not writing the documentation with passing of caller-constructed-values-that-were-not-returned-by-extract_tb-or-extract_stack in mind. Over in https://github.com/python/typeshed/pull/2436 there’s some back-and-forth over what the *element type* of “extracted_list” should be, and I’m in favor of going the extra mile and making it FrameSummary in 3.5-and-later rather than Union[Tuple[str, int, str, Optional[str]], FrameSummary]. It’s reasonable to say that if an author is being diligent enough to type-check their code, we can presume that they are diligent enough to use an API according to current best practices rather than making use of a backwards compatibility carve-out, right? It sounds like you might feel similarly?
msg326254 - (view) Author: Jelle Zijlstra (Jelle Zijlstra) * (Python triager) Date: 2018-09-24 15:41
> How people can get Iterable[FrameSummary] as an input and pass it to format_list()?

If I want to format a traceback, but omit traceback lines that refer to a particular module (for example, code for a coroutine runner), I could write `format_list(filter(lambda fs: 'asyncio' not in fs.filename, extract_stack())`.

I would also favor broadening the documented argument type for `format_list()` to say it accepts an iterable of FrameSummaries. No code change is necessary and it makes the code more flexible.
History
Date User Action Args
2018-09-24 15:41:06Jelle Zijlstrasetnosy: + Jelle Zijlstra
messages: + msg326254
2018-09-24 14:32:24Nathaniel Manistasetmessages: + msg326241
2018-09-20 17:16:37berker.peksagsetmessages: + msg325911
2018-09-20 16:09:59Nathaniel Manistasetmessages: + msg325895
2018-09-14 00:14:28berker.peksagsetnosy: + berker.peksag
messages: + msg325313
2018-09-13 00:00:59srittausetnosy: + srittau
2018-09-12 20:29:15Nathaniel Manistasetmessages: + msg325176
title: Confirm the type of traceback.format_list post-3.5 -> Confirm the types of parameters of traceback.format_list and traceback.StackSummary.from_list post-3.5
2018-09-12 20:03:24Nathaniel Manistacreate