Author rami
Recipients docs@python, rami
Date 2017-08-24.16:03:32
Message-id <1503590613.37.0.607996999117.issue31270@psf.upfronthosting.co.za>
Content
The documentation for itertools.zip_longest contains a "roughly equivalent" pure-Python implementation of the function, intended to help the user understand what zip_longest does on a functional level.

However, the given implementation is hard to read for newcomers and experienced Python programmers alike: it uses a custom-defined exception for control flow, a nested function, a condition that is always true if any arguments are passed ("while iterators"), and two other non-trivial functions from itertools (chain and repeat).

For future reference, this is the currently given implementation:

    class ZipExhausted(Exception):
        pass

    def zip_longest(*args, **kwds):
        # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        counter = len(args) - 1
        def sentinel():
            nonlocal counter
            if not counter:
                raise ZipExhausted
            counter -= 1
            yield fillvalue
        fillers = repeat(fillvalue)
        iterators = [chain(it, sentinel(), fillers) for it in args]
        try:
            while iterators:
                yield tuple(map(next, iterators))
        except ZipExhausted:
            pass

This is far more complex than necessary to teach the concept of zip_longest. With this issue, I will submit a pull request with a new example implementation that seems to be at the same level of "roughly equivalent" but is much easier to read, since it uses only two loops and no complicated control flow:

    def zip_longest(*args, **kwds):
        # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        iterators = [iter(it) for it in args]

        while True:
            exhausted = 0
            values = []

            for it in iterators:
                try:
                    values.append(next(it))
                except StopIteration:
                    values.append(fillvalue)
                    exhausted += 1

            if exhausted < len(args):
                yield tuple(values)
            else:
                break


Looking at the C code of the actual implementation, I don't see that either of the two implementations is obviously "more equivalent". I'm unsure about performance. I haven't benchmarked them, but I don't think that's the point of this learning implementation.
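
As a rough sanity check of the "roughly equivalent" claim, the proposed implementation can be compared with the real itertools.zip_longest on a few inputs. This is only a sketch: the name zip_longest_sketch and the hand-picked cases are assumptions made for illustration, and a handful of cases is no substitute for the test suite.

```python
from itertools import zip_longest as builtin_zip_longest


def zip_longest_sketch(*args, **kwds):
    # Copy of the proposed implementation, renamed (hypothetical name)
    # so that it does not shadow itertools.zip_longest.
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in args]

    while True:
        exhausted = 0
        values = []

        for it in iterators:
            try:
                values.append(next(it))
            except StopIteration:
                values.append(fillvalue)
                exhausted += 1

        if exhausted < len(args):
            yield tuple(values)
        else:
            break


# Hand-picked inputs (an assumption for illustration, not an exhaustive test):
# uneven lengths, default fillvalue, no arguments, and an empty iterable.
cases = [
    (('ABCD', 'xy'), {'fillvalue': '-'}),
    (('ABCD', 'xy'), {}),
    ((), {}),
    (([], [1, 2], range(3)), {'fillvalue': 0}),
]


def outputs_match(args, kwds):
    # Compare the fully consumed output of both versions.
    expected = list(builtin_zip_longest(*args, **kwds))
    actual = list(zip_longest_sketch(*args, **kwds))
    return actual == expected


results = [outputs_match(args, kwds) for args, kwds in cases]
print(results)
```

Agreement on these cases only shows that the two produce the same tuples for well-formed input; it says nothing about error handling or pickling.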

I ran all tests from Lib/test/test_itertools.py against both the old and the new implementation. The new implementation fails three tests, while the old implementation fails four. Two of the remaining failures are related to TypeErrors not being raised on invalid input, and one is related to pickling the resulting object. I believe all three are fine to ignore in this sample, as they are not relevant to the documentation purpose.

Therefore, I believe the documentation should be changed as suggested. I'd be happy for any feedback or further ideas to improve its readability!
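
To illustrate what a TypeError-related difference can look like, here is a minimal sketch. Which test cases actually fail is not stated above, so the two inputs below (a non-iterable argument and an unexpected keyword argument) are assumptions chosen for illustration, as are the names zip_longest_sketch and raises_type_error.

```python
from itertools import zip_longest as builtin_zip_longest


def zip_longest_sketch(*args, **kwds):
    # Copy of the proposed implementation, renamed (hypothetical name).
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in args]

    while True:
        exhausted = 0
        values = []

        for it in iterators:
            try:
                values.append(next(it))
            except StopIteration:
                values.append(fillvalue)
                exhausted += 1

        if exhausted < len(args):
            yield tuple(values)
        else:
            break


def raises_type_error(func, *args, **kwds):
    """Return True if merely calling func (without iterating) raises TypeError."""
    try:
        func(*args, **kwds)
    except TypeError:
        return True
    return False


# Non-iterable argument: the built-in rejects it at call time, while a
# generator function runs none of its body until the first next().
eager_builtin = raises_type_error(builtin_zip_longest, 3)
eager_sketch = raises_type_error(zip_longest_sketch, 3)

# The generator version still raises, but only once iteration starts.
try:
    next(zip_longest_sketch(3))
    deferred_error = False
except TypeError:
    deferred_error = True

# Unexpected keyword argument: kwds.get('fillvalue') silently ignores it,
# whereas the built-in raises TypeError.
kw_builtin = raises_type_error(builtin_zip_longest, 'ab', bogus=1)
kw_sketch = raises_type_error(zip_longest_sketch, 'ab', bogus=1)

print(eager_builtin, eager_sketch, deferred_error, kw_builtin, kw_sketch)
```

Differences of this kind concern input validation rather than the zipping behavior the documentation sample is meant to teach.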
History
Date                 User  Action  Args
2017-08-24 16:03:33  rami  set     recipients: + rami, docs@python
2017-08-24 16:03:33  rami  set     messageid: <1503590613.37.0.607996999117.issue31270@psf.upfronthosting.co.za>
2017-08-24 16:03:33  rami  link    issue31270 messages
2017-08-24 16:03:32  rami  create