classification
Title: Add an optional "strict" check to zip
Type: enhancement
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: Provide a strict form of zip (PEP-618) requiring same length inputs
View: 40636
Assigned To: brandtbucher
Nosy List: brandtbucher, cool-RR, serhiy.storchaka, steven.daprano, vstinner
Priority: normal
Keywords:

Created on 2020-04-21 16:32 by brandtbucher, last changed 2020-06-17 14:51 by brandtbucher. This issue is now closed.

Messages (8)
msg366926 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-04-21 16:32
As discussed on Python-ideas:

https://mail.python.org/archives/list/python-ideas@python.org/thread/6GFUADSQ5JTF7W7OGWF7XF2NH2XUTUQM/

When a keyword-only argument "strict=True" is passed to zip's constructor, a ValueError will be raised in the case where one iterator is exhausted before the others. Otherwise, no side-effects (such as iterator consumption) will be changed.

I do wonder if we can use a better keyword than "strict" here.

I'm currently working on an implementation, and @cool-RR is working on tests and docs.
msg366927 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-21 16:35
It would be better to implement it as a separate function.
msg366928 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-04-21 16:36
Also you can just get a ready implementation from more-itertools.
msg366930 - (view) Author: Ram Rachum (cool-RR) * Date: 2020-04-21 16:44
Here are the tests I made: https://github.com/cool-RR/cpython/commit/766409748a107f290997b0cfab5aa19d0c2888e5
msg366946 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-04-21 22:05
Slight edit: if the shortest iterator is "first", one additional item will have to be drawn from the next non-exhausted iterator. I missed that, initially.

> It would be better to implement it as a separate function.

I disagree. It's not intrusive here; I think the handful of new lines this needs to fit into the existing zip implementation (which already handles most of this logic) is more maintainable than a whole new object/function living somewhere else.

> Also you can just get a ready implementation from more-itertools.

I've found that there are several places where our own standard library could immediately benefit from this change.
msg366948 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-04-21 22:55
I don't think this is needed in the builtin zip at all. I think that there is no consensus on Python-Ideas that this is needed or desirable.

I especially don't think the API should be a keyword flag on zip. Flag arguments which change the behaviour of functions are at best a code-smell and at worst an outright anti-pattern.

It is not always practical to avoid them, but in this case it certainly is: if we need this (I'm not sure we do) then a separate zip_strict() function in itertools next to zip_longest() is better than a flag on the builtin zip.

(That's not to say that the zip_strict iterator must be an independent class to the builtin zip and itertools.zip_longest, they can share a common backend. It is the public API I am referring to.)

I've already posted an implementation on the mailing list; it is about half a dozen or so lines of Python. Another independent implementation is available from the current development branch of more-itertools, more or less the same, only with a less informative error message.
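For readers without the mailing-list post to hand, a pure-Python version of roughly that size might look like the sketch below. This is not the implementation posted on the list or the one in more-itertools; it is one common approach built on itertools.zip_longest with a private sentinel, and the error text is invented for the example.

```python
from itertools import zip_longest


def zip_strict(*iterables):
    """Yield tuples like zip(), but raise ValueError if the lengths differ."""
    sentinel = object()  # unique fill value that no caller can pass in
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # A sentinel in the tuple means at least one input ran out early.
        if any(item is sentinel for item in combo):
            raise ValueError("zip_strict() arguments have different lengths")
        yield combo


equal = list(zip_strict("ab", [1, 2]))

try:
    list(zip_strict("abc", [1, 2]))
    raised = False
except ValueError:
    raised = True
```

Because this is a generator, the ValueError only surfaces once iteration reaches the mismatch, after the matching pairs have already been yielded.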

Personally, the fact that this has only just hit more-itertools counts as a point against this function to me: more-itertools is the "everything including the kitchen sink" grab-bag of iterator tools, and even they didn't think they needed "zip_equal" until a version released literally a few days ago. It's so new it isn't even documented yet:

https://pypi.org/project/more-itertools/
msg366949 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2020-04-21 22:56
> independent class

Oops, sorry I mean independent implementation.
msg371747 - (view) Author: Brandt Bucher (brandtbucher) * (Python committer) Date: 2020-06-17 14:51
Looks like two issues were created. I'm going to close this one in favor of 40636, which has PRs attached and is specific to PEP 618.
History
Date                 User              Action  Args
2020-06-17 14:51:12  brandtbucher      set     status: open -> closed
                                               versions: + Python 3.10, - Python 3.9
                                               superseder: Provide a strict form of zip (PEP-618) requiring same length inputs
                                               messages: + msg371747
                                               resolution: duplicate
                                               stage: resolved
2020-04-21 22:56:29  steven.daprano    set     messages: + msg366949
2020-04-21 22:55:41  steven.daprano    set     nosy: + steven.daprano
                                               messages: + msg366948
2020-04-21 22:05:23  brandtbucher      set     messages: + msg366946
2020-04-21 16:44:54  cool-RR           set     messages: + msg366930
2020-04-21 16:39:58  vstinner          set     nosy: + vstinner
2020-04-21 16:36:40  serhiy.storchaka  set     messages: + msg366928
2020-04-21 16:35:22  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg366927
2020-04-21 16:32:58  brandtbucher      create