classification
Title: itertools.combinations could be lazier
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.9, Python 3.5
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: furiel, rhettinger, xtreak
Priority: normal Keywords:

Created on 2019-07-24 18:18 by furiel, last changed 2019-07-25 00:58 by rhettinger. This issue is now closed.

Messages (4)
msg348395 - (view) Author: Antal Nemes (furiel) Date: 2019-07-24 18:18
Reproducible with current master (3.9, 151b91dfd21a100ecb1eba9e293c0a8695bf3bf5)

I would expect itertools.combinations to be lazy in the sense that it should not exhaust the input iterator in constructor time.

import itertools;
itertools.combinations(itertools.count(),2)

should return instantly. Instead it "hangs" until the process is killed.

Similarly, one can reproduce with the following simple crafted generator:

Python 3.9.0a0 (heads/master-dirty:151b91d, Jul 24 2019, 19:51:53) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def mygenerator_with_exception():
    yield 2
    yield 2
    yield 3
    raise Exception("Should not be raised")
... ... ... ... ... 
>>> g = mygenerator_with_exception()
>>> import itertools             
>>> itertools.combinations(g,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in mygenerator_with_exception
Exception: Should not be raised

Could you please consider making itertools.combinations truely lazy?
msg348397 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-07-24 18:26
See also related discussion https://bugs.python.org/msg276855
msg348401 - (view) Author: Antal Nemes (furiel) Date: 2019-07-24 18:59
Thanks for sharing the discussion above. I did not know this was discussed earlier.

Indeed, I do not come from a real world example. I ran into this problem while solving an online coding challenge that also measures performance. I got the right answer, just took too long time to calculate. In the crafted testcase the stop condition could have occurred within the first few elements of the input iterator, but the execution took longer because of exhausting all elements.

I could pass the challenge by adding my own lazy_combination, that is basically a wrapper around itertools.combinations. 

I came up with this version, which is of course incomplete for production. Not too difficult, but it was not straightforward either.

def lazy_combinations(g, n):
    try:
        known_elements = []
        for i in range(n-1):
            known_elements.append(next(g))

        while True:
            final_element = next(g)
            for i in itertools.combinations(known_elements, n-1):
                next_element = i+(final_element,)
                yield next_element
            known_elements.append(final_element)
    except StopIteration:
        pass

What I would like to say is the demand for such behavior might be out there. I understand that such feature would induce complexity for the core. However, if this is not part of the core, the complexity does not disappear, just manifests elsewhere: users might need to spend some time with the debugging, and also write some nontrivial code as a workaround.
msg348422 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-07-25 00:58
Thank you for the suggestion, but I'm going to decline for the reasons articulated in the link.

Consider posting you code to the Python package index where others can use if the need arises.
History
Date User Action Args
2019-07-25 00:58:37rhettingersetstatus: open -> closed
resolution: rejected
messages: + msg348422

stage: resolved
2019-07-24 18:59:33furielsetmessages: + msg348401
2019-07-24 18:26:36xtreaksetnosy: + xtreak
messages: + msg348397
2019-07-24 18:19:31xtreaksetnosy: + rhettinger
2019-07-24 18:18:17furielcreate