This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add contextlib.itercm()
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.6
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: barry, ezio.melotti, martin.panter, ncoghlan, pitrou, rhettinger, serhiy.storchaka, yselivanov
Priority: normal Keywords:

Created on 2015-09-06 12:32 by ezio.melotti, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
itercm-example.txt ezio.melotti, 2015-09-06 12:32 Example usages of itercm().
Messages (11)
msg249991 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2015-09-06 12:32
Add an itercm() function that receives an iterable that supports the context manager protocol (e.g. files) and calls enter/exit without having to use the with statement explicitly.

The implementation is pretty straightforward (unless I'm missing something):

def itercm(cm):
    with cm:
        yield from cm

Example usages:

def cat(fnames):
    lines = chain.from_iterable(itercm(open(f)) for f in fnames)
    for line in lines:
        print(line, end='')

This will close the files as soon as the last line is read.

The __exit__ won't be called until the generator is exhausted, so users should make sure that it is (if they want __exit__ to be called).  __exit__ is still called in case of exception.
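A minimal sketch of this behaviour, assuming a hypothetical `Tracked` class (not taken from the attachment) that records when `__enter__` and `__exit__` run:

```python
def itercm(cm):
    with cm:
        yield from cm


class Tracked:
    """Hypothetical iterable context manager that logs its lifecycle."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, *exc_info):
        self.events.append("exit")
        return False  # do not swallow exceptions

    def __iter__(self):
        yield 1
        yield 2


cm = Tracked()
it = itercm(cm)
first = next(it)                 # runs __enter__, then yields the first item
events_midway = list(cm.events)  # __exit__ has not run at this point
rest = list(it)                  # exhausting the generator runs __exit__
```

After the first `next()` only "enter" has been recorded; "exit" is appended once the generator has been fully consumed.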

Attached a clearer example of how it works.

Do you think this would be a good addition to contextlib (or perhaps itertools)?


P.S. I'm also contemplating the idea of having e.g. it = itercm(fname, func=open) to call func lazily once the first next(it) happens, but I haven't thought in detail about the implications of this.  I also haven't considered how this interacts with coroutines.
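One possible shape for that lazy variant, as a sketch only: the name `itercm_lazy` and the `fake_open` helper are hypothetical. Because a generator body does not start running until the first `next()`, calling `func` inside the generator already defers it.

```python
import io


def itercm_lazy(arg, func=open):
    # Hypothetical variant: func(arg) runs only when iteration starts.
    with func(arg) as cm:
        yield from cm


calls = []


def fake_open(text):
    # Stand-in for open() that records each call and returns an
    # in-memory file, so the example needs no real files.
    calls.append(text)
    return io.StringIO(text)


it = itercm_lazy("a\nb\n", func=fake_open)
calls_before = list(calls)   # still empty: nothing has been opened yet
lines = list(it)             # first next() calls fake_open, then reads
```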
msg249992 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2015-09-06 13:14
FTR one of the reasons that led me to itercm() is:

with open(fname) as f:
    transformed = (transform(line) for line in f)
    filtered = (line for line in transformed if filter(line))
    # ...

Now filtered must be completely consumed before leaving the body of the `with`, otherwise this happens:

>>> with open(fname) as f:
...     transformed = (transform(line) for line in f)
...     filtered = (line for line in transformed if filter(line))
... 
>>> # ...
>>> next(filtered)
ValueError: I/O operation on closed file.
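
The same failure can be reproduced without a real file. This sketch uses `io.StringIO` as a stand-in for `open(fname)` and a single generator expression in place of the pipeline above:

```python
import io

with io.StringIO("spam\neggs\n") as f:
    upper = (line.upper() for line in f)   # lazy: nothing is read yet

# The with block has closed f, so the first read attempt fails.
try:
    next(upper)
    error = None
except ValueError as exc:
    error = exc
```
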

With itercm() it's possible to do:

f = itercm(open(fname))
transformed = (transform(line) for line in f)
filtered = (line for line in transformed if filter(line))
...
# someone consumes filtered down the line lazily
# and eventually the file gets closed

itercm() could also be used (abused?) where a regular `with` would do just fine to save one extra line and indentation level (at the cost of an extra import), e.g.:

def lazy_cat(fnames):
    for fname in fnames:
        yield from itercm(open(fname))

instead of: 

def lazy_cat(fnames):
    for fname in fnames:
        with open(fname) as f:
            yield from f
msg249998 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-09-06 15:55
What if the last line is never read? We had bugs with resource leaks in generators, and I'm not sure that all such bugs were fixed.
msg250012 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2015-09-06 19:15
If you are talking about generators that never get exhausted, the fact that the __exit__ is never invoked is expected and something that developers should take into account while using itercm().
I'm not aware of other generator-related issues that might cause leaks.
msg250027 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-09-06 22:13
I had also thought of this kind of function, but I dismissed it because it would either have to rely on garbage collection or an explicit close() call to close the generator in case the iteration is aborted. I think it might need some kind of “with-for” combined statement added to the language to be bulletproof.

Considering the second example in your script, “exit is called in case of errors”: What is stopping the interpreter from storing the iterator of the current “for” loop in the top-level frame object? Then the iterator would be referenced by the exception traceback, and prevent garbage collection of its itercm() instance. Hypothetically:

__traceback__ → tb_frame → “for” iterator → itercm() instance

Also, I would tend to put this sort of function in “itertools”, since generators are not context managers by design.
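
The explicit close() route can be sketched with `contextlib.closing`, which calls the generator's `close()` when the block ends. The `Resource` class is hypothetical and exists only to make the cleanup observable:

```python
from contextlib import closing


def itercm(cm):
    with cm:
        yield from cm


class Resource:
    """Hypothetical iterable context manager with a visible closed flag."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.closed = True

    def __iter__(self):
        yield from (1, 2, 3)


res = Resource()
with closing(itercm(res)) as items:
    for item in items:
        break                      # abort the iteration early
    closed_inside = res.closed     # generator is suspended, not closed

# closing() has called items.close(), which unwinds the with block
# inside itercm() and so runs Resource.__exit__.
closed_after = res.closed
```

This gives deterministic cleanup on early exit, at the cost of reintroducing the `with` statement that itercm() was meant to avoid.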
msg250028 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-09-06 22:30
__exit__ will still get invoked even if the generator is never exhausted:

>>> def itercm(cm):
...     with cm:
...         yield from cm
... 
>>> class TestCM:
...     def __iter__(self):
...         yield 6
...         yield 9
...         yield 42
...     def __enter__(self):
...         return self
...     def __exit__(self, *args):
...         print("Terminated CM")
... 
>>> itr = itercm(TestCM())
>>> next(itr)
6
>>> del itr
Terminated CM

We addressed the major problems with generators failing to clean up resources back when generator.close() was introduced in PEP 342, and then Antoine addressed the cyclic GC problem in PEP 442.

The key thing that itercm() adds over the status quo is that if the generator *is* exhausted, then the resource *will* be cleaned up immediately. If the generator *isn't* exhausted, then it falls back to non-deterministic GC based cleanup, which is what you'd get today by not using a context manager at all.

To be convinced that we need a third cleanup option beyond "always deterministic" and "always non-deterministic", I'd need some concrete use cases where the success case needs deterministic cleanup, but the error case is OK with non-deterministic cleanup.
msg250029 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-09-06 22:34
I don't think I like this idea. It's not really a common use case (I've never wished I had itercm()) and it will make it possible to write slightly obscure code. Of course you can already write obscure code using itertools :-)
msg250037 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-09-07 02:41
> I don't think I like this idea. It's not really a common use
> case (I've never wished I had itercm()) and it will make it 
> possible to write slightly obscure code. Of course you can
> already write obscure code using itertools :-)

I concur with Antoine and think the world is better off without this one.
msg250144 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2015-09-08 02:04
TBH, I don't like the idea.  It would take me some time to digest the code every time I see this.

> lines = chain.from_iterable(itercm(open(f)) for f in fnames)

This looks like an extremely rare use case.
msg250145 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2015-09-08 02:44
Having spent a few days pondering this after Ezio first mentioned the concept to me on IRC, I'm rejecting this on the basis of "not every 3 line function needs to be in the standard library".

The vast majority of iterable CM use cases are going to fall into one of the following two categories:

* deterministic cleanup is needed regardless of success or failure
* non-deterministic cleanup is acceptable regardless of success or failure

In the former case, you don't continue iteration after the with statement ends; in the latter, you don't need to use the context manager aspects at all.

Those cases which don't fall into either category would generally be better served by an appropriately named helper function that conveys the author's intent, rather than describing what the function does internally:

    def chain_files(names):
        for name in names:
            with open(name) as f:
                yield from f

    def cat(fnames):
        for line in chain_files(fnames):
            print(line, end='')


The implementation of the helper function can then optionally be switched to a compositional approach without affecting the usage:

    def itercm(cm):
        with cm:
            yield from cm

    def iterfile(name):
        return itercm(open(name))

    def chain_files(names):
        return chain.from_iterable(iterfile(name) for name in names)

For the more concrete use case of "iterate over several files in sequence", we have the fileinput module: https://docs.python.org/3/library/fileinput.html

That already opens and closes files as it goes, so chain_files() can actually be implemented as:

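    def chain_files(names):
        # Written as a generator so the input stays open while lines are
        # consumed; fileinput already yields lines, so no chain is needed.
        with fileinput.input(names) as files:
            yield from files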
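
A self-contained sketch of a fileinput-based chain_files(), written as a generator so that the `with` block is still active while lines are being consumed. The temporary file names are made up for the example:

```python
import fileinput
import os
import tempfile


def chain_files(names):
    # The FileInput object stays open while the caller iterates and is
    # closed when the generator finishes or is closed.
    with fileinput.input(names) as files:
        yield from files


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i, text in enumerate(["a\nb\n", "c\n"]):
        path = os.path.join(tmp, f"part{i}.txt")   # hypothetical names
        with open(path, "w") as fh:
            fh.write(text)
        paths.append(path)

    # Consume everything while the temporary files still exist.
    lines = list(chain_files(paths))
```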
msg250187 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2015-09-08 12:02
> Having spent a few days pondering this after Ezio first mentioned the
> concept to me on IRC, I'm rejecting this on the basis of "not every 3 
> line function needs to be in the standard library".

When I first mentioned this to Nick on IRC, the implementation of itercm() was a not-so-trivial function that called __enter__/__exit__ manually while catching StopIteration.  It only occurred to me while posting this issue that the same could be achieved with a simple `yield from` in a `with`.
I also didn't realize that the __exit__ called in case of error in the attached example was triggered by the garbage collector.
I therefore agree that a somewhat obscure and non-deterministic three-liner doesn't belong in the standard library.  Thanks everyone for the feedback!
History
Date User Action Args
2022-04-11 14:58:20 admin set github: 69202
2015-09-08 12:02:19 ezio.melotti set messages: + msg250187
2015-09-08 02:44:31 ncoghlan set status: open -> closed; resolution: rejected; messages: + msg250145; stage: needs patch -> resolved
2015-09-08 02:04:12 yselivanov set nosy: + yselivanov; messages: + msg250144
2015-09-07 13:38:05 barry set nosy: + barry
2015-09-07 02:41:15 rhettinger set messages: + msg250037
2015-09-06 22:34:03 pitrou set messages: + msg250029
2015-09-06 22:30:35 ncoghlan set messages: + msg250028
2015-09-06 22:13:10 martin.panter set nosy: + martin.panter; messages: + msg250027
2015-09-06 19:15:24 ezio.melotti set messages: + msg250012
2015-09-06 15:55:01 serhiy.storchaka set nosy: + serhiy.storchaka, pitrou; messages: + msg249998
2015-09-06 13:14:21 ezio.melotti set messages: + msg249992
2015-09-06 12:32:15 ezio.melotti create