
classification
Title: Unable to iterate over lines in a file without a block of code
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Dennis Sweeney, eric.smith, jaraco, njs, serhiy.storchaka
Priority: normal Keywords:

Created on 2022-01-08 04:09 by jaraco, last changed 2022-04-11 14:59 by admin.

Messages (8)
msg410075 - Author: Jason R. Coombs (jaraco) (Python committer) Date: 2022-01-08 04:09
I'd like to be able to do something pretty fundamental: lazily load lines from a file in a single expression.

Best I can tell, that's not possible in the language without triggering warnings.

One can use 'open' but that triggers a ResourceWarning:

```
$ PYTHONWARNINGS='error' python -c "lines = open('/dev/null'); tuple(lines)"
Exception ignored in: <_io.FileIO name='/dev/null' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.TextIOWrapper name='/dev/null' mode='r' encoding='UTF-8'>
```

One can use a `with` statement, but that requires a block of code and can't be written easily in a single expression. One can use `pathlib.Path.read_text().splitlines()`, but that loads the whole file into memory.

This issue affected the pip-run project, which required 5 new lines in order to make such an expression possible (https://github.com/jaraco/pip-run/commit/e2f395d8814539e1da467ac09295922d8ccaf14d).

Can the standard library supply a function or method that would provide this behavior?
msg410078 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2022-01-08 06:54
Hi, Jason.

How about:

>>> from pathlib import Path
>>> Path("foo.txt").read_text().splitlines()
['how', 'now', 'brown', 'cow']

Not the most elegant thing, I'll admit.
msg410090 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2022-01-08 10:13
A warning is an indication of possible bugs in your code. If you do not close the file explicitly and it is closed by the garbage collector, the time of closing is undetermined. This can exhaust file descriptors if you have many open files waiting to be closed in reference loops.

If you want to get rid of the warnings, use a warning filter that ignores the specific warnings. Or, better, rewrite your code so that file closing is deterministic.
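
The two options in this message can be sketched roughly as follows. This is only an illustration: the helper name `count_lines` and the temporary sample file are assumptions, not part of the report.

```python
import tempfile
import warnings
from pathlib import Path


def count_lines(path):
    """Hypothetical helper: count lines with deterministic closing."""
    # The with block closes the file as soon as the block exits,
    # so the garbage collector is never responsible for closing it.
    with open(path, encoding="utf-8") as stream:
        return sum(1 for _ in stream)


# Option 1: ignore only ResourceWarning and leave other warnings untouched.
# This hides the symptom; an unclosed file is still closed at an
# undetermined time.
warnings.filterwarnings("ignore", category=ResourceWarning)

# Option 2: deterministic closing through the helper above.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("how\nnow\nbrown\ncow\n", encoding="utf-8")
    line_count = count_lines(sample)
```

The filter is the weaker of the two choices, since it leaves the descriptor lifetime unchanged and only suppresses the report.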
msg410114 - Author: Jason R. Coombs (jaraco) (Python committer) Date: 2022-01-08 20:42
Hi Eric. I did mention that option in my report, but that option requires loading the whole file into memory. I'd like something equivalent to the iterator that `open()` provides, which yields lines lazily.

Serhiy, thanks for the feedback. I do indeed not want to rely on the implicit closing of the file handle. I'd instead like a helper function/method that will close the file after the iterator is consumed.

Something like:

def read_lines(path):
    with path.open() as strm:
        yield from strm

What I'm seeking is for that block to be placed somewhere in the stdlib so that I don't have to copy it into every project that needs/wants this behavior.
msg410115 - Author: Dennis Sweeney (Dennis Sweeney) (Python committer) Date: 2022-01-08 20:47
I think this might require something like PEP 533 in order to be safe.
msg410118 - Author: Jason R. Coombs (jaraco) (Python committer) Date: 2022-01-08 21:17
Nice reference. Indeed, the [rationale](https://www.python.org/dev/peps/pep-0533/#id15) of that PEP gives a similar example, and the [background](https://www.python.org/dev/peps/pep-0533/#id3) describes the problem with the solution I've drafted above (it still depends on garbage collection to close the file).

I guess that means that in general, it's not possible to accomplish what I'd like.

I'm looping in njs, the author of the PEP.
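
To make the concern concrete, here is a small sketch of when the file behind the drafted `read_lines` generator actually gets closed. The extra `opened` parameter and the temporary sample file are assumptions added only so the file object can be inspected; they are not part of the draft above.

```python
import tempfile
from pathlib import Path


def read_lines(path, opened):
    """The drafted generator, recording its file object in `opened`.

    The `opened` list is an assumption added for inspection only.
    """
    with path.open(encoding="utf-8") as strm:
        opened.append(strm)
        yield from strm


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    path.write_text("how\nnow\nbrown\ncow\n", encoding="utf-8")

    opened = []
    lines = read_lines(path, opened)
    first = next(lines)  # consume a single line, then stop
    open_after_partial_read = not opened[0].closed

    # Nothing has closed the file at this point. Without an explicit
    # close(), cleanup waits until the generator object is finalized.
    lines.close()  # raises GeneratorExit inside the generator
    closed_after_close = opened[0].closed

    # Consuming the iterator fully also exits the with block.
    opened_full = []
    all_lines = list(read_lines(path, opened_full))
    closed_after_exhaustion = opened_full[0].closed
```

The partially consumed case is the one that still relies on finalization, which is the gap PEP 533 is concerned with.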
msg410122 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2022-01-08 22:32
I can't believe I missed that, Jason. I even read it twice!

I think this could go in pathlib, along with read_text. Maybe read_lines, or iter_lines, or something. Of course PEP 533 is needed, too.
msg410137 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2022-01-09 10:57
The safe way of using read_lines() is:

    lines = read_lines(path)
    try:
        for line in lines:
            ...  # use line
    finally:
        lines.close()

or

    with contextlib.closing(read_lines(path)) as lines:
        for line in lines:
            ...  # use line

And it is in no way better than using "with open()" directly.
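
As a rough, runnable comparison of the two styles, the sketch below stops reading early in both. The function names `first_nonblank_closing` and `first_nonblank_open` and the sample file are hypothetical and exist only for illustration.

```python
import contextlib
import tempfile
from pathlib import Path


def read_lines(path):
    """The generator proposed earlier in the thread."""
    with path.open(encoding="utf-8") as strm:
        yield from strm


def first_nonblank_closing(path):
    """Hypothetical caller: stop early and let closing() do the cleanup."""
    with contextlib.closing(read_lines(path)) as lines:
        for line in lines:
            if line.strip():
                return line.rstrip("\n")
    return None


def first_nonblank_open(path):
    """The same logic using open() directly."""
    with path.open(encoding="utf-8") as strm:
        for line in strm:
            if line.strip():
                return line.rstrip("\n")
    return None


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    path.write_text("\nhow\nnow\n", encoding="utf-8")
    via_closing = first_nonblank_closing(path)
    via_open = first_nonblank_open(path)
```

Both callers need a `with` block to be safe on early exit, so the generator saves nothing over calling `open()` in place.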

I think it is better not to add such a thing to the stdlib, because it only creates an illusion of safety while actually removing safety guards.

If we want to use generators in expressions, we need to add support for "with" in expressions and comprehensions.

    data = json.load(f) with open(path, 'rb') as f
    lines = (line.strip() for path in files with open(path) as f for line in f)
History
Date | User | Action | Args
2022-04-11 14:59:54 | admin | set | github: 90462
2022-01-09 10:57:23 | serhiy.storchaka | set | messages: + msg410137
2022-01-08 22:32:41 | eric.smith | set | messages: + msg410122
2022-01-08 21:17:57 | jaraco | set | nosy: + njs; messages: + msg410118
2022-01-08 20:47:48 | Dennis Sweeney | set | nosy: + Dennis Sweeney; messages: + msg410115
2022-01-08 20:42:45 | jaraco | set | type: enhancement; messages: + msg410114
2022-01-08 10:13:41 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg410090
2022-01-08 06:54:56 | eric.smith | set | nosy: + eric.smith; messages: + msg410078
2022-01-08 04:09:44 | jaraco | create |