New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify that readlines() is not needed to iterate over a file #57719
Comments
I've been looking at code on the tutor mailing list for some time, and for line in file.readlines(): ... is a common idiom there. I suppose this is because the readlines() method is easily discoverable while the proper way (iterate over the file object directly) is not. A note added to the readlines() documentation might help: """ |
This current line is I would like to see added |
I am skeptical that such a note will help. The iterator behavior is clearly pointed out in the Python Tutorial [1] and in the IOBase documentation [2]. I suspect this bad code pattern is just being copied and pasted from other sources without folks ever even looking at the Python documentation. [1] http://docs.python.org/dev/tutorial/inputoutput.html?highlight=readlines#methods-of-file-objects |
FWIW I've seen several persons using "for line in file.readlines(): ..." or even "while 1: line = file.readline()". IMHO it's a good idea to document that "without sizehint, it's equivalent to list(file)" and that "for line in file: ..." can be used directly. Even if some people don't read the doc, the ones who do will benefit from this. The same note might also be added to the docstring (I think it's somewhat common to learn about readlines() through dir(file) + help(file.readlines)). |
file.readlines() (and perhaps dare I say even file.readline()) should not even be mentioned in the tutorial, IMO. It is difficult to imagine a use case where just iterating over the file object isn't superior. I cannot remember the last time that I have used either of these methods. They ought to be relegated to the library docs. Presenting `for line in a_file:` as merely "An alternative approach" in the official tutorial is practically archaic. |
readlines might be discouraged; readline has its use cases (two days ago I used it to get one line to pass to csv.Sniffer.sniff), but next(file) works too, so it could be de-emphasized. There may be a difference with respect to the trailing newline however; I don’t remember if __iter__ or readline keep it or not. |
Working on a patch for this. |
After reading the comments, I generated a patch that does the following:
Any feedback is more then welcome! Thanks. |
In 3.x I think we'll want to drop the following sentence: "Since the two approaches manage line buffering differently, they should not be mixed". But it can wait for another issue. |
I would actually remove the whole section about readlines() or possibly just mention it briefly (something like "If you want to read all the lines of a file in a list you can also use f.readlines().") |
You are right, Ezio. |
I agree with removing .readlines section. If .readlines did not exist, I do not think we would add it now. |
Added a new version of the patch to incorporate Ezio's comment! |
Patch LGTM. I think it would also be better to say something like "Note that it's already possible to iterate on file objects using |
Agreed Ezio, I've updated the patch to include the change to Doc/library/io.rst:readlines. |
New changeset 1e8be05a4039 by Ezio Melotti in branch '3.3': New changeset 7f4325dc4256 by Ezio Melotti in branch 'default': New changeset 6a4746b0afaf by Ezio Melotti in branch '2.7': |
Fixed, thanks for the patch! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: