Issue 3907: "for line in file" doesn't work for pipes

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/48157

classification

Title:	"for line in file" doesn't work for pipes
Type:	behavior	Stage:
Components:	Interpreter Core	Versions:	Python 2.7

process

Status:	closed	Resolution:	works for me
Dependencies:		Superseder:
Assigned To:		Nosy List:	amaury.forgeotdarc, endolith, fmoreau
Priority:	normal	Keywords:

Created on 2008-09-19 02:58 by endolith, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg73419 - (view)	Author: (endolith)	Date: 2008-09-19 02:58
One of the principles of Python is that "There should be one-- and preferably only one --obvious way to do it." It seems that the "for line in file" idiom is The Way to iterate over the lines of a file, and older more explicit methods are deprecated.

PEP 234 says that this:

    for line in file:
        ...

is equivalent to this:

    for line in iter(file.readline, ""):
        ...

or this:

    while 1:
        line = file.readline()
        if not line:
            break
        ...

However, "for line in file" does not behave the same as the other two if the file is a named pipe. This is presumably due to the "hidden read-ahead buffer" in the low-level implementation of the next() method of the file iterator (http://docs.python.org/lib/bltin-file-objects.html), meant to increase the speed at which it reads regular physical files. Since not enough data exists in the pipe to fill the buffer yet, the lines are only read in a burst after the buffer has been filled or when the pipe is closed.

My application is monitoring a pipe for new lines from a logging program, and I want each line read as soon as it is written. Sure, there are other ways to get this functionality, but I don't see why "for line in file" shouldn't behave the same way for any file-like object.

I wonder if it can be made to internally use the read-ahead buffer for closed physical files, and a different method for open named pipes. I wonder if reading pipes character-by-character causes any significant slowdown compared to the read-ahead buffer when the pipe resides in memory instead of a disk.

Forgive me if this is not really a bug, but it seems to my beginner eyes that things are not working the way they should.

http://python-forum.org/pythonforum/viewtopic.php?t=9300
http://ubuntuforums.org/showthread.php?t=916518
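
For illustration, the readline-based form mentioned in the report can be sketched as below. This is not code from the report: it assumes an anonymous os.pipe() as a stand-in for the named pipe, and follow_lines is a hypothetical helper name. It shows the readline idiom delivering a line while the writer is still open; it does not reproduce the Python 2 read-ahead delay.

```python
import os


def follow_lines(stream):
    """Yield lines one at a time via readline(), stopping at EOF.

    Hypothetical helper: wraps the iter(file.readline, "") idiom
    quoted from PEP 234 in the report.
    """
    for line in iter(stream.readline, ""):
        yield line


# Assumption: an anonymous pipe stands in for the named pipe in the report.
read_fd, write_fd = os.pipe()
reader = os.fdopen(read_fd, "r")
writer = os.fdopen(write_fd, "w")

# The "logging program" writes one line and flushes it, but stays open.
writer.write("first entry\n")
writer.flush()

lines = follow_lines(reader)
first = next(lines)  # returned although the write end is still open

# A second line followed by closing the writer ends the iteration.
writer.write("second entry\n")
writer.close()
rest = list(lines)
reader.close()

print(repr(first))
print(rest)
```

The design choice here is simply to call readline() once per line, so the reader never waits for more data than one complete line.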
msg73423 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2008-09-19 08:35
Python 2.6 and 3.0 come with a completely new I/O implementation, which correctly handles pipes in this regard (I just tested). http://docs.python.org/dev/library/io.html With the 3.0 version, the built-in open() is an alias for io.open; with 2.6, you have to use io.open() explicitly.
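
A minimal sketch of what this reply describes, under stated assumptions: the pipe is an anonymous os.pipe() rather than a named pipe, and the file objects are created with io.open() on the raw descriptors. It is not the test the commenter ran. It only illustrates that, with the io module, plain iteration can return a line while the write end is still open.

```python
import io
import os

# Assumption: an anonymous pipe stands in for a named pipe.
read_fd, write_fd = os.pipe()

# io.open() accepts an integer file descriptor as well as a path.
reader = io.open(read_fd, "r")
writer = io.open(write_fd, "w")

# Write a single short line and flush it; the writer stays open.
writer.write(u"hello from the pipe\n")
writer.flush()

# Plain iteration ("for line in file") uses the same protocol as next().
line_iter = iter(reader)
first = next(line_iter)

# Closing the write end signals EOF, so the iterator is then exhausted.
writer.close()
remaining = list(line_iter)
reader.close()

print(repr(first))
print(remaining)
```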
msg208474 - (view)	Author: Francis Moreau (fmoreau)	Date: 2014-01-19 11:44
Sorry for reopening this bug, but I agree with the OP, and I can still see the exact same behaviour on Python 2.7.6 (Arch Linux). At the least, the documentation should clarify that "for line in file" is not strictly equivalent to the "readline" way with regard to the buffering policy used with pipes. I'm also dubious about the buffering optimisation for the pipe case, but the readline() documentation should state that it will never use such a buffering mechanism, so that we can safely use it when dealing with pipes. Thanks

History
Date	User	Action	Args
2022-04-11 14:56:39	admin	set	github: 48157
2014-01-19 11:44:27	fmoreau	set	nosy: + fmoreau messages: + msg208474
2011-12-02 21:53:51	terry.reedy	set	versions: + Python 2.7, - Python 2.5
2008-09-19 08:35:12	amaury.forgeotdarc	set	status: open -> closed resolution: works for me messages: + msg73423 nosy: + amaury.forgeotdarc
2008-09-19 02:58:23	endolith	create