Issue 41406: subprocess: Calling Popen.communicate() after Popen.stdout.read() returns an empty string

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/85578

classification

Title:	subprocess: Calling Popen.communicate() after Popen.stdout.read() returns an empty string
Type:	behavior	Stage:
Components:	Library (Lib)	Versions:	Python 3.10

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	Frost Ming, gregory.p.smith, gstarck, vstinner
Priority:	normal	Keywords:

Created on 2020-07-27 08:26 by Frost Ming, last changed 2022-04-11 14:59 by admin.

Messages (4)
msg374366 - (view)	Author: Frost Ming (Frost Ming) *	Date: 2020-07-27 08:26
The following snippet behaves differently on Windows and POSIX:

    import subprocess
    import time

    p = subprocess.Popen("ls -l", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print(p.stdout.read(1))  # read 1 byte
    print(p.communicate())  # returns empty output

It works fine on Windows and on Python 2.x (communicate() returns the remaining output), so my best guess is that returning the remaining output is the expected behavior.

The reason is that Popen.stdout is a BufferedReader. When read() is called, it pulls the available output into its internal buffer. However, communicate() and the lower-level _communicate() use os.read() to get the output, which bypasses that buffer. When os.read() returns empty output, the file object is then closed.

This is my first bug report, so pardon me if I got anything wrong.
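The buffering mismatch described in the report can be reproduced without subprocess at all. The following is a minimal sketch, not part of the original report: it assumes a plain os.pipe() with a small payload and wraps the read end in a BufferedReader via open(), which is roughly what Popen does for stdout=PIPE with the default bufsize. The payload and variable names are illustrative only.

```python
import os

# Create a pipe, write a small payload, and close the write end so that
# readers see EOF once the payload has been consumed.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello world")
os.close(write_fd)

# Opening the descriptor in binary mode with default buffering yields a
# BufferedReader, similar to Popen.stdout with the default bufsize.
reader = open(read_fd, "rb")

# Asking for 1 byte makes the BufferedReader pull everything currently
# available from the pipe into its own buffer.
first = reader.read(1)

# os.read() goes straight to the file descriptor and cannot see the
# BufferedReader's buffer, so it finds the pipe already drained.
leftover_raw = os.read(read_fd, 1024)

# The "missing" bytes are still reachable through the buffered object.
buffered_rest = reader.read()
reader.close()

print(first, leftover_raw, buffered_rest)
```

In this sketch the raw os.read() call plays the role that _communicate() plays on POSIX: it sees end-of-file even though unread data is still sitting in the buffered object.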
msg374436 - (view)	Author: Grégory Starck (gstarck) *	Date: 2020-07-27 23:12
also affecting 3.6
msg374769 - (view)	Author: STINNER Victor (vstinner) *	Date: 2020-08-03 23:30
Calling proc.communicate() after proc.stdout.read() doesn't seem to be supported. What is your use case? Why not just call communicate()? Why not just use stdout directly?
msg374783 - (view)	Author: Gregory P. Smith (gregory.p.smith) *	Date: 2020-08-04 01:55
A workaround should be to pass bufsize=0. There might be performance consequences, depending on your read patterns and the child process.

If this is to be supported and fixed: the selectors used in Popen._communicate on the POSIX side presumably don't bother to look at the buffered IO objects' buffers. https://github.com/python/cpython/blob/master/Lib/subprocess.py#L1959 Manually consuming data from the stdout and stderr buffers, if any, before entering that loop is probably a fix.

Higher up the chain, should https://docs.python.org/3/library/selectors.html be enhanced to support emptying the buffer on buffered IO objects? That sounds complicated, and probably even infeasible in text mode. In general it is understood that poll/select type APIs are meant to be used on unbuffered raw binary file objects.
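The bufsize=0 workaround mentioned above might look like the sketch below. It is an illustration under stated assumptions rather than a recommended pattern: the child command (a Python one-liner run through sys.executable, chosen so the example does not depend on ls) and the variable names are hypothetical and do not come from the original report.

```python
import subprocess
import sys

# Hypothetical child process that prints a short, known line.
child = [sys.executable, "-c", "print('hello world')"]

# bufsize=0 makes p.stdout and p.stderr unbuffered raw file objects, so
# read(1) consumes exactly one byte from the pipe and nothing more.
p = subprocess.Popen(
    child,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    bufsize=0,
)

first = p.stdout.read(1)    # one byte, no hidden read-ahead
rest, err = p.communicate() # the remaining output is still in the pipe

# Strip the trailing newline, which differs between platforms.
combined = (first + rest).strip()
print(combined)
```

With unbuffered pipes each read() maps to a system call, which is where the possible performance cost comes from when the caller performs many small reads.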

History
Date	User	Action	Args
2022-04-11 14:59:34	admin	set	github: 85578
2020-08-04 01:55:39	gregory.p.smith	set	messages: + msg374783
2020-08-03 23:30:07	vstinner	set	title: BufferedReader causes Popen.communicate losing the remaining output. -> subprocess: Calling Popen.communicate() after Popen.stdout.read() returns an empty string nosy: + gregory.p.smith messages: + msg374769 versions: - Python 3.6, Python 3.7, Python 3.8, Python 3.9 components: - 2to3 (2.x to 3.x conversion tool), IO
2020-07-27 23:12:07	gstarck	set	nosy: + gstarck messages: + msg374436 versions: + Python 3.6
2020-07-27 16:41:48	brett.cannon	set	nosy: - brett.cannon
2020-07-27 08:27:52	Frost Ming	set	type: behavior
2020-07-27 08:26:25	Frost Ming	create