Title: Gzipping subprocess output produces invalid .gz file
Messages (2)
msg404853 - (view) Author: Michael Herrmann Date: 2021-10-23 06:13
Consider the following:

import gzip
import subprocess

with gzip.open('test.gz', 'wb') as f:
    subprocess.run(['echo', 'hi'], stdout=f)

with gzip.open('test.gz', 'rb') as f:
    print(f.read())

I'd expect "hi" to appear in my console. Instead, I'm getting "OSError: Not a gzipped file (b'hi')". I am attaching test.gz.

This appears for me on Debian 10 / Python 3.7 and Debian 11 / Python 3.9. I have not yet tested on other OSs and Python versions.

The reason I expect the above to work is that the subprocess documentation states that the stdout parameter may be "an existing file object", and the documentation for gzip.open() states that it returns a file object.

Maybe this is related to #40885?
msg404858 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2021-10-23 07:27
The subprocess module only uses the file object to get a file handle by calling the "fileno" method. See Issue 19992 about documenting this. For Python to compress the output of the child process, you would need a pipe.
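
A minimal sketch of the pipe approach, for illustration only. The helper name gzip_command_output and the temporary file path are assumptions made up for this example, and the Python interpreter stands in for "echo hi" so that the snippet does not depend on a shell utility:

```python
import gzip
import os
import shutil
import subprocess
import sys
import tempfile


def gzip_command_output(cmd, path):
    """Run cmd and store its stdout, gzip-compressed, at path.

    Hypothetical helper: the child writes raw bytes into a pipe, and
    Python reads them back and compresses them through the gzip object.
    """
    with gzip.open(path, 'wb') as f:
        with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
            # Copy everything the child writes into the compressing file object.
            shutil.copyfileobj(proc.stdout, f)
        # Leaving the Popen context closes the pipe and waits for the child.
        return proc.returncode


path = os.path.join(tempfile.gettempdir(), 'test.gz')
returncode = gzip_command_output([sys.executable, '-c', 'print("hi")'], path)

with gzip.open(path, 'rb') as f:
    data = f.read()
print(data)
```

Here the file on disk only ever receives bytes that went through f.write(), so it can be read back with gzip.open().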

Gzip file objects provide the "fileno" method, but it just returns the underlying file descriptor. Data that Python writes to that file descriptor has normally already been compressed and goes straight to the OS, so anything the child process writes there is stored uncompressed. There is also Issue 24358 opened about whether "fileno" should be implemented.
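
The effect can be reproduced without a subprocess. The following sketch writes directly to the descriptor returned by fileno(), which is roughly what a child process does with it; the file name fileno_demo.gz is an assumption chosen for this example:

```python
import gzip
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), 'fileno_demo.gz')

with gzip.open(path, 'wb') as f:
    # fileno() is the descriptor of the underlying file, so this write
    # never passes through the compressor.
    os.write(f.fileno(), b'hi\n')

# Inspect the raw bytes on disk, without gzip.
with open(path, 'rb') as raw:
    contents = raw.read()

try:
    with gzip.open(path, 'rb') as f:
        f.read()
    readable = True
except OSError:
    # The file does not start with the gzip magic number.
    readable = False

print(contents[:3], readable)
```

The raw bytes land at the start of the file, ahead of anything the gzip object flushes later, which would be consistent with the "Not a gzipped file (b'hi')" error in the report.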
