
classification
Title: Gzipping subprocess output produces invalid .gz file
Type: behavior
Stage:
Components:
Versions: Python 3.9, Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: martin.panter, mherrmann.at
Priority: normal
Keywords:

Created on 2021-10-23 06:13 by mherrmann.at, last changed 2022-04-11 14:59 by admin.

Files
File name  Uploaded                          Description
test.gz    mherrmann.at, 2021-10-23 06:13
Messages (2)
msg404853 - Author: Michael Herrmann (mherrmann.at) Date: 2021-10-23 06:13
Consider the following:

import gzip
import subprocess

with gzip.open('test.gz', 'wb') as f:
    subprocess.run(['echo', 'hi'], stdout=f)

with gzip.open('test.gz', 'rb') as f:
    print(f.read())

I'd expect "hi" to appear in my console. Instead, I'm getting "OSError: Not a gzipped file (b'hi')". I am attaching test.gz.

This happens for me on Debian 10 / Python 3.7 and Debian 11 / Python 3.9. I have not yet tested other operating systems or Python versions.

I expect the above to work because the subprocess documentation states that the stdout parameter may be "an existing file object", and the documentation for gzip.open(...) states that it returns a file object.

Maybe this is related to #40885?
msg404858 - Author: Martin Panter (martin.panter) (Python committer) Date: 2021-10-23 07:27
The subprocess module uses the file object only to obtain a file descriptor, by calling its "fileno" method; it never calls the object's "write" method. See Issue 19992 about documenting this. For Python to compress the output of the child process, you would need a pipe.
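
As an illustration of the pipe approach, here is a minimal sketch rather than a definitive implementation. The helper name gzip_command_output is hypothetical. It assumes only the child's stdout needs capturing and that streaming the bytes through the Python process is acceptable.

```python
import gzip
import shutil
import subprocess


def gzip_command_output(cmd, path):
    """Run cmd and store its stdout, gzip-compressed, at path.

    Hypothetical helper for illustration; returns the child's exit code.
    """
    with gzip.open(path, 'wb') as f:
        # stdout=PIPE hands Python the readable end of a pipe, so the
        # bytes pass through GzipFile.write() instead of being written
        # directly to the descriptor returned by f.fileno().
        with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
            shutil.copyfileobj(proc.stdout, f)
    # Leaving the Popen context manager waits for the child, so
    # returncode is set by this point.
    return proc.returncode
```

With this helper, gzip_command_output(['echo', 'hi'], 'test.gz') should leave a file that gzip.open('test.gz', 'rb') can read back.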

Gzip file objects provide a "fileno" method, but it just returns the file descriptor of the underlying file. Data written to that descriptor is normally already compressed by Python, and it goes straight to the OS, so the child's raw output lands in the file uncompressed. There is also Issue 24358, opened about whether "fileno" should be implemented.
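
To make the "fileno" behaviour concrete, the following sketch writes to the descriptor directly, which is roughly what the child process does with the inherited handle. The helper name write_via_fileno is hypothetical, and the exact byte layout of the resulting file depends on buffering details that are not guaranteed.

```python
import gzip
import os


def write_via_fileno(path, payload=b'hi\n'):
    """Write payload straight to the descriptor behind a GzipFile.

    Hypothetical helper for illustration; returns the raw bytes that
    end up on disk.
    """
    with gzip.open(path, 'wb') as f:
        # os.write() targets the OS-level descriptor, so the payload
        # never passes through GzipFile.write() and is not compressed.
        os.write(f.fileno(), payload)
    with open(path, 'rb') as raw:
        return raw.read()
```

The returned bytes contain the payload verbatim, which is consistent with the "Not a gzipped file (b'hi')" error in the original report.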
History
Date                 User           Action  Args
2022-04-11 14:59:51  admin          set     github: 89748
2021-10-23 07:27:35  martin.panter  set     nosy: + martin.panter; messages: + msg404858
2021-10-23 06:13:57  mherrmann.at   create