classification
Title: Incomplete gzip output with tarfile.open(fileobj=..., mode="w:gz")
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: lars.gustaebel, martin.panter, nadeem.vawda, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2014-01-13 10:24 by martin.panter, last changed 2014-01-18 15:18 by serhiy.storchaka. This issue is now closed.

Files
File name: tarfile_fobj_gz_close.patch (uploaded by serhiy.storchaka, 2014-01-13 14:37)
Messages (3)
msg208017 - Author: Martin Panter (martin.panter) (Python committer) Date: 2014-01-13 10:24
I am trying to create a tar file after opening it as a temporary file, and it seems to be writing truncated output when I use mode="w:gz". My workaround looks like it will be to use mode="w|gz" instead. It’s not clear what the difference is: am I losing anything practical by disallowing “random” seeking?

Simplified demonstration:

>>> from io import BytesIO; b = BytesIO()
>>> import tarfile; t = tarfile.open(fileobj=b, mode="w:gz")
>>> t._extfileobj
True
>>> type(t.fileobj)
<class 'gzip.GzipFile'>
>>> t.close()
>>> b.getvalue()
b'\x1f\x8b\x08\x00]\xb8\xd3R\x02\xff'
>>> del t
>>> b.getvalue()
b"\x1f\x8b\x08\x00]\xb8\xd3R\x02\xff\xed\xc1\x01\r\x00\x00\x00\xc2\xa0\xf7Om\x0e7\xa0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x807\x03\x9a\xde\x1d'\x00(\x00\x00"
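
One way to check for this kind of truncation without inspecting the bytes by eye is to try decompressing the buffer after close(). The following is a minimal sketch, not part of the original report; the helper names make_tar_gz and is_complete_gzip and the member name hello.txt are made up for illustration. On an affected interpreter the first check would be expected to fail, and on a fixed one it should pass.

```python
import gzip
import io
import tarfile
import zlib


def make_tar_gz(payload: bytes) -> bytes:
    """Write a one-member tar.gz archive into a caller-owned BytesIO."""
    buf = io.BytesIO()
    tar = tarfile.open(fileobj=buf, mode="w:gz")
    info = tarfile.TarInfo(name="hello.txt")  # hypothetical member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
    tar.close()
    # Read the buffer immediately after close(), as in the report above.
    return buf.getvalue()


def is_complete_gzip(data: bytes) -> bool:
    """Return True if data decompresses as a whole gzip stream."""
    try:
        gzip.decompress(data)
    except (EOFError, OSError, zlib.error):
        return False
    return True


archive = make_tar_gz(b"hello")
print("complete after close():", is_complete_gzip(archive))
# A bare 10-byte gzip header, like the truncated output shown above,
# is not a complete stream.
print("header only:", is_complete_gzip(archive[:10]))
```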

Looking at the code, the TarFile.close() method would not be closing the GzipFile object because of this condition:

if not self._extfileobj:
    self.fileobj.close()

Perhaps it needs to also check if file compression is being used or something; I’m not familiar with the internals.
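
The effect of not closing the GzipFile can be seen in isolation from tarfile. This is only an illustrative sketch (the names raw, gz, before_close and after_close are invented for the example): GzipFile.close() flushes the remaining compressed data and the trailer, and it leaves a caller-supplied fileobj open, so closing the wrapper should be safe even when the underlying file is external.

```python
import gzip
import io

raw = io.BytesIO()  # stands in for the caller-owned file object
gz = gzip.GzipFile(fileobj=raw, mode="wb")
gz.write(b"tar data would go here")

# Before close(), the compressed stream has not been finished.
before_close = raw.getvalue()

gz.close()

# After close(), the remaining data and the gzip trailer are present,
# and the underlying BytesIO is still open for the caller to use.
after_close = raw.getvalue()
print(len(before_close), len(after_close), raw.closed)
```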

I did notice that the bug happens with Python 3.3.3 and 3.2.3, but not 2.7.5. Also, I do not see the issue when a file name is passed to tarfile.open() rather than a file object, nor do I see it with the bzip or XZ compressors, uncompressed tar creation, or the “special purposes” w|gz mode.
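
For reference, the w|gz workaround mentioned above might look like the sketch below. The member name example.txt and the payload are placeholders. Stream mode writes the archive sequentially and never seeks on the file object, which is sufficient for simply creating an archive.

```python
import io
import tarfile

buf = io.BytesIO()

# "w|gz" writes a gzip-compressed tar stream without seeking on buf.
with tarfile.open(fileobj=buf, mode="w|gz") as tar:
    payload = b"example payload"
    info = tarfile.TarInfo(name="example.txt")  # placeholder member name
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# The caller still owns buf; rewind it and read the archive back.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
    content = tar.extractfile("example.txt").read()

print(names, content)
```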
msg208026 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2014-01-13 14:37
Thank you, Martin, for your report.

Here is a patch which fixes this issue.
msg208400 - Author: Roundup Robot (python-dev) Date: 2014-01-18 13:56
New changeset 5c69332dc3b0 by Serhiy Storchaka in branch '3.3':
Issue #20238: TarFile opened with external fileobj and "w:gz" mode didn't
http://hg.python.org/cpython/rev/5c69332dc3b0

New changeset e154b93f3857 by Serhiy Storchaka in branch 'default':
Issue #20238: TarFile opened with external fileobj and "w:gz" mode didn't
http://hg.python.org/cpython/rev/e154b93f3857

New changeset f7381f1bf1ec by Serhiy Storchaka in branch '2.7':
Backported test for issue #20238.
http://hg.python.org/cpython/rev/f7381f1bf1ec
History
Date User Action Args
2014-01-18 15:18:49 serhiy.storchaka set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2014-01-18 13:56:27 python-dev set nosy: + python-dev; messages: + msg208400
2014-01-13 20:01:41 serhiy.storchaka link issue20243 dependencies
2014-01-13 14:37:55 serhiy.storchaka set files: + tarfile_fobj_gz_close.patch; versions: + Python 3.4; messages: + msg208026; assignee: serhiy.storchaka; keywords: + patch; stage: patch review
2014-01-13 12:38:25 serhiy.storchaka set nosy: + lars.gustaebel, nadeem.vawda, serhiy.storchaka
2014-01-13 10:24:43 martin.panter create