
Author bartolsthoorn
Recipients bartolsthoorn
Date 2014-09-23.08:49:51
Message-id <1411462192.65.0.458428981621.issue22468@psf.upfronthosting.co.za>
In-reply-to
Content
CPython's tarfile `gettarinfo` method uses `os.fstat` on the file object's descriptor to determine the size of a file. When that file object was created with `gzip.open` (so it is a `GzipFile`), fstat reports the compressed size of the file. The `addfile` method then reads the uncompressed data from the gzip file object, but only that many bytes, which is too few, so the resulting tar contains truncated files.
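The mismatch between the two sizes can be seen without tarfile at all. The following is a minimal sketch, assuming a throwaway file named `sample.txt.gz` (a hypothetical name) written to a temporary directory; it compares what fstat reports with what reading the object returns.

```python
import gzip
import os
import tempfile

# A highly compressible payload, so the two sizes differ clearly.
payload = b"a" * 100_000

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt.gz")  # hypothetical file name
    with gzip.open(path, "wb") as f:
        f.write(payload)

    with gzip.open(path, "rb") as f:
        # fileno() belongs to the underlying compressed file, so fstat
        # reports the on-disk (compressed) size.
        size_from_fstat = os.fstat(f.fileno()).st_size
        # Reading the object yields the decompressed bytes.
        size_from_read = len(f.read())

print(size_from_fstat, size_from_read)
```

For this payload the fstat size is far smaller than the number of bytes read, which is the number that `addfile` would need.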

I suggest checking the class of the file object before using fstat to determine the size, and emitting a warning if it is a gzip file.
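As a rough illustration of that suggestion, the check could look something like the sketch below. The helper name `warn_if_gzip` and the warning text are hypothetical and not part of tarfile; the sketch also assumes the object is a `gzip.GzipFile`, which is what `gzip.open` returns in binary mode (text mode wraps it in another object that this check would miss).

```python
import gzip
import io
import warnings


def warn_if_gzip(fileobj):
    """Hypothetical helper: warn when fileobj is a gzip.GzipFile.

    Returns True if a warning was issued, False otherwise.
    """
    if isinstance(fileobj, gzip.GzipFile):
        warnings.warn(
            "fstat() on a GzipFile reports the compressed size; "
            "set TarInfo.size explicitly",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False
```

A caller could run this on the file object before `gettarinfo` and fall back to computing the size by hand when it returns True.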

To clarify, this only happens when adding a gzip file object to a tar archive. I know this is not a common scenario, and the underlying problem is that the size of a gzip file's contents can only be determined reliably by decompressing and reading it entirely, but I think it would be nice not to fail silently.
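Determining the size by decompressing does not require holding the whole file in memory. A minimal sketch, assuming a hypothetical helper `uncompressed_size` and an arbitrary 64 KiB chunk size, counts the decompressed bytes in a streaming fashion:

```python
import gzip
import io


def uncompressed_size(path_or_fileobj, chunk_size=64 * 1024):
    """Hypothetical helper: count decompressed bytes by reading in chunks."""
    total = 0
    # gzip.open accepts either a filename or an existing binary file object.
    with gzip.open(path_or_fileobj, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

This still decompresses the entire file once just to learn its size, which is the cost the paragraph above refers to.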

This example fails:
```
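import gzip
import io
import tarfile

c = io.BytesIO()
with tarfile.open(mode='w', fileobj=c) as tar:
  for textfile in ['1.txt.gz', '2.txt.gz']:
    with gzip.open(textfile) as f:
      # gettarinfo() takes the size from fstat, i.e. the compressed size
      tarinfo = tar.gettarinfo(fileobj=f)
      # addfile() copies only tarinfo.size bytes of the decompressed data
      tar.addfile(tarinfo=tarinfo, fileobj=f)
# read the buffer only after the archive has been closed
data = c.getvalue()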
```

This version, in contrast, determines the correct file size and writes the complete files to the tar:
```
import gzip
import io
import tarfile

c = io.BytesIO()
with tarfile.open(mode='w', fileobj=c) as tar:
  for textfile in ['1.txt.gz', '2.txt.gz']:
    with gzip.open(textfile) as f:
      # decompress everything first so the real size is known
      buff = f.read()
      tarinfo = tarfile.TarInfo(name=f.name)
      tarinfo.size = len(buff)
      tar.addfile(tarinfo=tarinfo, fileobj=io.BytesIO(buff))
# read the buffer only after the archive has been closed
data = c.getvalue()
```
History
| Date | User | Action | Args |
|---|---|---|---|
| 2014-09-23 08:49:52 | bartolsthoorn | set | recipients: + bartolsthoorn |
| 2014-09-23 08:49:52 | bartolsthoorn | set | messageid: <1411462192.65.0.458428981621.issue22468@psf.upfronthosting.co.za> |
| 2014-09-23 08:49:52 | bartolsthoorn | link | issue22468 messages |
| 2014-09-23 08:49:51 | bartolsthoorn | create | |