
classification
Title: tarfile creates output that appears to omit files
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: mcr314, zach.ware
Priority: normal Keywords:

Created on 2020-06-08 21:15 by mcr314, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (2)
msg371045 - Author: Michael Richardson (mcr314) Date: 2020-06-08 21:15
The simplest tar-copy program produces output that GNU tar, bsdtar, and even Emacs tar-mode are unable to process correctly.
The resulting tar file appears to be missing files, but examination of the raw output suggests they may be present and merely corrupt.
GNU tar actually complains while reading the file.
A test case is available at https://github.com/mcr/python3-tar-copy-failure

Here is the stupid code to reproduce it:

import tarfile
out = tarfile.open(name="./t2.tar", mode="w", format=tarfile.PAX_FORMAT)
with tarfile.open("./t1.tar") as tar:
    for file in tar.getmembers():
        print (file.name)
        out.addfile(file)
out.close()

This has been confirmed on Python 3.6.9 (Ubuntu 18.04 LTS) and Python 3.7.3 (Devuan Beowulf).  It seems to omit different files on 32-bit and 64-bit systems.
msg371047 - Author: Zachary Ware (zach.ware) (Python committer) Date: 2020-06-08 21:27
Note that `TarFile.getmembers()` is documented to return `TarInfo` objects, which are documented as explicitly *not* including file data.  Try replacing `out.addfile(file)` with `out.addfile(file, tar.extractfile(file))`.
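
For reference, the suggested fix can be sketched as follows. When `addfile()` is called without a file object, only the header is written, even though that header still records the member's original size; a reader then skips that many bytes and consumes the following headers as if they were data, which would explain why files appear to be missing or corrupt. The helper names `copy_tar` and `demo`, the sample file names, and the temporary paths below are illustrative assumptions, not part of the original report.

```python
import io
import os
import tarfile
import tempfile


def copy_tar(src_path, dst_path):
    """Copy every member of src_path into dst_path, including file data."""
    with tarfile.open(src_path) as src, \
         tarfile.open(dst_path, mode="w", format=tarfile.PAX_FORMAT) as dst:
        for member in src.getmembers():
            # Only regular files carry data; directories, links and
            # device nodes are copied as header-only entries.
            fileobj = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, fileobj)


def demo():
    """Build a small archive, copy it, and return the copied contents."""
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "t1.tar")
        dst_path = os.path.join(tmp, "t2.tar")
        # Hypothetical sample members, created only for this sketch.
        payloads = {"a.txt": b"hello\n", "b.txt": b"x" * 2000}
        with tarfile.open(src_path, mode="w") as tar:
            for name, data in payloads.items():
                info = tarfile.TarInfo(name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
        copy_tar(src_path, dst_path)
        with tarfile.open(dst_path) as tar:
            return {m.name: tar.extractfile(m).read()
                    for m in tar.getmembers()}
```

Calling `copy_tar("./t1.tar", "./t2.tar")` would be the equivalent of the reproducer above, with the file data carried across. Both archives are opened in `with` blocks so the output is always closed, which writes the end-of-archive blocks.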
History
Date                 User       Action  Args
2022-04-11 14:59:32  admin      set     github: 85091
2020-06-15 19:11:24  zach.ware  set     status: open -> closed; resolution: not a bug; stage: resolved
2020-06-08 21:27:31  zach.ware  set     nosy: + zach.ware; messages: + msg371047
2020-06-08 21:15:26  mcr314     set     versions: + Python 3.7
2020-06-08 21:15:00  mcr314     create