classification
Title: tarfile.open(fileobj=f) and bad metadata of the first file within the archive
Type: behavior
Stage:
Components: Extension Modules
Versions: Python 2.5
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: lars.gustaebel
Nosy List: GeorgeNotaras, lars.gustaebel
Priority: normal
Keywords:

Created on 2007-11-30 23:00 by GeorgeNotaras, last changed 2007-12-02 01:39 by GeorgeNotaras. This issue is now closed.

Messages (3)
msg58026 - Author: George Notaras (GeorgeNotaras) Date: 2007-11-30 23:00
Assume the following situation:
- a healthy and uncompressed tar file: a.tar
- the metadata of the first and second files within the archive starts at
positions 0 and 756 (realistic example values)

I damage 200 bytes of the first archived file's metadata (which lies
within byte range 0-500):

f = open("a.tar", "rb+")
f.seek(100)
f.write("0"*200)

Now, I seek to the start of the 2nd archived file's metadata:

f.seek(756)

Then I try to open the tar archive with tarfile.open(), passing it the
same file object:

import tarfile
f_tar = tarfile.open(fileobj=f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "tarfile.py", line 1143, in open
    raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully

Wouldn't the expected behaviour be to successfully open the tar archive
at offset 756?

It seems that tarfile.open(fileobj=f) seeks to position 0 of the
file object f, fails to read the first archived file's metadata, and
raises an exception.
msg58069 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2007-12-01 21:15
I fixed this in the trunk (r59260) and release25-maint branch (r59261).
Thanks for the report. If you cannot wait for the next release, I
recommend you use mode "r|" as a workaround.
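
A minimal sketch of the "r|" workaround, written for Python 3 rather than the Python 2.5 this report targets. The in-memory archive, the member names, and the offset 1024 are assumptions made for illustration: with a 5-byte first member, the second header sits at 512 (header) + 512 (padded data) = 1024.

```python
import io
import tarfile

# Build a small uncompressed archive in memory (hypothetical member names).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for name, data in (("first.txt", b"hello"), ("second.txt", b"world")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

# Damage 200 bytes inside the first member's header, as in the report.
buf.seek(100)
buf.write(b"0" * 200)

# Position the file object at the second member's header.
buf.seek(1024)

# Stream mode ("r|") reads sequentially from the current position
# and never seeks back to the start of the file object.
with tarfile.open(fileobj=buf, mode="r|") as tar:
    names = tar.getnames()

print(names)
```

Because the damaged first header is never read, only the second member should be listed.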

BTW, 756 is by no means a realistic example value for the position of
the second member. A header block must start on a 512-byte boundary.
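
The 512-byte alignment can be checked with a short sketch. It assumes Python 3, a plain ustar archive without extended headers, and hypothetical member names; next_header_offset is a helper introduced here for illustration, not part of tarfile.

```python
import io
import tarfile

BLOCK = 512  # tar block size in bytes


def next_header_offset(header_offset, data_size):
    """Return where the next header starts, assuming one header block
    followed by the member's data padded to a multiple of BLOCK."""
    data_blocks = (data_size + BLOCK - 1) // BLOCK
    return header_offset + BLOCK + data_blocks * BLOCK


# The first member holds 244 bytes, so its data ends at byte 756,
# but the following header is pushed to the next block boundary.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    for name, data in (("first.txt", b"x" * 244), ("second.txt", b"y")):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

# Read the archive back and collect the header offsets tarfile recorded.
buf.seek(0)
with tarfile.open(fileobj=buf) as tar:
    offsets = [member.offset for member in tar.getmembers()]

print(offsets)                     # [0, 1024]
print(next_header_offset(0, 244))  # 1024
```

Under these assumptions the second header lands at 1024, not 756, which matches the block-boundary rule stated above.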
msg58076 - Author: George Notaras (GeorgeNotaras) Date: 2007-12-02 01:39
Thanks for the quick fix and the workaround.

You are right about position 756. I hadn't spent enough time studying
the "ustar" format.
History
Date                 User            Action  Args
2007-12-02 01:39:45  GeorgeNotaras   set     messages: + msg58076
2007-12-01 21:15:56  lars.gustaebel  set     status: open -> closed; resolution: fixed; messages: + msg58069
2007-12-01 10:16:07  lars.gustaebel  set     assignee: lars.gustaebel; nosy: + lars.gustaebel
2007-11-30 23:00:06  GeorgeNotaras   create