Author pombredanne
Recipients lars.gustaebel, pombredanne
Date 2015-06-26.09:18:56
Message-id <1435310337.24.0.0720447050888.issue24514@psf.upfronthosting.co.za>
In-reply-to
Content
Calling tarfile.open on this archive fails with "ReadError: invalid header": http://archive.apache.org/dist/commons/logging/source/commons-logging-1.1.2-src.tar.gz

After some investigation: the file extracts fine with GNU tar and bsdtar, and the gzip compression is not the issue. If I gunzip the tar.gz to a plain tar and call tarfile on that, the problem is the same.
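To rule out gzip as described above, the archive can be decompressed first and handed to tarfile as a plain tar. This is a minimal sketch, not the exact commands I ran; the helper name `list_members_without_gzip` and the local file name in the usage comment are made up for illustration.

```python
import gzip
import io
import tarfile


def list_members_without_gzip(gz_path):
    """Decompress a .tar.gz by hand, then open the result as an uncompressed tar."""
    # Step 1: strip the gzip layer ourselves, so tarfile never sees it.
    with gzip.open(gz_path, "rb") as f:
        raw_tar = f.read()
    # Step 2: mode "r:" forces tarfile to treat the data as a plain,
    # uncompressed tar stream (no transparent compression detection).
    with tarfile.open(fileobj=io.BytesIO(raw_tar), mode="r:") as tf:
        return tf.getnames()


# Hypothetical usage, assuming the archive was downloaded to the current directory:
#     list_members_without_gzip("commons-logging-1.1.2-src.tar.gz")
```

If this still raises tarfile.ReadError on the affected versions, the compression layer is not involved and the problem is in the tar headers themselves.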

The archive was most likely created on Windows (based on the `file` command output) using some Java tools, per http://commons.apache.org/proper/commons-logging/building.html, from these original files: http://svn.apache.org/repos/asf/commons/proper/logging/tags/LOGGING_1_1_2/ ... That's all I could find out.


The tracebacks differ slightly between 2.7 and 3.4, but both end in the same ReadError.
I have verified the problem on 64-bit Linux with Python 2.7 and 3.4, and on Windows with Python 2.7.

On 2.7:

>>> TarFile.taropen(name)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/tarfile.py", line 1705, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python2.7/tarfile.py", line 1574, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python2.7/tarfile.py", line 2335, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header


On 3.4:

>>> TarFile.taropen(name)
Traceback (most recent call last):
  File "/usr/lib/python3.4/tarfile.py", line 180, in nti
    n = int(nts(s, "ascii", "strict") or "0", 8)
ValueError: invalid literal for int() with base 8: '       '

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.4/tarfile.py", line 2248, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.4/tarfile.py", line 1083, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/usr/lib/python3.4/tarfile.py", line 1032, in frombuf
    obj.uid = nti(buf[108:116])
  File "/usr/lib/python3.4/tarfile.py", line 182, in nti
    raise InvalidHeaderError("invalid header")
tarfile.InvalidHeaderError: invalid header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/tarfile.py", line 1595, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.4/tarfile.py", line 1469, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python3.4/tarfile.py", line 2260, in next
    raise ReadError(str(e))
tarfile.ReadError: invalid header
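
The 3.4 traceback shows nti() choking on buf[108:116], the uid field, which holds only spaces. A minimal sketch for checking which numeric fields of the first 512-byte header block are whitespace-only, without going through tarfile at all. The helper names `blank_numeric_fields` and `first_header` are made up for illustration; the offsets assume the standard ustar header layout, and the input must be the uncompressed tar.

```python
# Byte ranges of the octal numeric fields in a 512-byte tar header
# (standard ustar layout; uid is the 108:116 slice seen in the traceback).
NUMERIC_FIELDS = {
    "mode": (100, 108),
    "uid": (108, 116),
    "gid": (116, 124),
    "size": (124, 136),
    "mtime": (136, 148),
    "chksum": (148, 156),
}


def blank_numeric_fields(header):
    """Return the names of numeric fields that contain only whitespace."""
    blank = []
    for name, (start, end) in NUMERIC_FIELDS.items():
        # Keep only the part before the first NUL terminator, if any.
        text = header[start:end].split(b"\0", 1)[0]
        # Non-empty but nothing left after stripping: spaces instead of digits.
        if text and not text.strip():
            blank.append(name)
    return blank


def first_header(path):
    """Read the first 512-byte header block of an uncompressed tar file."""
    with open(path, "rb") as f:
        return f.read(512)


# Hypothetical usage, assuming the gunzipped archive is in the current directory:
#     blank_numeric_fields(first_header("commons-logging-1.1.2-src.tar"))
```

If "uid" shows up in the result, that would match the traceback: int() is being handed a string of spaces where octal digits are expected.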