
Title: tarfile.TarFile.getmembers misses some entries
Type: behavior
Stage: resolved
Components:
Versions: Python 3.2, Python 3.3
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: lars.gustaebel
Nosy List: bins, lars.gustaebel, python-dev
Priority: normal
Keywords:

Created on 2011-10-12 12:54 by bins, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg145393 - Author: Sebastien Binet (bins) Date: 2011-10-12 12:54
Hi there,

It seems tarfile in Python 3.2.2 (as installed on Arch Linux, though I don't see any additional patch applied on top of the vanilla sources) has trouble returning the complete contents of a tarball.

$ wget

$ md5sum boost_1_44_0.tar.gz 
085fce4ff2089375105d72475d730e15  boost_1_44_0.tar.gz

$ python --version
Python 3.2.2

$ python2 --version
Python 2.7.2

$ python ./
>>> 8145

$ python2 ./ 
>>> 33635

where the script is:
import tarfile
o = tarfile.open("boost_1_44_0.tar.gz")
print(">>> %s" % len(o.getmembers()))
## EOF ##

Is this a known bug?

(This of course prevents TarFile.extractall from being useful with Python 3...)

msg145447 - Author: Sebastien Binet (bins) Date: 2011-10-13 08:28
One interesting additional piece of information: if I un-tar that file and re-tar it without gzip compression, getmembers gives the right answer.

msg145503 - Author: Roundup Robot (python-dev) (Python triager) Date: 2011-10-14 10:54
New changeset 341008eab87d by Lars Gustäbel in branch '3.2':
Issue #13158: Fix decoding and encoding of base-256 number fields in tarfile.

New changeset 158430b2b552 by Lars Gustäbel in branch 'default':
Merge with 3.2: Issue #13158: Fix decoding and encoding of base-256 number fields in tarfile.
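
The changeset covers both encoding and decoding, so a round-trip check is one way to exercise it without downloading the Boost archive. The sketch below is a stand-in under stated assumptions, not the test from the changeset: it builds a small gzip-compressed archive in memory with `tarfile.GNU_FORMAT` and gives the middle member a uid of `8 ** 7`, which no longer fits in the 7 octal digits of the uid field and should therefore be written as a base-256 field. The helper names `build_archive` and `count_members` and the uid values are made up for illustration.

```python
import io
import tarfile


def build_archive(uids):
    """Return the bytes of a gzip-compressed GNU tar with one small file per uid."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz", format=tarfile.GNU_FORMAT) as tar:
        for i, uid in enumerate(uids):
            data = b"hello"
            info = tarfile.TarInfo(name="file%d.txt" % i)
            info.size = len(data)
            # Assumption: a uid of 8 ** 7 or more cannot be stored as
            # 7 octal digits, so GNU format falls back to base-256.
            info.uid = uid
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def count_members(raw):
    """Count the members that tarfile reports for an in-memory archive."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        return len(tar.getmembers())


if __name__ == "__main__":
    uids = [1000, 8 ** 7, 1000]  # only the middle entry needs base-256
    raw = build_archive(uids)
    print(">>> %s of %s members listed" % (count_members(raw), len(uids)))
```

If a reader gives up at the base-256 header, the reported count comes out lower than the number of uids passed in, which is the same symptom as in the original report.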

msg145504 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2011-10-14 10:58
Thanks for the report. There was a problem decoding a special and rare kind of header field in the archive. The format of the archive is of very bad quality BTW ;-)
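
For readers unfamiliar with this kind of header field, here is a minimal sketch of how a numeric tar header field can be decoded. It assumes the GNU tar convention in which a leading byte of 0x80 (or 0xFF for negative values) marks a big-endian base-256 number, while every other field is ASCII octal. The function name `decode_number_field` is hypothetical and this is not the code from the changeset.

```python
def decode_number_field(field):
    """Decode one numeric tar header field given as bytes (illustrative only)."""
    if not field:
        raise ValueError("empty number field")
    # Indexing bytes yields an int in Python 3, so compare against ints.
    first = field[0]
    if first in (0x80, 0xFF):
        # Base-256: the remaining bytes form a big-endian integer.
        value = int.from_bytes(field[1:], "big")
        if first == 0xFF:
            # A 0xFF marker means a negative number in two's complement.
            value -= 256 ** (len(field) - 1)
        return value
    # Classic encoding: ASCII octal digits, terminated by NUL and/or spaces.
    text = field.split(b"\x00", 1)[0].decode("ascii").strip()
    return int(text or "0", 8)


if __name__ == "__main__":
    print(decode_number_field(b"0000644\x00"))                      # octal field
    print(decode_number_field(b"\x80" + (8 ** 7).to_bytes(7, "big")))  # base-256 field
```

A decoder that only handles the octal branch would reject the second field as an invalid header, which is one plausible way for a member listing to end early.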

msg145505 - Author: Sebastien Binet (bins) Date: 2011-10-14 11:05
> The format of the archive is of very bad quality BTW ;-)
well, that's C++ :P

Date                 User            Action  Args
2022-04-11 14:57:22  admin           set     github: 57367
2011-10-14 11:05:30  bins            set     messages: + msg145505
2011-10-14 10:58:22  lars.gustaebel  set     status: open -> closed; resolution: fixed; messages: + msg145504; stage: resolved
2011-10-14 10:54:47  python-dev      set     nosy: + python-dev; messages: + msg145503
2011-10-13 08:28:51  bins            set     messages: + msg145447
2011-10-13 08:21:33  lars.gustaebel  set     assignee: lars.gustaebel; nosy: + lars.gustaebel; versions: + Python 3.3
2011-10-12 12:54:32  bins            create