2012-09-03
The tarfile module silently truncates the list of entries when reading a tar file if it encounters an entry whose uid/gid field contains only spaces/NULs.  I got such a tarball from Java Maven/plexus-archiver.  I don't know whether it writes such fields deliberately, but doing so seems reasonable to me, especially since it provides the user/group names textually.
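
A minimal reproduction sketch that does not need Maven, assuming Python 3 and the standard ustar header layout (uid at bytes 108-115, gid at bytes 116-123, checksum at bytes 148-155). The helper names blank_uid_gid, build_archive and list_names are hypothetical, and this imitates the plexus-archiver output by blanking the fields by hand rather than using a real Maven tarball.

```python
import io
import tarfile


def blank_uid_gid(header):
    """Return a copy of a 512-byte tar header with uid/gid filled with spaces."""
    buf = bytearray(header)
    buf[108:124] = b" " * 16   # uid (108-115) and gid (116-123)
    buf[148:156] = b" " * 8    # checksum is computed with its own field as spaces
    chksum = sum(buf)
    buf[148:156] = ("%06o" % chksum).encode("ascii") + b"\0 "
    return bytes(buf)


def build_archive():
    """Build a three-member ustar archive whose second header has blank uid/gid."""
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in ("a.txt", "b.txt", "c.txt"):
            data = name.encode("ascii")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    raw = bytearray(out.getvalue())
    # Each member is one 512-byte header plus one 512-byte data block,
    # so the header of b.txt occupies bytes 1024-1535.
    raw[1024:1536] = blank_uid_gid(bytes(raw[1024:1536]))
    return bytes(raw)


def list_names(raw):
    """Return the member names tarfile reports for the given archive bytes."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        return tar.getnames()


if __name__ == "__main__":
    # Fewer than three names, with no exception raised, would indicate
    # the silent truncation described above.
    print(list_names(build_archive()))
```

The archive contains three members, so any shorter list returned without an error is the symptom in question.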

I'd like to see two fixes - a None/-1/0 value for the uid/gid and not silently swallowing HeaderErrors in (or at least documenting why it's being done).  0 would be consistent with the default value when writing, but None seems more honest.  -1 seems hard to defend.
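
To illustrate the first request, here is a rough sketch of a lenient field parser, assuming Python 3. parse_octal_field is a hypothetical helper, not part of tarfile, and the default argument stands in for whichever of None/-1/0 is chosen.

```python
def parse_octal_field(field, default=None):
    """Parse an octal tar header field, returning default if it is blank.

    Hypothetical helper: a field holding only spaces/NULs yields default
    instead of raising an error.
    """
    # Keep only the part before the first NUL, then drop padding spaces.
    text = field.split(b"\0", 1)[0].decode("ascii").strip()
    if not text:
        return default
    return int(text, 8)


print(parse_octal_field(b"0001750\0"))    # 1000
print(parse_octal_field(b"        "))     # None
print(parse_octal_field(b"\0" * 8, 0))    # 0
```

A field with non-octal content still raises ValueError here, so only the blank case is treated leniently.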

Only tested on silly Python versions (2.6, PyPy-1.8), sorry.  It's what I've got to hand, but going by the hg trunk, I think this issue applies to recent Python versions too.
