classification
Title: TarFile.extractfile fails to extract targets of top-level relative symlinks
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: lars.gustaebel Nosy List: Matthew.Miller, eric.araujo, lars.gustaebel, python-dev
Priority: normal Keywords: patch

Created on 2012-02-29 15:50 by Matthew.Miller, last changed 2012-04-24 20:46 by lars.gustaebel. This issue is now closed.

Files
File name Uploaded Description
issue14160.diff lars.gustaebel, 2012-03-05 09:55
Messages (5)
msg154643 - Author: Matthew Miller (Matthew.Miller) Date: 2012-02-29 15:50
I have a tarfile with relative paths. The tail of tar tvf looks like this:

-rw-r--r-T nobody/nobody  1356 2012-02-28 19:25 s/772
-rw-r--r-- nobody/nobody  1304 2012-02-28 19:25 s/773
-rw-r--r-- nobody/nobody  1304 2012-02-28 19:25 s/774
-rw-r--r-- nobody/nobody  1304 2012-02-28 19:25 s/775
lrw-r--r-- nobody/nobody     0 2012-02-28 19:25 final -> s/772

The docs say:

  TarFile.extractfile(member)

    Extract a member from the archive as a file object. member 
    may be a filename or a TarInfo object. If member is a regular
    file, a file-like object is returned. If member is a link, a 
    file-like object is constructed from the link’s target.

However, what I'm getting is this:

  KeyError: "linkname '/s/772' not found"


It's appending a "/". Why?

Well, in tarfile.py:

        if tarinfo.issym():
            # Always search the entire archive.
            linkname = os.path.dirname(tarinfo.name) + "/" + tarinfo.linkname
            limit = None


Here, os.path.dirname(tarinfo.name) returns '' because the symlink sits at the top level of the archive. The "/" is then appended to that empty string, giving '/s/772' instead of 's/772'.
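To make the failure concrete, here is a minimal sketch that reproduces the concatenation outside of tarfile, using the member name and link target from the listing above. The variable names are illustrative only and are not taken from tarfile.py.

```python
import os.path

# Values from the archive listing: "final -> s/772"
name = "final"        # the symlink member, stored at the top level
linkname = "s/772"    # the symlink target, relative to the symlink's directory

# Same expression as in the quoted tarfile.py snippet.
# os.path.dirname("final") is the empty string, so the result starts with "/".
buggy = os.path.dirname(name) + "/" + linkname

# For a symlink inside a subdirectory the same expression behaves as intended.
nested = os.path.dirname("dir/link") + "/" + "target"

print(repr(buggy))
print(repr(nested))
```

Only the top-level case gains the leading slash, which would explain why the lookup fails for 'final' but not for symlinks inside a subdirectory.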

One solution would be:

   linkname = os.path.join(os.path.dirname(tarinfo.name), tarinfo.linkname)

That is the right logic, but I don't think it works on platforms where os.sep is not "/", since tar member names use "/" in any case.
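One way to keep the join logic while always using "/" is posixpath, which applies POSIX path rules on every platform. This is only a sketch of the idea, not necessarily what any eventual patch does; resolve_linkname is a hypothetical helper name.

```python
import posixpath


def resolve_linkname(name, linkname):
    """Hypothetical helper: build the archive path a symlink member points to.

    posixpath always uses "/" as the separator, regardless of os.sep,
    and joining onto an empty directory does not add a leading slash.
    """
    return posixpath.join(posixpath.dirname(name), linkname)


top_level = resolve_linkname("final", "s/772")
nested = resolve_linkname("dir/link", "target")
print(top_level, nested)
```

Note that this does not normalize ".." components or handle absolute link targets specially; it only addresses the separator and empty-dirname problems described above.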

I'm filing this against 2.7, but the same issue exists in 3.2.

A work-around in end-user code is to call TarFile.getmember(filename), check whether the result issym(), and if so replace it with TarFile.getmember(tarinfo.linkname). But since the extractfile function is documented as following symlinks, that should not be necessary.
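The work-around could look roughly like the sketch below. It builds a tiny in-memory archive shaped like the one in this report so it can be run on its own. open_following_symlink is a hypothetical helper, and it assumes the symlink is at the top level of the archive so that its linkname is already a valid member name.

```python
import io
import tarfile


def open_following_symlink(tar, name):
    """Hypothetical helper: resolve a top-level symlink by hand, then extract."""
    member = tar.getmember(name)
    if member.issym():
        # For a top-level symlink the link target is itself a member name.
        member = tar.getmember(member.linkname)
    return tar.extractfile(member)


# Build an archive with a regular file "s/772" and a symlink "final -> s/772".
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"hello"
    info = tarfile.TarInfo("s/772")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

    link = tarfile.TarInfo("final")
    link.type = tarfile.SYMTYPE
    link.linkname = "s/772"
    tar.addfile(link)

# Read it back and open the symlink's target through the helper.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    content = open_following_symlink(tar, "final").read()

print(content)
```

This follows a single level of indirection only; chains of symlinks or symlinks inside subdirectories would need more care.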

This bug isn't commonly encountered because by convention tarfiles usually contain a subdirectory and everything goes in that. But we should do the right thing.
msg154938 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2012-03-05 09:55
Thanks for the report. Attached is a patch (against 3.2) that is supposed to fix the problem.
msg159193 - Author: Roundup Robot (python-dev) Date: 2012-04-24 19:09
New changeset 0adf4fd8df83 by Lars Gustäbel in branch '3.2':
Issue #14160: TarFile.extractfile() failed to resolve symbolic links
http://hg.python.org/cpython/rev/0adf4fd8df83

New changeset 38df99776901 by Lars Gustäbel in branch 'default':
Merge with 3.2: Issue #14160: TarFile.extractfile() failed to resolve symbolic
http://hg.python.org/cpython/rev/38df99776901
msg159205 - Author: Roundup Robot (python-dev) Date: 2012-04-24 20:42
New changeset aff14bea5596 by Lars Gustäbel in branch '2.7':
Issue #14160: TarFile.extractfile() failed to resolve symbolic links when
http://hg.python.org/cpython/rev/aff14bea5596
msg159206 - Author: Lars Gustäbel (lars.gustaebel) * (Python committer) Date: 2012-04-24 20:46
Fixed. Thanks for the report.
History
Date User Action Args
2012-04-24 20:46:40 lars.gustaebel set status: open -> closed; resolution: fixed; messages: + msg159206
2012-04-24 20:42:45 python-dev set messages: + msg159205
2012-04-24 19:09:35 python-dev set nosy: + python-dev; messages: + msg159193
2012-03-05 09:56:06 lars.gustaebel set files: + issue14160.diff; keywords: + patch; messages: + msg154938; stage: patch review
2012-03-05 08:41:26 lars.gustaebel set assignee: lars.gustaebel
2012-03-01 23:44:22 Matthew.Miller set type: behavior
2012-03-01 07:43:08 eric.araujo set nosy: + lars.gustaebel, eric.araujo; title: Tarfile.extractfile fails to extract targets of top-level relative symlinks -> TarFile.extractfile fails to extract targets of top-level relative symlinks; versions: + Python 3.2, Python 3.3
2012-02-29 15:50:59 Matthew.Miller create