Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tarfile.extractall fails to overwrite symlinks #54970

Closed
srid mannequin opened this issue Dec 23, 2010 · 14 comments
Closed

tarfile.extractall fails to overwrite symlinks #54970

srid mannequin opened this issue Dec 23, 2010 · 14 comments
Assignees
Labels
3.8 only security fixes performance Performance or resource usage stdlib Python modules in the Lib dir topic-unicode

Comments

@srid
Copy link
Mannequin

srid mannequin commented Dec 23, 2010

BPO 10761
Nosy @gustaebel, @orsenthil, @pitrou, @vstinner, @ezio-melotti

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/orsenthil'
closed_at = <Date 2011-04-30.01:17:03.978>
created_at = <Date 2010-12-23.00:17:11.008>
labels = ['3.8', 'library', 'expert-unicode', 'performance']
title = 'tarfile.extractall fails to overwrite symlinks'
updated_at = <Date 2017-11-19.13:33:04.923>
user = 'https://bugs.python.org/srid'

bugs.python.org fields:

activity = <Date 2017-11-19.13:33:04.923>
actor = 'Chris Albright'
assignee = 'orsenthil'
closed = True
closed_date = <Date 2011-04-30.01:17:03.978>
closer = 'orsenthil'
components = ['Library (Lib)', 'Unicode']
creation = <Date 2010-12-23.00:17:11.008>
creator = 'srid'
dependencies = []
files = []
hgrepos = []
issue_num = 10761
keywords = []
message_count = 14.0
messages = ['124523', '134502', '134519', '134555', '134657', '134658', '134782', '134816', '134817', '134828', '135924', '135925', '135926', '135936']
nosy_count = 9.0
nosy_names = ['lars.gustaebel', 'orsenthil', 'pitrou', 'vstinner', 'ezio.melotti', 'srid', 'santoso.wijaya', 'python-dev', 'Scott.Leerssen']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue10761'
versions = ['Python 2.7', 'Python 3.8']

@srid
Copy link
Mannequin Author

srid mannequin commented Dec 23, 2010

tarfile.extractall overwrites normal files and directories, yet it fails to overwrite symlinks:

[..]
tf.extractall()
File "/opt/ActivePython-2.7/lib/python2.7/tarfile.py", line 2046, in extractall
self.extract(tarinfo, path)
File "/opt/ActivePython-2.7/lib/python2.7/tarfile.py", line 2083, in extract
self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
File "/opt/ActivePython-2.7/lib/python2.7/tarfile.py", line 2167, in _extract_member
self.makelink(tarinfo, targetpath)
File "/opt/ActivePython-2.7/lib/python2.7/tarfile.py", line 2243, in makelink
os.symlink(tarinfo.linkname, targetpath)
OSError: [Errno 17] File exists

To reproduce, use a .tar.gz file containing relative (i.e., in the same directory) symlinks.

Perhaps it should delete targetpath before attempting to create a symlink.

@srid srid mannequin added type-bug An unexpected behavior, bug, or error stdlib Python modules in the Lib dir labels Dec 23, 2010
@gustaebel gustaebel mannequin self-assigned this Jan 4, 2011
@ScottLeerssen
Copy link
Mannequin

ScottLeerssen mannequin commented Apr 26, 2011

I just hit the same issue. This seems to work:

Modified:Lib/tarfile.py
===================================================================

---Lib/tarfile.py	2011-04-26 20:36:33 UTC (rev 49502)
+++Lib/tarfile.py	2011-04-26 21:01:24 UTC (rev 49503)
@@ -2239,6 +2239,8 @@
        if hasattr(os, "symlink") and hasattr(os, "link"):
            # For systems that support symbolic and hard links.
            if tarinfo.issym():
+                if os.path.exists(targetpath):
+                    os.unlink(targetpath)
                os.symlink(tarinfo.linkname, targetpath)
            else:
                # See extract().

@orsenthil
Copy link
Member

Scott- which platform did you observe this? I can't reproduce this on the 2.7 code on Linux.

@ScottLeerssen
Copy link
Mannequin

ScottLeerssen mannequin commented Apr 27, 2011

It happens on RedHat and CentOS 5, but I suspect it would happen on any Unix variant. Here's a test that exacerbates the issue:

# test_tarfile.py

# Description:
# tests for python tarfile module

# $Id$

import os
import shutil
import tarfile
import tempfile

def test_tar_overwrites_symlink():
    d = tempfile.mkdtemp()
    try:
        new_dir_name = os.path.join(d, 'newdir')
        archive_name = os.path.join(d, 'newdir.tar')
        os.mkdir(new_dir_name)
        source_file_name = os.path.join(new_dir_name, 'source')
        target_link_name = os.path.join(new_dir_name, 'symlink')
        with open(source_file_name, 'w') as f:
            f.write('something\n')
        os.symlink(source_file_name, target_link_name)
        with tarfile.open(archive_name, 'w') as tar:
            for f in [source_file_name, target_link_name]:
                tar.add(f, arcname=os.path.basename(f))
        # this should not raise OSError: [Errno 17] File exists
        with tarfile.open(archive_name, 'r') as tar:
            tar.extractall(path=new_dir_name)
    finally:
        shutil.rmtree(d)

@python-dev
Copy link
Mannequin

python-dev mannequin commented Apr 28, 2011

New changeset 0c8bc3a0130a by Senthil Kumaran in branch '2.7':
Fix closes bpo-10761: tarfile.extractall failure when symlinked files are present.
http://hg.python.org/cpython/rev/0c8bc3a0130a

@python-dev python-dev mannequin closed this as completed Apr 28, 2011
@orsenthil
Copy link
Member

I had tried/tested against 3.x branch and did not find the problem. Later realized that it was only again 2.7. Pushed in the changes and the tests. I shall the tests only in 3.x codeline.

@pitrou
Copy link
Member

pitrou commented Apr 29, 2011

Senthil, Windows buildbots on 3.1, 3.2 and 3.x show test failures.
See e.g. http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%203.1/builds/1780/steps/test/logs/stdio

@pitrou pitrou reopened this Apr 29, 2011
@pitrou pitrou assigned orsenthil and unassigned gustaebel Apr 29, 2011
@python-dev
Copy link
Mannequin

python-dev mannequin commented Apr 29, 2011

New changeset 2665a28643b8 by Senthil Kumaran in branch 'default':
Wrap the correct test with the skip decorator for the bpo-10761.
http://hg.python.org/cpython/rev/2665a28643b8

@orsenthil
Copy link
Member

I had wrapped skipUnless decorator for the wrong test (test_extractall
instead of test_extractall_symlinks) in the 3.x code. Corrected it
and waiting for next bb reports.

Thank you.

@orsenthil
Copy link
Member

buildbots are green again.

@ScottLeerssen
Copy link
Mannequin

ScottLeerssen mannequin commented May 13, 2011

It turns out that my fix was at least one byte short of complete. If the target pathname is a broken symlink, os.path.exists() returns False, and the OSError is raised. I should have used os.path.lexists(). Also, I believe the same problem exists for the hardlink case a few lines below.

@ScottLeerssen
Copy link
Mannequin

ScottLeerssen mannequin commented May 13, 2011

here is a diff of a better fix based on the previous patch:

Index: tarfile.py
===================================================================

--- tarfile.py  (revision 49758)
+++ tarfile.py  (working copy)
@@ -2239,12 +2239,14 @@
         if hasattr(os, "symlink") and hasattr(os, "link"):
             # For systems that support symbolic and hard links.
             if tarinfo.issym():
-                if os.path.exists(targetpath):
+                if os.path.lexists(targetpath):
                     os.unlink(targetpath)
                 os.symlink(tarinfo.linkname, targetpath)
             else:
                 # See extract().
                 if os.path.exists(tarinfo._link_target):
+                    if os.path.lexists(targetpath):
+                        os.unlink(targetpath)
                     os.link(tarinfo._link_target, targetpath)
                 else:
                     self._extract_member(self._find_link_target(tarinfo), targetpath)

@ScottLeerssen
Copy link
Mannequin

ScottLeerssen mannequin commented May 13, 2011

tests that verify the bug/fix:

    def test_extractall_broken_symlinks(self):
        # Test if extractall works properly when tarfile contains symlinks
        tempdir = os.path.join(TEMPDIR, "testsymlinks")
        temparchive = os.path.join(TEMPDIR, "testsymlinks.tar")
        os.mkdir(tempdir)
        try:
            source_file = os.path.join(tempdir,'source')
            target_file = os.path.join(tempdir,'symlink')
            with open(source_file,'w') as f:
                f.write('something\n')
            os.symlink(source_file, target_file)
            tar = tarfile.open(temparchive,'w')
            tar.add(target_file, arcname=os.path.basename(target_file))
            tar.close()
            # remove the real file
            os.unlink(source_file)
            # Let's extract it to the location which contains the symlink
            tar = tarfile.open(temparchive,'r')
            # this should not raise OSError: [Errno 17] File exists
            try:
                tar.extractall(path=tempdir)
            except OSError:
                self.fail("extractall failed with broken symlinked files")
            finally:
                tar.close()
        finally:
            os.unlink(temparchive)
            shutil.rmtree(TEMPDIR)

    def test_extractall_hardlinks(self):
        # Test if extractall works properly when tarfile contains symlinks
        TEMPDIR = tempfile.mkdtemp()
        tempdir = os.path.join(TEMPDIR, "testsymlinks")
        temparchive = os.path.join(TEMPDIR, "testsymlinks.tar")
        os.mkdir(tempdir)
        try:
            source_file = os.path.join(tempdir,'source')
            target_file = os.path.join(tempdir,'symlink')
            with open(source_file,'w') as f:
                f.write('something\n')
            os.link(source_file, target_file)
            tar = tarfile.open(temparchive,'w')
            tar.add(source_file, arcname=os.path.basename(source_file))
            tar.add(target_file, arcname=os.path.basename(target_file))
            tar.close()
            # Let's extract it to the location which contains the symlink
            tar = tarfile.open(temparchive,'r')
            # this should not raise OSError: [Errno 17] File exists
            try:
                tar.extractall(path=tempdir)
            except OSError:
                self.fail("extractall failed with linked files")
            finally:
                tar.close()
        finally:
            os.unlink(temparchive)
            shutil.rmtree(TEMPDIR)

@ScottLeerssen
Copy link
Mannequin

ScottLeerssen mannequin commented May 13, 2011

oops... I left some of my local edits in those tests. be sure to fix the TEMPDIR use if you add these into the tarfile tests.

@ChrisAlbright ChrisAlbright mannequin added topic-unicode 3.8 only security fixes performance Performance or resource usage and removed type-bug An unexpected behavior, bug, or error labels Nov 19, 2017
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.8 only security fixes performance Performance or resource usage stdlib Python modules in the Lib dir topic-unicode
Projects
None yet
Development

No branches or pull requests

2 participants