New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tarfile.extractall fails to overwrite symlinks #54970
Comments
tarfile.extractall overwrites normal files and directories, yet it fails to overwrite symlinks: [..] To reproduce, use a .tar.gz file containing relative (i.e., in the same directory) symlinks. Perhaps it should delete |
I just hit the same issue. This seems to work: Modified:Lib/tarfile.py ---Lib/tarfile.py 2011-04-26 20:36:33 UTC (rev 49502)
+++Lib/tarfile.py 2011-04-26 21:01:24 UTC (rev 49503)
@@ -2239,6 +2239,8 @@
if hasattr(os, "symlink") and hasattr(os, "link"):
# For systems that support symbolic and hard links.
if tarinfo.issym():
+ if os.path.exists(targetpath):
+ os.unlink(targetpath)
os.symlink(tarinfo.linkname, targetpath)
else:
# See extract(). |
Scott- which platform did you observe this? I can't reproduce this on the 2.7 code on Linux. |
It happens on RedHat and CentOS 5, but I suspect it would happen on any Unix variant. Here's a test that exacerbates the issue: # test_tarfile.py # Description: # import os
import shutil
import tarfile
import tempfile
def test_tar_overwrites_symlink():
d = tempfile.mkdtemp()
try:
new_dir_name = os.path.join(d, 'newdir')
archive_name = os.path.join(d, 'newdir.tar')
os.mkdir(new_dir_name)
source_file_name = os.path.join(new_dir_name, 'source')
target_link_name = os.path.join(new_dir_name, 'symlink')
with open(source_file_name, 'w') as f:
f.write('something\n')
os.symlink(source_file_name, target_link_name)
with tarfile.open(archive_name, 'w') as tar:
for f in [source_file_name, target_link_name]:
tar.add(f, arcname=os.path.basename(f))
# this should not raise OSError: [Errno 17] File exists
with tarfile.open(archive_name, 'r') as tar:
tar.extractall(path=new_dir_name)
finally:
shutil.rmtree(d) |
New changeset 0c8bc3a0130a by Senthil Kumaran in branch '2.7': |
I had tried/tested against 3.x branch and did not find the problem. Later realized that it was only again 2.7. Pushed in the changes and the tests. I shall the tests only in 3.x codeline. |
Senthil, Windows buildbots on 3.1, 3.2 and 3.x show test failures. |
New changeset 2665a28643b8 by Senthil Kumaran in branch 'default': |
I had wrapped skipUnless decorator for the wrong test (test_extractall Thank you. |
buildbots are green again. |
It turns out that my fix was at least one byte short of complete. If the target pathname is a broken symlink, os.path.exists() returns False, and the OSError is raised. I should have used os.path.lexists(). Also, I believe the same problem exists for the hardlink case a few lines below. |
here is a diff of a better fix based on the previous patch: Index: tarfile.py --- tarfile.py (revision 49758)
+++ tarfile.py (working copy)
@@ -2239,12 +2239,14 @@
if hasattr(os, "symlink") and hasattr(os, "link"):
# For systems that support symbolic and hard links.
if tarinfo.issym():
- if os.path.exists(targetpath):
+ if os.path.lexists(targetpath):
os.unlink(targetpath)
os.symlink(tarinfo.linkname, targetpath)
else:
# See extract().
if os.path.exists(tarinfo._link_target):
+ if os.path.lexists(targetpath):
+ os.unlink(targetpath)
os.link(tarinfo._link_target, targetpath)
else:
self._extract_member(self._find_link_target(tarinfo), targetpath) |
tests that verify the bug/fix: def test_extractall_broken_symlinks(self):
# Test if extractall works properly when tarfile contains symlinks
tempdir = os.path.join(TEMPDIR, "testsymlinks")
temparchive = os.path.join(TEMPDIR, "testsymlinks.tar")
os.mkdir(tempdir)
try:
source_file = os.path.join(tempdir,'source')
target_file = os.path.join(tempdir,'symlink')
with open(source_file,'w') as f:
f.write('something\n')
os.symlink(source_file, target_file)
tar = tarfile.open(temparchive,'w')
tar.add(target_file, arcname=os.path.basename(target_file))
tar.close()
# remove the real file
os.unlink(source_file)
# Let's extract it to the location which contains the symlink
tar = tarfile.open(temparchive,'r')
# this should not raise OSError: [Errno 17] File exists
try:
tar.extractall(path=tempdir)
except OSError:
self.fail("extractall failed with broken symlinked files")
finally:
tar.close()
finally:
os.unlink(temparchive)
shutil.rmtree(TEMPDIR)
def test_extractall_hardlinks(self):
# Test if extractall works properly when tarfile contains symlinks
TEMPDIR = tempfile.mkdtemp()
tempdir = os.path.join(TEMPDIR, "testsymlinks")
temparchive = os.path.join(TEMPDIR, "testsymlinks.tar")
os.mkdir(tempdir)
try:
source_file = os.path.join(tempdir,'source')
target_file = os.path.join(tempdir,'symlink')
with open(source_file,'w') as f:
f.write('something\n')
os.link(source_file, target_file)
tar = tarfile.open(temparchive,'w')
tar.add(source_file, arcname=os.path.basename(source_file))
tar.add(target_file, arcname=os.path.basename(target_file))
tar.close()
# Let's extract it to the location which contains the symlink
tar = tarfile.open(temparchive,'r')
# this should not raise OSError: [Errno 17] File exists
try:
tar.extractall(path=tempdir)
except OSError:
self.fail("extractall failed with linked files")
finally:
tar.close()
finally:
os.unlink(temparchive)
shutil.rmtree(TEMPDIR) |
oops... I left some of my local edits in those tests. be sure to fix the TEMPDIR use if you add these into the tarfile tests. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: