classification
Title: shutil.rmtree fails on symlink, after deleting contents
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: draghuram, georg.brandl, gvanrossum, ntkoopman
Priority: normal Keywords:

Created on 2007-12-20 08:53 by ntkoopman, last changed 2008-01-21 17:44 by draghuram. This issue is now closed.

Files
File name Uploaded Description Edit
rmtree.diff draghuram, 2007-12-20 17:25
Messages (11)
msg58864 - (view) Author: Tim Koopman (ntkoopman) Date: 2007-12-20 08:53
When using rmtree with a symlink to a directory as path, it will first
follow the symlink, (try to) remove all the contents of the source
directory and then raise the exception "OSError: [Errno 20] Not a
directory".

Expected behaviour: The function should only raise an exception.

If the current behaviour is indeed wanted, it should a least be documented.
msg58892 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-12-20 15:43
Please try to include stack trace in bug reports. I reproduced the error
on my Linux (SuSE 10). 

marvin:tmp$ ls -l dirlink testdir
lrwxrwxrwx 1 raghu users    7 2007-12-20 10:10 dirlink -> testdir/

testdir:
total 0
-rw-r--r-- 1 raghu users 0 2007-12-20 10:36 testfile

shutil.rmtree('dirlink') produces:

----------
Traceback (most recent call last):
  File "t.py", line 4, in <module>
    shutil.rmtree('dirlink')
  File "/localhome/raghu/localwork/cpython/python/Lib/shutil.py", line
194, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/localhome/raghu/localwork/cpython/python/Lib/shutil.py", line
192, in rmtree
    os.rmdir(path)
OSError: [Errno 20] Not a directory: 'dirlink'
----------

While we are removing the contents of the target directory as expected,
the final 'rmdir' fails on removing the symbolic link. For the sake of
consistency, we can explicitly remove the target directory and leave the
symbolic link intact. Is this what you are expecting?
msg58900 - (view) Author: Tim Koopman (ntkoopman) Date: 2007-12-20 16:40
> While we are removing the contents of the target directory as expected,

This is not what I expected at all. I expected the function to fail,
because the target was not a directory, just a symlink to a directory.
That, or behavior similar to the command "rm -rf" which only removes the
symlink, but not the contents.

The current behavior is in my opinion inconsistent; if the symlink is
treated as a normal directory, it should also get deleted.

As I said, if the current behavior is apparently expected, it should be
documented, because I can think of no program that follow symlinks while
deleting unless specifically instructed to.
msg58903 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-12-20 17:25
I agree with Tesiph, more useful behavior would be to raise an error
immediately because the argument is not a directory. If you wanted to
remove the think linked to, you could use rmtree("foo/.",
ignore_errors=True).
msg58904 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-12-20 17:25
I am ok with disallowing symlinks in rmtree() because if it were to be
allowed, the semantics are not clear. In addition, neither 'rmdir' nor
'rm -rf' delete the target directory. 

The attached patch would raise error if symbolic link is passed to
rmtree. Even though there is a parameter 'ignore_errors", I don't think
it should be used in suppressing symbolic link error. Comments?
msg58908 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2007-12-20 17:43
Thanks for the patch. I think it should raise IOError, not ValueError,
and it should use the onerror() handling used for all other errors.
Also, can you include an update for the docs in the Doc tree?
msg58918 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2007-12-20 19:45
Index: Lib/shutil.py
===================================================================
--- Lib/shutil.py       (revision 59581)
+++ Lib/shutil.py       (working copy)
@@ -156,6 +156,16 @@
     elif onerror is None:
         def onerror(*args):
             raise
+
+    try:
+        if os.path.islink(path):
+            if ignore_errors:
+                return
+            else:
+                raise IOError('path can not be symbolic link')
+    except IOError, err:
+        onerror(os.path.islink, path, sys.exc_info())
+
     names = []
     try:
         names = os.listdir(path)
-------------------

How does this look? The error handling is slightly different for this
case because it can not continue if 'ignore_errors' is True. I will
update the doc if the code change is ok.

On Dec 20, 2007 12:43 PM, Guido van Rossum <report@bugs.python.org> wrote:
>
> Guido van Rossum added the comment:
>
> Thanks for the patch. I think it should raise IOError, not ValueError,
> and it should use the onerror() handling used for all other errors.
> Also, can you include an update for the docs in the Doc tree?
>
>
> __________________________________
> Tracker <report@bugs.python.org>
> <http://bugs.python.org/issue1669>
> __________________________________
>
msg61298 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-01-20 14:18
Committed a variant of the last patch in r60139; it now raises OSError
like all other functions that are used in rmtree(). Added docs and tests.
msg61400 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-01-21 15:11
If rmtree() always returns in case of symbolic links (as it is
checked-in), wouldn't it be much simpler to totally do away with
'onerror' handling? I thought 'onerror' gives the user the option how to
proceed (which is true in other usages).
msg61410 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-01-21 16:47
Guido explicitly said "it should raise IOError, not ValueError,
and it should use the onerror() handling used for all other errors"
which makes sense for me too.
msg61428 - (view) Author: Raghuram Devarakonda (draghuram) (Python triager) Date: 2008-01-21 17:44
> and it should use the onerror() handling used for all other errors"
> which makes sense for me too.

Sure. I am ok with using 'onerror'. The point I am trying to make is
that there is slight difference between 'onerror' in this case and in
the three other places where it is called. In the case of symlink,
rmtree() returns even when 'onerror' doesn't raise exception. This is
not same in the other cases where rmtree() continues. I am just
wondering if this inconsistency is worth it and if it is, an explicit
mention in the doc will be nice.
History
Date User Action Args
2008-01-21 17:44:16draghuramsetmessages: + msg61428
2008-01-21 16:47:44georg.brandlsetmessages: + msg61410
2008-01-21 15:11:13draghuramsetmessages: + msg61400
2008-01-20 14:18:25georg.brandlsetstatus: open -> closed
nosy: + georg.brandl
resolution: fixed
messages: + msg61298
2007-12-20 19:45:45draghuramsetmessages: + msg58918
2007-12-20 17:43:40gvanrossumsetmessages: + msg58908
2007-12-20 17:25:50draghuramsetfiles: + rmtree.diff
messages: + msg58904
2007-12-20 17:25:12gvanrossumsetnosy: + gvanrossum
messages: + msg58903
2007-12-20 16:40:32ntkoopmansetmessages: + msg58900
2007-12-20 15:43:26draghuramsetnosy: + draghuram
messages: + msg58892
2007-12-20 08:53:39ntkoopmansettitle: shutils.rmtree fails on symlink, after deleting contents -> shutil.rmtree fails on symlink, after deleting contents
2007-12-20 08:53:21ntkoopmancreate