classification
Title: fwalk breaks on dangling symlinks
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: hynek Nosy List: hynek, neologix, python-dev
Priority: normal Keywords: patch

Created on 2012-05-10 19:28 by hynek, last changed 2012-05-16 09:39 by hynek. This issue is now closed.

Files
File name Uploaded
make-fwalk-handle-dangling-symlinks.diff hynek, 2012-05-10 19:28
fwalk-ignore-missing-files.diff hynek, 2012-05-12 22:37
fwalk.diff hynek, 2012-05-15 13:29
Messages (7)
msg160364 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-10 19:28
I'm implementing a safe rmtree using fwalk. Everything works perfectly except for one thing: if the directory contains dangling symlinks, fwalk goes belly-up:

$ ls -l test/
total 0
lrwxrwxrwx 1 vagrant vagrant 4 May 10 16:36 doesntwork -> this

$ ./python
Python 3.3.0a3+ (default:b32baa5b7626+, May 10 2012, 14:56:20) 
[GCC 4.6.3] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
[71253 refs]
>>> list(os.fwalk('test'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vagrant/p/Lib/os.py", line 342, in fwalk
    for x in _fwalk(topfd, top, topdown, onerror, followlinks):
  File "/home/vagrant/p/Lib/os.py", line 361, in _walk
    if st.S_ISDIR(fstatat(topfd, name).st_mode):
FileNotFoundError: [Errno 2] No such file or directory


Unfortunately this makes it impossible to implement rmtree. The reason is the following code:

for name in names:
    # Here, we don't use AT_SYMLINK_NOFOLLOW to be consistent with
    # walk() which reports symlinks to directories as directories. We do
    # however check for symlinks before recursing into a subdirectory.
    if st.S_ISDIR(fstatat(topfd, name).st_mode):
        dirs.append(name)
    else:
        nondirs.append(name)

The "unsafe" walk() uses os.path.isdir() instead of os.fstatat() and handles this case gracefully.

Wrapping the call in a try/except with a symlink check fixes it, and the tests pass. This is a blocker for #4489. I have extended the tests in the WalkerTests suite accordingly.

Tested on Linux (= works) and OS X (= skipped).
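
For illustration, the difference between the two checks can be reproduced outside of fwalk. This is only a minimal sketch, not code from the attached patch: probe() is a hypothetical helper, and it calls os.stat()/os.lstat() on a path rather than fstatat() on a directory fd.

```python
import os
import stat
import tempfile


def probe(path):
    """Return (isdir, stat_ok, is_symlink) for path (hypothetical helper)."""
    # os.path.isdir() catches the OSError internally and returns False.
    isdir = os.path.isdir(path)
    # A stat() that follows the link raises if the target is missing.
    try:
        os.stat(path)
        stat_ok = True
    except FileNotFoundError:
        stat_ok = False
    # lstat() looks at the link itself, so it works for dangling links.
    is_symlink = stat.S_ISLNK(os.lstat(path).st_mode)
    return isdir, stat_ok, is_symlink


with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "doesntwork")
    os.symlink("this", link)  # the target "this" does not exist
    print(probe(link))
```

On a POSIX system this should report that isdir() is False, that the following stat() fails, and that the entry is a symlink, which is why walk() copes with the link while a bare stat call does not.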
msg160474 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-05-12 15:57
I'm not sure we really need to check for a dangling symlink in case of FileNotFoundError: whether it's a dangling symlink or the file disappeared in between, skipping it is OK.
msg160488 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-12 18:13
But if it is a dangling symlink, you want to add it to nondirs, while missing files could be skipped, no? You can't skip dangling symlinks if you want to implement rmtree. The normal walk() doesn't skip them either.
msg160499 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-12 22:37
So I've changed the patch to ignore everything missing except for dangling links (which unfortunately raise the same exception).

Just to stress it once more: a fwalk that _ignores_ dangling symlinks is worthless for rmtree. And wasn't rmtree the initial reason to implement fwalk in the first place? ;)
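
To make the point concrete, here is a rough sketch of a bottom-up removal built on fwalk. It is not the rmtree implementation from #4489: remove_tree() is a hypothetical name, the code uses the dir_fd keyword arguments of released Python 3.3+ instead of the *at() functions, and it makes no attempt to be fully race-free. It only works if fwalk reports dangling symlinks in its file list; otherwise the final rmdir fails on a non-empty directory.

```python
import os
import stat


def remove_tree(top):
    """Remove top and everything below it (hypothetical sketch)."""
    # topdown=False yields the deepest directories first.
    for root, dirs, files, rootfd in os.fwalk(top, topdown=False):
        for name in files:
            # Dangling symlinks must show up here to get unlinked.
            os.unlink(name, dir_fd=rootfd)
        for name in dirs:
            # Symlinks to directories are listed in dirs but not walked,
            # so they are unlinked instead of rmdir'ed.
            mode = os.stat(name, dir_fd=rootfd, follow_symlinks=False).st_mode
            if stat.S_ISLNK(mode):
                os.unlink(name, dir_fd=rootfd)
            else:
                os.rmdir(name, dir_fd=rootfd)
    os.rmdir(top)
```

Calling remove_tree('test') on the directory from the first message would be the intended use, assuming a platform where os.fwalk() and dir_fd are available.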
msg160515 - (view) Author: Charles-François Natali (neologix) * (Python committer) Date: 2012-05-13 10:52
> Just to stress it once more: a fwalk that _ignores_ dangling symlinks is worthless for rmtree. And wasn't rmtree the initial reason to implement fwalk in the first place? ;)

Indeed, my bad :-)

> So I've changed the patch to ignore everything missing except for dangling links (which unfortunately raise the same exception).

Looks good to me.
msg160725 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012-05-15 13:29
I just realized the previous patch doesn't really make sense: if a file disappears for real, we'll get another FileNotFoundError when checking whether it's a symlink, so the continue is never reached.

So behold v3. :)

This time, I have tested it by injecting a

if name == 'tmp4':
    import os
    os.unlinkat(topfd, name)

right before the S_ISDIR in fwalk.

Some tests failed because said tmp4 was obviously missing; the old code raised FileNotFoundError. After restoration the whole test suite passes in regression mode.
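
The behavior described above (keep dangling symlinks, skip entries that really vanished) could be sketched roughly as follows. This is an illustration, not the contents of fwalk.diff: classify_entries() is a hypothetical name, and it uses os.stat(..., dir_fd=...) as spelled in released Python 3.3+ rather than os.fstatat().

```python
import os
import stat


def classify_entries(topfd, names):
    """Split names (relative to the directory fd topfd) into dirs/nondirs."""
    dirs, nondirs = [], []
    for name in names:
        try:
            # Follow symlinks so links to directories count as directories.
            if stat.S_ISDIR(os.stat(name, dir_fd=topfd).st_mode):
                dirs.append(name)
            else:
                nondirs.append(name)
        except FileNotFoundError:
            try:
                # Keep dangling symlinks; the link itself still exists.
                st = os.stat(name, dir_fd=topfd, follow_symlinks=False)
                if stat.S_ISLNK(st.st_mode):
                    nondirs.append(name)
            except FileNotFoundError:
                # The entry vanished between listing and stat: skip it.
                continue
    return dirs, nondirs
```

The second stat sits in its own try block so that a file that is really gone raises inside the handler and reaches the continue, instead of propagating out of the loop.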
msg160730 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-15 14:34
New changeset cbe7560d4443 by Hynek Schlawack in branch 'default':
#14773: Fix os.fwalk() failing on dangling symlinks
http://hg.python.org/cpython/rev/cbe7560d4443
History
Date User Action Args
2012-05-16 09:39:11 hynek set resolution: fixed
2012-05-16 09:36:17 hynek set status: open -> closed; stage: commit review -> resolved
2012-05-15 14:34:48 python-dev set nosy: + python-dev; messages: + msg160730
2012-05-15 14:09:50 pitrou set stage: patch review -> commit review
2012-05-15 13:29:22 hynek set files: + fwalk.diff; messages: + msg160725; stage: commit review -> patch review
2012-05-13 10:52:36 neologix set messages: + msg160515; stage: patch review -> commit review
2012-05-12 22:37:09 hynek set files: + fwalk-ignore-missing-files.diff; messages: + msg160499
2012-05-12 18:13:49 hynek set messages: + msg160488
2012-05-12 15:57:23 neologix set messages: + msg160474
2012-05-10 19:31:59 hynek link issue4489 dependencies
2012-05-10 19:28:39 hynek create