Title: shutil.rmtree fails when target has an internal directory junction (Windows)
Type: behavior Stage: patch review
Components: Library (Lib), Windows Versions: Python 3.8, Python 3.7, Python 3.6
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eryksun, paul.moore, steve.dower, tim.golden, vidartf, zach.ware
Priority: normal Keywords: patch

Created on 2017-08-17 13:58 by vidartf, last changed 2018-09-04 19:27 by terry.reedy.

Attached file (name not preserved): uploaded by vidartf, 2017-08-17 13:58. Description: Minimal working example
Pull Requests
URL Status Linked
PR 5998 open vidartf, 2018-03-05 23:41
Messages (3)
msg300424 - Author: Vidar Fauske (vidartf) * Date: 2017-08-17 13:58
On Windows (Windows 10 in my case), given the following directory structure:
- rootfolder
 - a
 - b
  - junc (directory junction to ../a)

a call to `shutil.rmtree('rootfolder')` fails with `FileNotFoundError: [WinError 3]`, raised by a call to `os.listdir()` in `_rmtree_unsafe`. See attached minimal working example.

Note that the sort order matters: a junction in 'a' pointing to 'b' does not trigger the failure. The failure occurs because 'a' is deleted first, so `os.listdir()` raises an exception for 'b/junc' once its target ('a') no longer exists.

Also note that this affects only junctions, not directory symbolic links (`mklink /J` vs `mklink /D`), because:
 - `stat.S_ISDIR(os.lstat('b/junc').st_mode)` is false for directory links but true for junctions.
 - `os.path.islink()` returns false for junctions but true for directory links.
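
To check the two tests above on a given path, a small diagnostic along these lines may help. This is only a sketch: `describe_entry` is a hypothetical helper name, not part of `shutil` or of the attached example, and the demo at the bottom only exercises a plain directory, since creating a junction requires Windows and `mklink /J`.

```python
import os
import stat
import tempfile


def describe_entry(path):
    """Report the two checks discussed above for one path (hypothetical helper)."""
    st = os.lstat(path)  # lstat does not follow a link in the final path component
    return {
        "islink": os.path.islink(path),
        "lstat_isdir": stat.S_ISDIR(st.st_mode),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "a")
        os.mkdir(plain)
        # A plain directory: not a link, and lstat reports a directory.
        print(describe_entry(plain))
```

On the Python versions listed in this issue, running `describe_entry` on 'b/junc' should, per the description above, report the same values for a junction as for a plain directory, which is why the removal code descends into it.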

The indicated Python versions are those on which I have personally tested and observed this behavior.

Current use case: Deleting a folder tree generated by an external tool, which creates junction links as part of its normal operation ('lerna' tool for the 'npm' javascript package manager).
msg300450 - Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-08-17 17:56
Junctions are sometimes used as links (e.g. mklink /j) and sometimes as volume mount points (e.g. mountvol.exe). GetVolumePathName can be called to distinguish these cases. If a junction is a volume mount point, then its absolute path and volume path are the same. This test is already used in ntpath.ismount().

For a junction link, islink() should return true, readlink() should work, and S_ISDIR() should return false for the lstat() st_mode. For a junction mount point, islink() should return false, readlink() should not work, and S_ISDIR() should return true for the lstat() st_mode.
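
The proposed behavior can be written down as a small table in Python. This is only an illustration of the paragraph above, not the actual patch: `ExpectedBehavior`, `expected_for_junction` and `expected_for_path` are hypothetical names, and the path passed in is assumed to already be known to be a junction. The default `os.path.ismount` is `ntpath.ismount` on Windows.

```python
import os
from typing import Callable, NamedTuple


class ExpectedBehavior(NamedTuple):
    islink: bool          # should islink() return true?
    readlink_works: bool  # should readlink() succeed?
    lstat_isdir: bool     # should S_ISDIR(lstat().st_mode) be true?


def expected_for_junction(is_volume_mount_point: bool) -> ExpectedBehavior:
    """Expected results for a junction, per the proposal above."""
    if is_volume_mount_point:
        # Junction used as a volume mount point: treat it as a directory.
        return ExpectedBehavior(islink=False, readlink_works=False, lstat_isdir=True)
    # Junction used as a link: treat it like a symbolic link.
    return ExpectedBehavior(islink=True, readlink_works=True, lstat_isdir=False)


def expected_for_path(
    junction_path: str,
    ismount: Callable[[str], bool] = os.path.ismount,
) -> ExpectedBehavior:
    """Classify a path that is assumed to be a junction, using the mount test."""
    return expected_for_junction(ismount(junction_path))
```

The `ismount` parameter exists only so the classification can be exercised without a real junction, for example `expected_for_path("b/junc", ismount=lambda p: False)`.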

A helper function could be added in fileutils.c to determine whether a reparse point is a link, based on the file path and reparse tag. Then modify _Py_attribute_data_to_stat() to take `BOOL is_link` instead of `ULONG reparse_tag`.
msg315245 - Author: Vidar Fauske (vidartf) * Date: 2018-04-13 13:03
A PR that fixes the issue according to the feedback from Eryk Sun is available. It does seem to have stalled a bit on the review side. That being said, would a bugfix for shutil.rmtree be appropriate? It is very annoying when junction points made by other tools break pip source installs of packages (since pip calls shutil.rmtree on its temporary directory after a build).
Date User Action Args
2018-09-04 19:27:35 terry.reedy set versions: + Python 3.8
2018-04-13 13:03:58 vidartf set messages: + msg315245
2018-03-05 23:41:36 vidartf set keywords: + patch
stage: test needed -> patch review
pull_requests: + pull_request5764
2017-08-17 17:57:09 eryksun set stage: test needed
type: behavior
components: - IO
versions: + Python 3.7, - Python 3.3
2017-08-17 17:56:04 eryksun set nosy: + eryksun
messages: + msg300450
2017-08-17 13:58:18 vidartf create