Title: shutil.rmtree will frequently fail on Windows under heavy load due to racy deletion
Type:
Stage:
Components: Windows
Versions: Python 3.9
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Alexander Riccio, eryksun, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2020-04-01 22:15 by Alexander Riccio, last changed 2022-04-11 14:59 by admin.

Messages (3)
msg365520 - Author: Alexander Riccio (Alexander Riccio) Date: 2020-04-01 22:15
The "obvious" way to delete a directory tree on Windows is wrong. It's inherently racy, since deleting a file on Windows *doesn't actually delete it*, instead it marks the file for deletion. The system will eventually get around to deleting it, but under heavy load, this might be sometime after an attempt is made to delete the parent directory. I've seen this (Windows error 145, directory is not empty) many times when running the test suite, and it causes all kinds of intermittent failures.

The best way to do things correctly is kinda nuts, and is explained well by Niall Douglass here:

In short, the correct way to do it involves moving each file to a randomly named file in %TEMP%, then deleting that, and then doing the same with each newly-empty directory.

msg365686 - Author: Steve Dower (steve.dower) (Python committer) Date: 2020-04-03 09:31
What about renaming the base directory in place? Moving things across drives doesn't help, and we can't reasonably determine a suitable location for temp files other than by leaving them where they are.
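
A minimal sketch of the rename-in-place idea, assuming the goal is to free the original path right away and then delete the renamed tree. The function name `rmtree_via_rename` and the `.deleting` suffix are hypothetical and not part of `shutil`. The new name stays in the same parent directory, so the rename never crosses drives.

```python
import os
import shutil
import uuid


def rmtree_via_rename(path):
    """Rename a directory tree to a unique sibling name, then delete it.

    Illustrative only: the original name is freed as soon as the rename
    succeeds, but deleting the renamed tree can still fail if something
    inside it is held open without delete sharing.
    """
    path = os.path.abspath(path)
    parent, name = os.path.split(path)
    # Unique sibling name in the same parent directory (hypothetical scheme).
    doomed = os.path.join(parent, f".{name}.{uuid.uuid4().hex}.deleting")
    os.rename(path, doomed)
    shutil.rmtree(doomed)
    return doomed
```

If the final `shutil.rmtree` call fails, the leftover tree sits under the randomized name, so a later attempt to recreate the original directory is not blocked by it.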

msg365697 - Author: Eryk Sun (eryksun) (Python triager) Date: 2020-04-03 13:09
> It's inherently racy, since deleting a file on Windows *doesn't 
> actually delete it*, instead it marks the file for deletion. The 
> system will eventually get around to deleting it, but under heavy 
> load, this might be sometime after an attempt is made to delete 
> the parent directory.

Commonly, WINAPI DeleteFileW and RemoveDirectoryW unlink the target file or directory synchronously. There are cases such as a watched directory and malware scanners that can make deleting asynchronous, but it's unlike the above characterization. It's not like the delete operation gets queued by the filesystem for the system to get around to sometime later.

Deleting or renaming a filesystem file/directory begins with creating a File object that's granted delete access. This is the first hurdle, since all existing File objects for a file have to share data access, including read/execute, write/append, and delete/rename access. Sharing delete/rename access is uncommon, and trying to delete an open file fails with ERROR_SHARING_VIOLATION (32).

If the caller has a handle with delete access, the next hurdle is being allowed to set the delete disposition on the underlying file/link control block (FCB/LCB) in the filesystem. This request will be denied with ERROR_ACCESS_DENIED (5) if the file is flagged as readonly or is currently memory-mapped as data or image. In these cases, a file can still be renamed within the filesystem, which is useful if there is a known destination path.
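
As a rough illustration of the readonly case, the sketch below clears the readonly flag and retries when the first delete is denied. The helper name `force_unlink` is hypothetical. It assumes the denial came from the readonly attribute; it does nothing for a memory-mapped file or a sharing violation, where the second attempt simply raises again.

```python
import os
import stat


def force_unlink(path):
    """Delete a file, clearing the readonly flag if the first attempt is denied."""
    try:
        os.unlink(path)
    except PermissionError:
        # On Windows, ERROR_ACCESS_DENIED (5) surfaces as PermissionError.
        # Make the file writable (clears the readonly attribute) and retry once.
        os.chmod(path, stat.S_IWRITE)
        os.unlink(path)
```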

Assuming that setting the FCB/LCB delete disposition succeeds, then, with classic Windows delete semantics (as opposed to the POSIX semantics available in Windows 10), the file will be unlinked when the last File object that references the FCB/LCB gets cleaned up. This in turn occurs when the last handle for the last File object gets closed.

WINAPI CloseHandle calls the NtClose system function, which synchronously calls a kernel object's close method, if implemented. A File object has a close method that, if it's the last handle for the object in the system, synchronously calls the filesystem device stack with an IRP_MJ_CLEANUP request. You can see how a filesystem cleanup function works in the source of the fastfat sample driver [1]. Pay attention to blocks in FatCommonCleanup that check the flag FCB_STATE_DELETE_ON_CLOSE in the UserDirectoryOpen and UserFileOpen cases. Note that even if the cleanup function were to complete asynchronously with STATUS_PENDING, the close method of the File object itself waits for completion. So the bases are covered to ensure deleting works synchronously in the common case when a file is referenced only by the handle that's used to delete it. This excludes the cases of pre-existing references that share delete access and implicit interference from malware scanners that inject themselves into the filesystem device stack.

In Windows 10, NTFS supports POSIX delete semantics [2], i.e. FILE_DISPOSITION_DELETE | FILE_DISPOSITION_POSIX_SEMANTICS. In this case, a delete request still has to pass the hurdles of the file sharing mode, readonly flag, and data/image file mappings. What changes is that the file will be unlinked as soon as the deleting handle is closed. Existing opens can continue to access data in the file. 

DeleteFileW in Windows 10 first tries to use a POSIX delete, though this is still undocumented. If a POSIX delete fails (e.g. it's a FAT32 filesystem), DeleteFileW falls back on a classic delete. 

RemoveDirectoryW is limited to a classic delete in Windows 10, so it's subject to race conditions with watched directories. In particular, Explorer keeps directories open to watch for changes. It shares delete access, which allows RemoveDirectoryW to successfully set the delete disposition of a watched directory. In turn, the watch fails once the delete disposition is set, and Explorer immediately closes its handle. This unlinks the directory, but it's a race condition.
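
One way a caller might tolerate this race when removing a parent directory is a short retry loop, sketched below. This is an assumption about a possible workaround, not something `shutil.rmtree` does; the helper name `rmdir_with_retry` and its `attempts` and `delay` parameters are hypothetical. It retries only when the failure is "directory not empty" (Windows error 145, or `ENOTEMPTY` elsewhere).

```python
import errno
import os
import time

ERROR_DIR_NOT_EMPTY = 145  # Windows error code reported earlier in this issue


def rmdir_with_retry(path, attempts=5, delay=0.01):
    """Call os.rmdir, retrying briefly while the directory reports as not empty."""
    for attempt in range(attempts):
        try:
            os.rmdir(path)
            return
        except OSError as exc:
            not_empty = (
                getattr(exc, "winerror", None) == ERROR_DIR_NOT_EMPTY
                or exc.errno == errno.ENOTEMPTY
            )
            # Give up on unrelated errors, or once the attempts are exhausted.
            if not not_empty or attempt == attempts - 1:
                raise
            # Back off a little longer each time so a pending unlink can finish.
            time.sleep(delay * (2 ** attempt))
```

A retry only papers over the window in which a child directory is marked for deletion but not yet unlinked; it cannot help if another process keeps its handle open indefinitely.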

When I have time, I'll check whether NTFS supports POSIX delete semantics on directories -- by directly calling NtSetInformationFile: FileDispositionInformationEx. If it does, then probably a future version of Windows will try to use a POSIX delete in RemoveDirectoryW, just as DeleteFileW currently does. That will eliminate the problem with watched directories -- at least on the system volume (C:), which is required to use NTFS.

Date                 User              Action  Args
2022-04-11 14:59:28  admin             set     github: 84324
2020-04-03 13:09:32  eryksun           set     nosy: + eryksun; messages: + msg365697
2020-04-03 09:31:22  steve.dower       set     messages: + msg365686
2020-04-01 22:15:23  Alexander Riccio  create