Issue26111
Created on 2016-01-14 18:45 by remyroy, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (10) | |||
---|---|---|---|
msg258212 - (view) | Author: Remy Roy (remyroy) | Date: 2016-01-14 18:45 | |
On Windows, os.scandir will keep a handle on the directory being scanned until the iterator is exhausted. This behavior can cause various problems if you try to use some filesystem calls like os.chmod or os.remove on the directory while the handle is still being kept. There are some use cases where the iterator is not going to be exhausted, like looking for a specific entry in a directory and breaking from the loop prematurely.

This behavior should at least be documented. Alternatively, it might be interesting to provide a way to end the scan early and close the handle without having to exhaust the iterator.

As a workaround, you can force the exhaustion after you are done with the iterator with something like:

    for entry in iterator:
        pass

This is going to affect os.walk as well since it uses os.scandir.

The original github issue can be found on https://github.com/benhoyt/scandir/issues/58 . |
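A minimal sketch of the workaround described in this message, assuming the goal is to look up a single entry and then operate on the directory afterwards. The helper name `find_entry` and the file names are hypothetical and used only for illustration; the only parts taken from the message are `os.scandir` and the drain loop.

```python
import os
import tempfile


def find_entry(path, name):
    """Return the DirEntry called `name` inside `path`, or None.

    Hypothetical helper: after an early break, the iterator is drained
    so that it is exhausted and its directory handle is released.
    """
    iterator = os.scandir(path)
    found = None
    for entry in iterator:
        if entry.name == name:
            found = entry
            break
    # Workaround from the message: force exhaustion once we are done.
    for _ in iterator:
        pass
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "test")
        os.mkdir(target)
        open(os.path.join(target, "file1"), "w").close()

        entry = find_entry(target, "file1")
        print(entry.name if entry else None)

        # The iterator is exhausted, so no scan handle is left open here.
        os.remove(os.path.join(target, "file1"))
        os.rmdir(target)
```

Draining costs a full pass over the directory, so this is probably only reasonable for small directories or on Python versions without an explicit way to close the iterator.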
|||
msg258219 - (view) | Author: Eryk Sun (eryksun) * | Date: 2016-01-14 20:50 | |
If you own the only reference you can also delete the reference, which deallocates the iterator and closes the handle.

Can you provide concrete examples where os.remove and os.chmod fail? At least in Windows 7 and 10 the directory handle is opened with the normal read and write sharing, but also with delete sharing. This sharing mode is fairly close to POSIX behavior (an important distinction is noted below). I get the following results in Windows 10:

    >>> import os, stat
    >>> os.mkdir('test')
    >>> f = open('test/file1', 'w'); f.close()
    >>> f = open('test/file2', 'w'); f.close()
    >>> it = os.scandir('test')
    >>> next(it)
    <DirEntry 'file1'>

rename, chmod, and rmdir operations succeed:

    >>> os.rename('test', 'spam')
    >>> os.chmod('spam', stat.S_IREAD)
    >>> os.chmod('spam', stat.S_IWRITE)
    >>> os.remove('spam/file1')
    >>> os.remove('spam/file2')
    >>> os.rmdir('spam')

Apparently cached entries can be an issue, but this caching is up to WinAPI FindNextFile and the system call NtQueryDirectoryFile:

    >>> next(it)
    <DirEntry 'file2'>

An important distinction is that a deleted file in Windows doesn't actually get unlinked until all handles and kernel pointer references are closed. Also, once the delete disposition is set, no *new* handles can be created for the existing file or directory (all access is denied), and a new file or directory with same name cannot be created.

    >>> os.listdir('spam')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    PermissionError: [WinError 5] Access is denied: 'spam'
    >>> f = open('spam', 'w')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    PermissionError: [Errno 13] Permission denied: 'spam'

If we had another handle we could use that to rename "spam" to get it out of the way, at least. Without that, AFAIK, all we can do is deallocate the iterator or wait for it to be exhausted, which closes the handle and thus allows Windows to finally unlink "spam":

    >>> next(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration

Creating a new file named "spam" is allowed now:

    >>> f = open('spam', 'w')
    >>> f.close()
|||
msg258225 - (view) | Author: Martin Panter (martin.panter) * | Date: 2016-01-14 21:35 | |
Remy, is this the same problem described in Issue 25994? There, a close() method (like on generators) and/or context manager support is proposed for the scandir() iterator. Perhaps we can keep this issue open for adding a warning to the documentation, and the other issue can be for improving the API in 3.6. |
|||
msg258226 - (view) | Author: Remy Roy (remyroy) | Date: 2016-01-14 21:46 | |
I believe Eryk's explanation on how a file in Windows doesn't actually get unlinked until all handles and kernel pointer references are closed is spot on about the problem I had. I had a complex example that could probably have been simplified to what Eryk posted. That behavior on Windows is quite counterintuitive. I'm not sure about what can be done to help it. |
|||
msg258228 - (view) | Author: Remy Roy (remyroy) | Date: 2016-01-14 21:54 | |
This issue is not the same as Issue 25994, but it is closely related. Some kind of close() method and/or context manager support could help here as well. |
|||
msg258235 - (view) | Author: Martin Panter (martin.panter) * | Date: 2016-01-14 22:22 | |
Can you explain how it is different? The way I see it, both problems are about the scandir() iterator holding an open reference (file descriptor or handle) to a directory/folder, when the iterator was not exhausted, but the caller no longer needs it. |
|||
msg258236 - (view) | Author: Eryk Sun (eryksun) * | Date: 2016-01-14 22:24 | |
> That behavior on Windows is quite counterintuitive.

It's counter-intuitive from a POSIX point of view, in which anonymous files are allowed. In contrast, Windows allows any existing reference to unset the delete disposition, so the name cannot be unlinked until all references are closed. |
|||
msg258248 - (view) | Author: Remy Roy (remyroy) | Date: 2016-01-14 23:13 | |
From my point of view, Issue 25994 is about the potential file descriptor/handle leaks and this issue is about being unable to perform some filesystem calls because of a hidden unclosed file descriptor/handle. I am not going to protest if you want to treat them as the same issue. |
|||
msg387642 - (view) | Author: Eryk Sun (eryksun) * | Date: 2021-02-24 23:12 | |
Issue 25994 added support for the context-manager protocol and close() method in 3.6. So it's at least much easier to ensure that the handle gets closed.

The documentation of scandir() links to WinAPI FindFirstFile and FindNextFile, which at least mention the "search handle". It's not made explicit that this encapsulates a handle for a kernel file object, nor does it discuss which operations (e.g. move, rename, delete) are allowed directly on the directory. Similarly, the directory stream that's returned by POSIX opendir() and used by readdir() may or may not encapsulate a file descriptor.

I don't think Python's documentation is the best place to discuss platform-specific implementation details in most cases. Exceptions should be made in some cases, but I don't think this is one of them because I can't even link to a document about the implementation details of FindNextFile. At a lower level I can link to documents about the NtQueryDirectoryFile[Ex] system call, but that's not much help in terms of officially documenting what FindNextFile does. Microsoft prefers to keep the Windows API details opaque, which gives them wiggle room.

FYI, in Windows 10, deleting files and directories now tries a POSIX delete (if supported by the filesystem) that immediately unlinks the name as soon as the handle that's used to perform the delete is closed, such as the handle that's opened to implement DeleteFile (os.unlink) and RemoveDirectory (os.rmdir). NTFS supports this feature by moving the file/directory to a reserved "\$Extend\$Deleted" directory:

    >>> os.mkdir('spam')
    >>> h = win32file.CreateFile('spam', 0, 0, None, 3, 0x0200_0000, None)
    >>> print(win32file.GetFinalPathNameByHandle(h, 0))
    \\?\C:\Temp\test\test\spam
    >>> os.rmdir('spam')
    >>> print(win32file.GetFinalPathNameByHandle(h, 0))
    \\?\C:\$Extend\$Deleted\001000000000949A5E2FE5BB

Of course, none of the above is documented for RemoveDirectory(). |
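For readers landing here, a short sketch of the two options mentioned at the start of this message, assuming Python 3.6 or later. The helper names `has_entry` and `first_entry_name` are hypothetical and only illustrate the pattern; `os.scandir` as a context manager and its `close()` method are the documented API.

```python
import os
import tempfile


def has_entry(path, name):
    """Return True if `path` contains an entry called `name`."""
    # The with-statement closes the iterator on exit, even when the
    # function returns before the iterator is exhausted.
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.name == name:
                return True
    return False


def first_entry_name(path):
    """Return the name of the first entry in `path`, or None if empty."""
    entries = os.scandir(path)
    try:
        entry = next(entries, None)
        return entry.name if entry is not None else None
    finally:
        # Explicit close() releases the directory handle without
        # having to exhaust the iterator.
        entries.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "spam")
        os.mkdir(target)
        open(os.path.join(target, "file1"), "w").close()

        print(has_entry(target, "file1"))
        print(first_entry_name(target))

        os.remove(os.path.join(target, "file1"))
        os.rmdir(target)
```

Either form avoids draining the iterator, which matters for large directories where the early exit is the whole point of using scandir().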
|||
msg387683 - (view) | Author: Steve Dower (steve.dower) * | Date: 2021-02-25 17:59 | |
> FYI, in Windows 10, deleting files and directories now tries a POSIX delete

Yeah, FWIW, I haven't been able to get clear guidance on what I can/cannot publicly announce we've done in this space. But since you've found it I guess I can say sorry that I couldn't announce it more loudly! :) A number of our other issues should be able to be closed soon once the changes get out in the open. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:26 | admin | set | github: 70299 |
2021-02-25 17:59:56 | steve.dower | set | messages: + msg387683 |
2021-02-24 23:12:36 | eryksun | set | status: open -> closed resolution: third party messages: + msg387642 stage: resolved |
2016-01-14 23:13:07 | remyroy | set | messages: + msg258248 |
2016-01-14 22:24:52 | eryksun | set | messages: + msg258236 |
2016-01-14 22:22:09 | martin.panter | set | messages: + msg258235 |
2016-01-14 22:11:40 | serhiy.storchaka | set | nosy: + serhiy.storchaka |
2016-01-14 21:54:30 | remyroy | set | messages: + msg258228 |
2016-01-14 21:46:39 | remyroy | set | messages: + msg258226 |
2016-01-14 21:35:09 | martin.panter | set | nosy: + martin.panter, docs@python messages: + msg258225 assignee: docs@python dependencies: + File descriptor leaks in os.scandir() components: + Documentation |
2016-01-14 20:50:33 | eryksun | set | nosy: + eryksun messages: + msg258219 |
2016-01-14 18:51:44 | benhoyt | set | nosy: + benhoyt |
2016-01-14 18:45:09 | remyroy | create |