This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: On Windows, os.stat() can fail if called while another process is creating or deleting the file
Type: behavior Stage: patch review
Components: Windows Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: eryksun, itaisteinherz, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal Keywords: patch

Created on 2022-02-18 08:50 by Antony.Lee, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 31858 open itaisteinherz, 2022-03-13 20:16
Messages (10)
msg413468 - (view) Author: Antony Lee (Antony.Lee) * Date: 2022-02-18 08:50
In a first Python process, repeatedly create and delete a file:

from pathlib import Path
while True:
    Path("foo").touch(); Path("foo").unlink()

In another process, repeatedly check for the path's existence:

from pathlib import Path
while True: print(Path("foo").exists())

On Linux, the second process prints a random series of True and False.  On Windows, it quickly fails after a few dozen iterations (likely machine-dependent) with

PermissionError: [WinError 5] Access is denied: 'foo'

which is actually raised by the stat() call.

I would suggest that this is not really desirable behavior?
msg413476 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-02-18 11:00
Windows filesystems disallow new opens for a file that has its delete disposition set (i.e. the file is marked for deletion). For example, CreateFileW() fails with ERROR_ACCESS_DENIED (5) in this case. A file with its delete disposition set is still visibly linked in its parent directory. 

The trigger to unlink a file that's marked for deletion depends on the delete mode. In Windows 10, DeleteFileW() uses a POSIX delete if it's supported by the filesystem (e.g. NTFS). This delete mode unlinks the file as soon the file object that was used to set the delete disposition is closed. There's still a small window of time, however, in which attempts to open the file will fail with an access-denied error.

In os.stat(), if CreateFileW() fails with access denied, FindFirstFileW() is called to try to get the stat data from the file's parent directory. But in this case it's likely that the file has been unlinked from the directory by the time FindFirstFileW() is called.

The original error code from CreateFileW() gets restored if FindFirstFileW() fails. This is generally the right thing to do. However, if FindFirstFileW() fails with ERROR_FILE_NOT_FOUND (2) or ERROR_PATH_NOT_FOUND (3), then I suggest that the previous error code should not be restored.

For example:

        switch (error) {
        case ERROR_ACCESS_DENIED:     /* Cannot sync or read attributes. */
        case ERROR_SHARING_VIOLATION: /* It's a paging file. */
            /* Try reading the parent directory. */
            if (!attributes_from_dir(path, &fileInfo, &tagInfo.ReparseTag)) {
                /* Cannot read the parent directory. */
                DWORD dir_error = GetLastError();
                if (dir_error != ERROR_FILE_NOT_FOUND &&
                      dir_error != ERROR_PATH_NOT_FOUND) {
                    SetLastError(error);
                }
                return -1;
            }
msg414820 - (view) Author: Itai Steinherz (itaisteinherz) * Date: 2022-03-09 23:31
I'd like to work on this, however I'm not sure how this could be unit-tested. Any ideas?
msg414927 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-03-11 18:12
Itai, you can add a test to Win32NtTests in Lib/test/test_os.py. Maybe spawn a child process that creates and unlinks a file in a loop. In the parent process execute a loop that tries to stat the file and ignores errors when the file or path isn't found. For example:

    @support.requires_subprocess()
    def test_stat_unlink_race(self):
        # bpo-46785: the implementation of os.stat() falls back on reading
        # the parent directory if CreateFileW() fails with a permission
        # error. If reading the parent directory fails because the file or
        # directory is subsequently unlinked or because the volume or
        # share is no longer available, then the original permission error
        # should not be restored.
        fname = os.path.join(os.environ['TEMP'], os_helper.TESTFN + '_46785')
        self.addCleanup(os_helper.unlink, fname)
        command = '''if 1:
            import os
            import sys
            fname = sys.argv[1]
            while True:
                try:
                    with open(fname, "w") as f:
                        pass
                except OSError:
                    pass
                try:
                    os.remove(fname)
                except OSError:
                    pass
        '''
        ignored_errors = (
            2,  # ERROR_FILE_NOT_FOUND
            3,  # ERROR_PATH_NOT_FOUND
            21, # ERROR_NOT_READY
            67, # ERROR_BAD_NET_NAME
        )
        deadline = time.time() + 5
        p = subprocess.Popen([sys.executable, '-c', command, fname])
        try:
            while time.time() < deadline:
                try:
                    os.stat(fname)
                except OSError as e:
                    if e.winerror not in ignored_errors:
                        raise
        finally:
            p.terminate()

As the above test shows, I think the error should also be kept if attributes_from_dir() fails with ERROR_NOT_READY or ERROR_BAD_NET_NAME. For example:

        switch (error) {
        case ERROR_ACCESS_DENIED:     /* Cannot sync or read attributes. */
        case ERROR_SHARING_VIOLATION: /* It's a paging file. */
            /* Try reading the parent directory. */
            if (!attributes_from_dir(path, &fileInfo, &tagInfo.ReparseTag)) {
                /* Cannot read the parent directory. */
                switch (GetLastError()) {
                // keep these error codes
                case ERROR_FILE_NOT_FOUND:
                case ERROR_PATH_NOT_FOUND:
                case ERROR_NOT_READY:
                case ERROR_BAD_NET_NAME:
                    break;
                // restore the error from CreateFileW()
                default:
                    SetLastError(error);
                }
                return -1;
            }
msg415088 - (view) Author: Itai Steinherz (itaisteinherz) * Date: 2022-03-13 19:49
Thanks for the comprehensive reply, Eryk!

I have a few questions regarding your suggestion:

1. Why catch ERROR_NOT_READY and ERROR_BAD_NET_NAME as well? How do you know that FindFirstFileW() may return them?
2. Why can't the filename of the "foo"-like file in the test be simply os_helper.TESTFN, as done in some other tests?
3. I noticed some code snippets used in tests are wrapped in a seemingly-redundant `if 1: ...` statement. Why is that?
msg415096 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-03-13 20:39
> Why catch ERROR_NOT_READY and ERROR_BAD_NET_NAME as well?

When os.stat() falls back on FindFirstFileW(), an error that means the file doesn't exist should be kept. ERROR_BAD_NET_NAME is an obvious error to keep because it's already mapped to ENOENT (i.e. file not found). This error typically means that share was removed or the network went down. ERROR_NOT_READY typically means that the media was removed from a volume (e.g. ejected disk). pathlib.Path.exists() returns False for ERROR_NOT_READY.

> Why can't the filename of the "foo"-like file in the test be 
> simply os_helper.TESTFN, as done in some other tests?

I suppose the current working directory will be fine. I was looking to keep the test on a NTFS filesystem, with known behavior, but there's no hard guarantee that the user's temp directory is on the system volume.

> I noticed some code snippets used in tests are wrapped in a 
> seemingly-redundant `if 1: ...` statement. Why is that?

The `if 1` test is logically redundant but convenient for using a multiline string that has an arbitrary indentation level. For example:

    >>> exec('''
    ...         print(42)
    ... ''')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 2
        print(42)
    IndentationError: unexpected indent

    >>> exec('''if 1:
    ...         print(42)
    ... ''')
    42
msg415102 - (view) Author: Itai Steinherz (itaisteinherz) * Date: 2022-03-13 20:53
Interesting, thanks again :)
msg415167 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2022-03-14 17:07
>> Why can't the filename of the "foo"-like file in the test be
>> simply os_helper.TESTFN, as done in some other tests?
> 
> I suppose the current working directory will be fine. I was looking to keep the test on a NTFS filesystem, with known behavior, but there's no hard guarantee that the user's temp directory is on the system volume.

All tests should use the current working directory, or the test helper 
for getting other directories. *Do not look up other directories*, 
because it prevents test runners from ensuring that tests run in valid 
locations (and skips implicit tests for directories with spaces/Unicode 
characters/etc.)
msg415176 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2022-03-14 18:05
I was following the pattern of StatAttributeTests.test_access_denied(), which uses the current user's temp directory to get a filesystem that supports security. 

It would probably be better to skip tests if the filesystem of the current working directory doesn't support the test, e.g. if it needs NTFS or needs support for security, hard links, or reparse points. For example, Win32JunctionTests should be skipped if reparse points aren't supported. The os_helper module could implement a function that calls GetVolumeInformationW() to get the filesystem name (e.g. "Ntfs") and flags (e.g. FILE_PERSISTENT_ACLS, FILE_SUPPORTS_HARD_LINKS, FILE_SUPPORTS_REPARSE_POINTS).
msg415177 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2022-03-14 18:14
> It would probably be better to skip tests if the filesystem of the current working directory doesn't support the test, 

Yes, this would be good. Then whoever is configuring the test runner can 
move where tests are run to make sure it is supported. There are command 
line options specifically for this, that also correctly handle 
multiprocessing.

Tests that bypass the CWD make this unfixable by the runner, which is 
why they should just use CWD.
History
Date User Action Args
2022-04-11 14:59:56adminsetgithub: 90941
2022-03-14 18:14:11steve.dowersetmessages: + msg415177
2022-03-14 18:05:59eryksunsetmessages: + msg415176
2022-03-14 17:07:07steve.dowersetmessages: + msg415167
2022-03-13 21:05:20Antony.Leesetnosy: - Antony.Lee
2022-03-13 20:53:20itaisteinherzsetmessages: + msg415102
2022-03-13 20:39:28eryksunsetmessages: + msg415096
2022-03-13 20:16:59itaisteinherzsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request29956
2022-03-13 19:49:45itaisteinherzsetmessages: + msg415088
2022-03-11 18:12:37eryksunsetmessages: + msg414927
2022-03-09 23:31:25itaisteinherzsetnosy: + itaisteinherz
messages: + msg414820
2022-02-18 11:00:44eryksunsetversions: + Python 3.9, Python 3.11
nosy: + eryksun

messages: + msg413476

type: behavior
stage: needs patch
2022-02-18 08:50:37Antony.Leecreate