On Windows, os.stat() can fail if called while another process is creating or deleting the file #90941

anntzer · 2022-02-18T08:50:37Z

BPO	46785
Nosy	@pfmoore, @tjguk, @zware, @eryksun, @zooba, @itaisteinherz
PRs	bpo-46785: Fix race condition between `os.stat()` and unlink #31858

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2022-02-18.08:50:37.444>
labels = ['3.10', 'type-bug', '3.9', 'OS-windows', '3.11']
title = 'On Windows, os.stat() can fail if called while another process is creating or deleting the file'
updated_at = <Date 2022-03-14.18:14:11.385>
user = 'https://github.com/anntzer'

bugs.python.org fields:

activity = <Date 2022-03-14.18:14:11.385>
actor = 'steve.dower'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Windows']
creation = <Date 2022-02-18.08:50:37.444>
creator = 'Antony.Lee'
dependencies = []
files = []
hgrepos = []
issue_num = 46785
keywords = ['patch']
message_count = 10.0
messages = ['413468', '413476', '414820', '414927', '415088', '415096', '415102', '415167', '415176', '415177']
nosy_count = 6.0
nosy_names = ['paul.moore', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'itaisteinherz']
pr_nums = ['31858']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue46785'
versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

The text was updated successfully, but these errors were encountered:

anntzer · 2022-02-18T08:50:37Z

In a first Python process, repeatedly create and delete a file:

from pathlib import Path
while True:
    Path("foo").touch(); Path("foo").unlink()

In another process, repeatedly check for the path's existence:

from pathlib import Path
while True: print(Path("foo").exists())

On Linux, the second process prints a random series of True and False. On Windows, it quickly fails after a few dozen iterations (likely machine-dependent) with

PermissionError: [WinError 5] Access is denied: 'foo'

which is actually raised by the stat() call.

I would suggest that this is not really desirable behavior?

eryksun · 2022-02-18T11:00:44Z

Windows filesystems disallow new opens for a file that has its delete disposition set (i.e. the file is marked for deletion). For example, CreateFileW() fails with ERROR_ACCESS_DENIED (5) in this case. A file with its delete disposition set is still visibly linked in its parent directory.

The trigger to unlink a file that's marked for deletion depends on the delete mode. In Windows 10, DeleteFileW() uses a POSIX delete if it's supported by the filesystem (e.g. NTFS). This delete mode unlinks the file as soon the file object that was used to set the delete disposition is closed. There's still a small window of time, however, in which attempts to open the file will fail with an access-denied error.

In os.stat(), if CreateFileW() fails with access denied, FindFirstFileW() is called to try to get the stat data from the file's parent directory. But in this case it's likely that the file has been unlinked from the directory by the time FindFirstFileW() is called.

The original error code from CreateFileW() gets restored if FindFirstFileW() fails. This is generally the right thing to do. However, if FindFirstFileW() fails with ERROR_FILE_NOT_FOUND (2) or ERROR_PATH_NOT_FOUND (3), then I suggest that the previous error code should not be restored.

For example:

        switch (error) {
        case ERROR_ACCESS_DENIED:     /* Cannot sync or read attributes. */
        case ERROR_SHARING_VIOLATION: /* It's a paging file. */
            /* Try reading the parent directory. */
            if (!attributes_from_dir(path, &fileInfo, &tagInfo.ReparseTag)) {
                /* Cannot read the parent directory. */
                DWORD dir_error = GetLastError();
                if (dir_error != ERROR_FILE_NOT_FOUND &&
                      dir_error != ERROR_PATH_NOT_FOUND) {
                    SetLastError(error);
                }
                return -1;
            }

itaisteinherz · 2022-03-09T23:31:25Z

I'd like to work on this, however I'm not sure how this could be unit-tested. Any ideas?

eryksun · 2022-03-11T18:12:37Z

Itai, you can add a test to Win32NtTests in Lib/test/test_os.py. Maybe spawn a child process that creates and unlinks a file in a loop. In the parent process execute a loop that tries to stat the file and ignores errors when the file or path isn't found. For example:

    @support.requires_subprocess()
    def test_stat_unlink_race(self):
        # bpo-46785: the implementation of os.stat() falls back on reading
        # the parent directory if CreateFileW() fails with a permission
        # error. If reading the parent directory fails because the file or
        # directory is subsequently unlinked or because the volume or
        # share is no longer available, then the original permission error
        # should not be restored.
        fname = os.path.join(os.environ['TEMP'], os_helper.TESTFN + '_46785')
        self.addCleanup(os_helper.unlink, fname)
        command = '''if 1:
            import os
            import sys
            fname = sys.argv[1]
            while True:
                try:
                    with open(fname, "w") as f:
                        pass
                except OSError:
                    pass
                try:
                    os.remove(fname)
                except OSError:
                    pass
        '''
        ignored_errors = (
            2,  # ERROR_FILE_NOT_FOUND
            3,  # ERROR_PATH_NOT_FOUND
            21, # ERROR_NOT_READY
            67, # ERROR_BAD_NET_NAME
        )
        deadline = time.time() + 5
        p = subprocess.Popen([sys.executable, '-c', command, fname])
        try:
            while time.time() < deadline:
                try:
                    os.stat(fname)
                except OSError as e:
                    if e.winerror not in ignored_errors:
                        raise
        finally:
            p.terminate()

As the above test shows, I think the error should also be kept if attributes_from_dir() fails with ERROR_NOT_READY or ERROR_BAD_NET_NAME. For example:

        switch (error) {
        case ERROR_ACCESS_DENIED:     /* Cannot sync or read attributes. */
        case ERROR_SHARING_VIOLATION: /* It's a paging file. */
            /* Try reading the parent directory. */
            if (!attributes_from_dir(path, &fileInfo, &tagInfo.ReparseTag)) {
                /* Cannot read the parent directory. */
                switch (GetLastError()) {
                // keep these error codes
                case ERROR_FILE_NOT_FOUND:
                case ERROR_PATH_NOT_FOUND:
                case ERROR_NOT_READY:
                case ERROR_BAD_NET_NAME:
                    break;
                // restore the error from CreateFileW()
                default:
                    SetLastError(error);
                }
                return -1;
            }

itaisteinherz · 2022-03-13T19:49:45Z

Thanks for the comprehensive reply, Eryk!

I have a few questions regarding your suggestion:

Why catch ERROR_NOT_READY and ERROR_BAD_NET_NAME as well? How do you know that FindFirstFileW() may return them?
Why can't the filename of the "foo"-like file in the test be simply os_helper.TESTFN, as done in some other tests?
I noticed some code snippets used in tests are wrapped in a seemingly-redundant if 1: ... statement. Why is that?

eryksun · 2022-03-13T20:39:28Z

Why catch ERROR_NOT_READY and ERROR_BAD_NET_NAME as well?

When os.stat() falls back on FindFirstFileW(), an error that means the file doesn't exist should be kept. ERROR_BAD_NET_NAME is an obvious error to keep because it's already mapped to ENOENT (i.e. file not found). This error typically means that share was removed or the network went down. ERROR_NOT_READY typically means that the media was removed from a volume (e.g. ejected disk). pathlib.Path.exists() returns False for ERROR_NOT_READY.

Why can't the filename of the "foo"-like file in the test be
simply os_helper.TESTFN, as done in some other tests?

I suppose the current working directory will be fine. I was looking to keep the test on a NTFS filesystem, with known behavior, but there's no hard guarantee that the user's temp directory is on the system volume.

I noticed some code snippets used in tests are wrapped in a
seemingly-redundant if 1: ... statement. Why is that?

The if 1 test is logically redundant but convenient for using a multiline string that has an arbitrary indentation level. For example:

    >>> exec('''
    ...         print(42)
    ... ''')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 2
        print(42)
    IndentationError: unexpected indent

    >>> exec('''if 1:
    ...         print(42)
    ... ''')
    42

itaisteinherz · 2022-03-13T20:53:21Z

Interesting, thanks again :)

zooba · 2022-03-14T17:07:07Z

> Why can't the filename of the "foo"-like file in the test be
> simply os_helper.TESTFN, as done in some other tests?

I suppose the current working directory will be fine. I was looking to keep the test on a NTFS filesystem, with known behavior, but there's no hard guarantee that the user's temp directory is on the system volume.

All tests should use the current working directory, or the test helper
for getting other directories. *Do not look up other directories*,
because it prevents test runners from ensuring that tests run in valid
locations (and skips implicit tests for directories with spaces/Unicode
characters/etc.)

eryksun · 2022-03-14T18:05:59Z

I was following the pattern of StatAttributeTests.test_access_denied(), which uses the current user's temp directory to get a filesystem that supports security.

It would probably be better to skip tests if the filesystem of the current working directory doesn't support the test, e.g. if it needs NTFS or needs support for security, hard links, or reparse points. For example, Win32JunctionTests should be skipped if reparse points aren't supported. The os_helper module could implement a function that calls GetVolumeInformationW() to get the filesystem name (e.g. "Ntfs") and flags (e.g. FILE_PERSISTENT_ACLS, FILE_SUPPORTS_HARD_LINKS, FILE_SUPPORTS_REPARSE_POINTS).

zooba · 2022-03-14T18:14:11Z

It would probably be better to skip tests if the filesystem of the current working directory doesn't support the test,

Yes, this would be good. Then whoever is configuring the test runner can
move where tests are run to make sure it is supported. There are command
line options specifically for this, that also correctly handle
multiprocessing.

Tests that bypass the CWD make this unfixable by the runner, which is
why they should just use CWD.

anntzer mannequin added 3.10 only security fixes OS-windows labels Feb 18, 2022

eryksun added 3.9 only security fixes 3.11 only security fixes type-bug An unexpected behavior, bug, or error labels Feb 18, 2022

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

On Windows, os.stat() can fail if called while another process is creating or deleting the file #90941

On Windows, os.stat() can fail if called while another process is creating or deleting the file #90941

anntzer mannequin commented Feb 18, 2022

anntzer mannequin commented Feb 18, 2022

eryksun commented Feb 18, 2022

itaisteinherz mannequin commented Mar 9, 2022

eryksun commented Mar 11, 2022

itaisteinherz mannequin commented Mar 13, 2022

eryksun commented Mar 13, 2022

itaisteinherz mannequin commented Mar 13, 2022

zooba commented Mar 14, 2022

eryksun commented Mar 14, 2022

zooba commented Mar 14, 2022

On Windows, os.stat() can fail if called while another process is creating or deleting the file #90941

On Windows, os.stat() can fail if called while another process is creating or deleting the file #90941

Comments

anntzer mannequin commented Feb 18, 2022

anntzer mannequin commented Feb 18, 2022

eryksun commented Feb 18, 2022

itaisteinherz mannequin commented Mar 9, 2022

eryksun commented Mar 11, 2022

itaisteinherz mannequin commented Mar 13, 2022

eryksun commented Mar 13, 2022

itaisteinherz mannequin commented Mar 13, 2022

zooba commented Mar 14, 2022

eryksun commented Mar 14, 2022

zooba commented Mar 14, 2022