classification
Title: Support POSIX atomicity guarantee of O_APPEND on Windows
Type: enhancement
Stage: patch review
Components: IO, Windows
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, izbyshev, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Priority: normal
Keywords: patch

Created on 2020-12-09 03:12 by izbyshev, last changed 2021-01-26 23:48 by eryksun.

Files
File name Uploaded Description
test.py izbyshev, 2020-12-09 03:12
Pull Requests
URL Status Linked
PR 23712 open izbyshev, 2020-12-09 03:16
Messages (11)
msg382784 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2020-12-09 03:12
On POSIX-conforming systems, O_APPEND flag for open() must ensure that no intervening file modification occurs between changing the file offset and the write operation[1]. In effect, two processes that independently opened the same file with O_APPEND can't overwrite each other's data. On Windows, however, the Microsoft C runtime implements O_APPEND as an lseek(fd, 0, SEEK_END) followed by write(), which obviously doesn't provide the above guarantee. This affects both os.open() and the builtin open() Python functions, which rely on _wopen() from MSVCRT. A demo is attached.

While POSIX O_APPEND doesn't guarantee the absence of partial writes, the guarantee of non-overlapping writes alone is still useful in cases such as debug logging from multiple processes without file locking or other synchronization. Moreover, for local filesystems, partial writes don't really occur in practice (barring conditions such as ENOSPC or EIO).
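
The lost-write scenario described above can be reproduced deterministically on any platform by emulating the "seek to end, then write" sequence with a forced bad interleaving. This is only a sketch of the failure mode, not the attached test.py; the names `interleaved_seek_then_write` and `demo` are made up for illustration.

```python
import os
import tempfile


def interleaved_seek_then_write(path, records):
    """Emulate 'lseek(fd, 0, SEEK_END); write()' with the worst-case ordering."""
    flags = os.O_WRONLY | getattr(os, "O_BINARY", 0)
    fds = [os.open(path, flags) for _ in records]
    try:
        # Every writer computes "end of file" first ...
        for fd in fds:
            os.lseek(fd, 0, os.SEEK_END)
        # ... and only then do the writes happen, all at the same stale offset.
        for fd, record in zip(fds, records):
            os.write(fd, record)
    finally:
        for fd in fds:
            os.close(fd)
    with open(path, "rb") as f:
        return f.read()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return interleaved_seek_then_write(path, [b"AAAA\n", b"BBBB\n"])
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())  # b'BBBB\n': the first record was overwritten
```

With a real POSIX O_APPEND descriptor the offset change and the write form one step, so both records would survive regardless of scheduling.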

Windows offers two ways to achieve non-overlapping appends:

1. WriteFile()[2] with OVERLAPPED structure with Offset and OffsetHigh set to -1. This is essentially per-write O_APPEND.

2. Using a file handle with FILE_APPEND_DATA access right but without FILE_WRITE_DATA access right.
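
Option (1) might look roughly like the following ctypes sketch. It assumes a synchronous file handle obtained elsewhere (for example via msvcrt.get_osfhandle()); the flattened `OVERLAPPED` class and the helper names `make_append_overlapped` and `append_write` are illustrative and not part of the PR.

```python
import ctypes
import sys


class OVERLAPPED(ctypes.Structure):
    # Flattened layout of the Win32 OVERLAPPED structure: the Offset and
    # OffsetHigh pair normally sits in a union with a pointer member.
    _fields_ = [
        ("Internal", ctypes.c_size_t),
        ("InternalHigh", ctypes.c_size_t),
        ("Offset", ctypes.c_uint32),
        ("OffsetHigh", ctypes.c_uint32),
        ("hEvent", ctypes.c_void_p),
    ]


def make_append_overlapped():
    """Build an OVERLAPPED that asks WriteFile() to write at end of file."""
    ov = OVERLAPPED()
    ov.Offset = 0xFFFFFFFF      # both halves set to -1 (as unsigned 32-bit)
    ov.OffsetHigh = 0xFFFFFFFF
    return ov


def append_write(handle, data):
    """Append bytes to a synchronous file handle; Windows only."""
    if sys.platform != "win32":
        raise OSError("WriteFile() is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.WriteFile.argtypes = [
        ctypes.c_void_p, ctypes.c_char_p, ctypes.c_uint32,
        ctypes.POINTER(ctypes.c_uint32), ctypes.POINTER(OVERLAPPED),
    ]
    kernel32.WriteFile.restype = ctypes.c_int
    written = ctypes.c_uint32(0)
    ov = make_append_overlapped()
    if not kernel32.WriteFile(handle, data, len(data),
                              ctypes.byref(written), ctypes.byref(ov)):
        raise ctypes.WinError(ctypes.get_last_error())
    return written.value
```

Note that this bypasses the CRT write() entirely, so none of its error remapping applies.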

While (1) seems easy to add to FileIO, there are some problems:

* os.write(fd) can't use it without caller's help because it has no way to know that the fd was opened with O_APPEND (there is no fcntl() in MSVCRT)

* write() from MSVCRT (currently used by FileIO) performs some additional remapping of error codes and checks after it calls WriteFile(), so we'd have to emulate that behavior or risk breaking compatibility.

I considered going for (2) by reimplementing _wopen() via CreateFile(), which would also allow us to solve a long-standing issue of missing FILE_SHARE_DELETE on file handles, but hit several problems:

* the most serious one is rather silly: we need to honor the current umask to possibly create a read-only file, but there is no way to query it without changing it, which is not thread-safe. Well, actually, I did discover a way: _umask_s(), when called with an invalid mask, returns both EINVAL error and the current umask. But this behavior directly contradicts MSDN, which claims that _umask_s() doesn't modify its second argument on failure[3]. So I'm not willing to rely on this until Microsoft fixes their docs.

* os module exposes some MSVCRT-specific flags for use with os.open() (like O_TEMPORARY), which a reimplementation would have to support. It seems easy in most cases, but there is O_TEXT, which enables some obscure legacy behavior in MSVCRT such as removal of a trailing byte 26 (Ctrl-Z) when a file is opened with O_RDWR. More generally, it's unclear to me whether os.open() is explicitly intended to be a gateway to MSVCRT and thus support all current and future flags or is just expected to work similarly to MSVCRT in common cases.
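
To make the umask problem from the first bullet concrete: the only portable way to read the current umask is to set it and set it back, which leaves a window where other threads create files under the wrong mask. A minimal sketch (the function name `current_umask_racy` is made up):

```python
import os


def current_umask_racy():
    """Return the process umask by temporarily replacing it.

    Between the two os.umask() calls the process-wide mask is 0, so a file
    created concurrently by another thread gets unintended permissions.
    """
    old = os.umask(0)   # sets the mask to 0 and returns the previous value
    os.umask(old)       # restore the previous value
    return old


if __name__ == "__main__":
    print(oct(current_umask_racy()))
```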

So in the end I decided to let _wopen() create the initial fd as usual, but then fix it up via DuplicateHandle() -- see the PR.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html
[2] https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile
[3] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/umask-s?view=msvc-160
msg385258 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2021-01-19 13:07
Could anybody provide their thoughts on this RFE? Thanks.
msg385314 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-01-20 07:39
> os.write(fd) can't ... know that the fd was opened with O_APPEND

It's possible to query the granted access of a kernel handle via NtQueryObject: ObjectBasicInformation -- not that I'm suggesting to use it.

> the most serious one is rather silly: we need to honor the current 
> umask to possibly create a read-only file, 

It pains me to see umask get in the way of implementing open() directly via CreateFileW, which we need in order to support delete-access sharing. 

Python could implement its own umask value in Windows. os.umask() would set the C umask value as well, but only for the sake of consistency with C extensions and embedding.

> os module exposes some MSVCRT-specific flags for use with os.open() 
> (like O_TEMPORARY), which a reimplementation would have to support. 

Additionally, ucrt has an undocumented O_OBTAIN_DIR flag. It opens with backup semantics, which would be more obvious if aliased as O_BACKUP_SEMANTICS. This allows an open to take advantage of SeBackupPrivilege and SeRestorePrivilege if they're enabled, to get read or write access regardless of the file security.

Open attribute flags could also be supported, such as O_ATTR_HIDDEN and O_ATTR_SYSTEM. These are needed because a hidden or system file is required to remain as such when it's overwritten, else CreateFileW fails.

> It seems easy in most cases, but there is O_TEXT, 

Retaining O_TEXT in 3.x was probably an accident. Text mode should use a TextIOWrapper instance, and I doubt that there's a serious need to support Ctrl+Z as EOF. 

If it's important enough, msvcrt.open() and msvcrt.O_TEXT could be provided.
msg385437 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2021-01-21 17:46
> It's possible to query the granted access of a kernel handle via NtQueryObject: ObjectBasicInformation

Ah, thanks for the info. But it wouldn't help for option (1) that I had in mind because open() and os.open() currently set only msvcrt-level O_APPEND.

It probably could be used in my current PR instead of just assuming the default access rights for file handles, but I'm not sure whether it makes sense: can a new handle have non-default access rights? Or can the default change at this point of Windows history?

> Python could implement its own umask value in Windows. os.umask() would set the C umask value as well, but only for the sake of consistency with C extensions and embedding.

Would it be a long shot to ask MS to add the needed functionality to MSVCRT instead? Or at least to declare the bug with _umask_s() that I described a feature. Maybe Steve Dower could help.

> Open attribute flags could also be supported, such as O_ATTR_HIDDEN and O_ATTR_SYSTEM. These are needed because a hidden or system file is required to remain as such when it's overwritten, else CreateFileW fails.

I didn't know that, thanks. Indeed, a simple open(path, 'w') fails on a hidden file with "Permission denied".

> If it's important enough, msvcrt.open() and msvcrt.O_TEXT could be provided.

Yes, I'd be glad to move support for more obscure MSVCRT flags to msvcrt.open() -- the less MSVCRT details we leak in os.open(), the more freedom we have to implement it via proper Win32 APIs.

===

Anyway, even if the blockers for implementing open()/os.open() via CreateFile() are solved, my current PR doesn't seem to conflict with such an implementation (it could just be replaced in the future).

Currently, the only way to achieve atomic append in Python on Windows that I know is to use a custom opener that would call CreateFile() with the right arguments via ctypes/pywin32/etc.
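
A custom opener along those lines could look like the sketch below. It is an illustration under several assumptions: it ignores the `flags` argument and the umask, always uses OPEN_ALWAYS, requests only FILE_APPEND_DATA, and the name `append_only_opener` is hypothetical.

```python
import ctypes
import os
import sys

FILE_WRITE_DATA = 0x0002
FILE_APPEND_DATA = 0x0004
FILE_SHARE_READ = 0x1
FILE_SHARE_WRITE = 0x2
FILE_SHARE_DELETE = 0x4
OPEN_ALWAYS = 4
FILE_ATTRIBUTE_NORMAL = 0x80

# Append access without FILE_WRITE_DATA is what makes appends atomic.
APPEND_ONLY_ACCESS = FILE_APPEND_DATA


def append_only_opener(path, flags):
    """Opener for open(); 'flags' is ignored in this sketch. Windows only."""
    if sys.platform != "win32":
        raise OSError("CreateFileW() is only available on Windows")
    import msvcrt

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
    ]
    kernel32.CreateFileW.restype = ctypes.c_void_p
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    share = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE
    handle = kernel32.CreateFileW(os.fspath(path), APPEND_ONLY_ACCESS, share,
                                  None, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL,
                                  None)
    if handle is None or handle == ctypes.c_void_p(-1).value:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # Wrap the native handle in a CRT file descriptor.
        return msvcrt.open_osfhandle(handle, os.O_APPEND)
    except OSError:
        kernel32.CloseHandle(handle)
        raise
```

It would be used as `open("app.log", "ab", opener=append_only_opener)`.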
msg385454 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-01-21 22:23
> can a new handle have non-default access rights? Or can the 
> default change at this point of Windows history?

I don't know what you mean by default access rights.

C open() requests generic access rights, which map to the standard and file-specific rights in the File type's generic mapping. If all of the requested access rights aren't granted, then the open fails with an access denied error. For example, if FILE_WRITE_DATA isn't granted, then open() can't open for appending. A direct CreateFileW() call can remove FILE_WRITE_DATA from the desired access.

DuplicateHandle() can always request the same or less access than the source handle. For some object types, it can perform an access check to get more access, but not for a File handle.

> Currently, the only way to achieve atomic append in Python on 
> Windows that I know is to use a custom opener that would call
> CreateFile() with the right arguments via ctypes/pywin32/etc.

An opener could also work like your PR via os.open(), msvcrt.get_osfhandle(), _winapi.GetFileType(), _winapi.DuplicateHandle(), os.close(), and msvcrt.open_osfhandle().
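
A rough sketch of such an opener, using only the functions listed above plus `_winapi.GetCurrentProcess()` and `_winapi.CloseHandle()`. It only handles the write-only append case, its error handling is deliberately minimal, and the access mask mirrors what is described for the PR rather than being copied from it. The names `append_only_access` and `dup_opener` are hypothetical.

```python
import os
import sys

FILE_WRITE_DATA = 0x0002
FILE_READ_ATTRIBUTES = 0x0080
FILE_GENERIC_WRITE = 0x120116
FILE_TYPE_DISK = 0x0001


def append_only_access():
    """Generic write access plus read-attributes, minus FILE_WRITE_DATA."""
    return (FILE_GENERIC_WRITE | FILE_READ_ATTRIBUTES) & ~FILE_WRITE_DATA


def dup_opener(path, flags):
    """Open via os.open(), then swap in a handle without FILE_WRITE_DATA."""
    if sys.platform != "win32":
        raise OSError("this opener only makes sense on Windows")
    import msvcrt
    import _winapi

    fd = os.open(path, flags)
    handle = msvcrt.get_osfhandle(fd)
    write_only = (flags & (os.O_WRONLY | os.O_RDWR)) == os.O_WRONLY
    if not (flags & os.O_APPEND and write_only):
        return fd  # nothing to fix up
    if _winapi.GetFileType(handle) != FILE_TYPE_DISK:
        return fd  # pipes, consoles, etc. are left alone

    process = _winapi.GetCurrentProcess()
    try:
        new_handle = _winapi.DuplicateHandle(
            process, handle, process, append_only_access(), False, 0)
        try:
            new_fd = msvcrt.open_osfhandle(new_handle, os.O_APPEND)
        except OSError:
            _winapi.CloseHandle(new_handle)
            raise
    finally:
        # The original descriptor is closed whether or not the swap worked.
        os.close(fd)
    return new_fd
```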
msg385456 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2021-01-21 22:56
> I don't know what you mean by default access rights.

I meant the access rights of the handle created by _wopen(). In my PR I basically assume that _wopen() uses GENERIC_READ/GENERIC_WRITE access rights, but _wopen() doesn't have a contractual obligation to do exactly that AFAIU. For example, if it got some extra access rights, then my code would "drop" them while switching FILE_WRITE_DATA off.

> For example, if FILE_WRITE_DATA isn't granted, then open() can't open for appending. A direct CreateFileW() call can remove FILE_WRITE_DATA from the desired access.

Indeed, I haven't thought about it. Are you aware of a common scenario when a regular file allows appending but not writing?

But, at least, this is not a regression: currently open()/os.open() can't open such files for appending either.

> An opener could also work like your PR via os.open(), msvcrt.get_osfhandle(), _winapi.GetFileType(), _winapi.DuplicateHandle(), os.close(), and msvcrt.open_osfhandle().

True, but it still falls into "etc" category of "ctypes/pywin32/etc" :) Certainly doable, but it seems better to have consistent O_APPEND behavior across platforms out-of-the-box.
msg385477 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-01-22 04:53
FYI, here are the access rights applicable to files, including their membership in generic (R)ead, (W)rite, and e(X)ecute access:

    0x0100_0000 --- ACCESS_SYSTEM_SECURITY
    0x0010_0000 RWX SYNCHRONIZE
    0x0008_0000 --- WRITE_OWNER
    0x0004_0000 --- WRITE_DAC
    0x0002_0000 RWX READ_CONTROL
    0x0001_0000 --- DELETE

    0x0000_0001 R-- FILE_READ_DATA
    0x0000_0002 -W- FILE_WRITE_DATA
    0x0000_0004 -W- FILE_APPEND_DATA
    0x0000_0008 R-- FILE_READ_EA
    0x0000_0010 -W- FILE_WRITE_EA
    0x0000_0020 --X FILE_EXECUTE
    0x0000_0040 --- FILE_DELETE_CHILD
    0x0000_0080 R-X FILE_READ_ATTRIBUTES
    0x0000_0100 -W- FILE_WRITE_ATTRIBUTES
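
As a cross-check, the generic masks can be derived from the table by OR-ing the rights in each column. The dictionary and helper below are only an illustration; `FILE_RIGHTS` and `generic_mask` are made-up names.

```python
# (value, membership in generic Read/Write/eXecute), copied from the table.
FILE_RIGHTS = {
    "ACCESS_SYSTEM_SECURITY": (0x0100_0000, ""),
    "SYNCHRONIZE": (0x0010_0000, "RWX"),
    "WRITE_OWNER": (0x0008_0000, ""),
    "WRITE_DAC": (0x0004_0000, ""),
    "READ_CONTROL": (0x0002_0000, "RWX"),
    "DELETE": (0x0001_0000, ""),
    "FILE_READ_DATA": (0x0000_0001, "R"),
    "FILE_WRITE_DATA": (0x0000_0002, "W"),
    "FILE_APPEND_DATA": (0x0000_0004, "W"),
    "FILE_READ_EA": (0x0000_0008, "R"),
    "FILE_WRITE_EA": (0x0000_0010, "W"),
    "FILE_EXECUTE": (0x0000_0020, "X"),
    "FILE_DELETE_CHILD": (0x0000_0040, ""),
    "FILE_READ_ATTRIBUTES": (0x0000_0080, "RX"),
    "FILE_WRITE_ATTRIBUTES": (0x0000_0100, "W"),
}


def generic_mask(kind):
    """OR together every right that belongs to generic 'R', 'W' or 'X'."""
    mask = 0
    for value, membership in FILE_RIGHTS.values():
        if kind in membership:
            mask |= value
    return mask


if __name__ == "__main__":
    for kind in "RWX":
        print(kind, hex(generic_mask(kind)))
    # Generic write without FILE_WRITE_DATA, i.e. an append-only handle:
    print(hex(generic_mask("W") & ~FILE_RIGHTS["FILE_WRITE_DATA"][0]))
```

The three results correspond to FILE_GENERIC_READ (0x120089), FILE_GENERIC_WRITE (0x120116) and FILE_GENERIC_EXECUTE (0x1200a0).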

> _wopen() uses GENERIC_READ/GENERIC_WRITE access rights, but 
> _wopen() doesn't have a contractual obligation to do exactly 
> that AFAIU. For example, if it got some extra access rights, 
> then my code would "drop" them while switching FILE_WRITE_DATA off.

I overlooked a case that's a complication. For O_TEMPORARY, the open uses FILE_FLAG_DELETE_ON_CLOSE; adds DELETE to the requested access; and adds FILE_SHARE_DELETE to the share mode. 

With delete-on-close, a file gets marked as deleted as soon as the last handle for the kernel File is closed. This is the classic Windows 'deleted' state in which the filename remains linked in the directory but inaccessible for any new access (as opposed to the immediate POSIX style delete that DeleteFileW attempts in Windows 10). Existing opens (i.e. kernel File objects from CreateFileW) are still valid, and, if they have delete access, they can even be used to undelete the file via SetFileInformationByHandle: FileDispositionInfo. 

_Py_wopen_noraise() can easily keep the required DELETE access. The complication is that you have to be careful not to close the original file descriptor until after you've successfully created the duplicate file descriptor. If it fails, you have to return the original file descriptor from _wopen(). If you close all handles for the kernel File and fail the call, the side effect of deleting the file is unacceptable. 

The C runtime itself isn't careful about using O_TEMPORARY in text mode, given how it closes the file if truncation fails or has to open the file twice in O_WRONLY mode in order to read the BOM. But at least it's reliable in binary mode. The file descriptor is pre-allocated with _alloc_osfhnd(), so it won't fail after CreateFileW() is called.

> Are you aware of a common scenario when a regular file allows 
> appending but not writing?

It's certainly not common for regular files. (It's more common for directories, for which append access corresponds to the right to create a subdirectory.) Maybe users are only allowed to append to a given log file. But I think most scenarios will be accidental. 

For example, say an admin wants to deny write access to standard users. Denying simple or generic write access is wrong, since doing so also denies rights required for reading, i.e. synchronize and read-control. So the admin runs the following command to deny only write-data access: `icacls filename /deny *BU:(WD)`. This forgets about append-data (AD) access. It's still sufficient for most applications, which request generic-write access.
msg385506 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2021-01-22 18:51
> FYI, here are the access rights applicable to files

Thanks, I checked that mapping in headers when I was writing _Py_wopen_noraise() as well. But I've found a catch via ProcessHacker: CreateFile() with GENERIC_WRITE (or FILE_GENERIC_WRITE) additionally grants FILE_READ_ATTRIBUTES for some reason. This is why I add FILE_READ_ATTRIBUTES to FILE_GENERIC_WRITE for DuplicateHandle() -- otherwise, the duplicated handle didn't have the same rights in my testing. So basically I have to deal with (a) _wopen() not guaranteeing contractually any specific access rights and (b) CreateFile() not respecting the specified access rights exactly. This is where my distrust and my original question about "default" access rights come from.

> I overlooked a case that's a complication. For O_TEMPORARY, the open uses FILE_FLAG_DELETE_ON_CLOSE; adds DELETE to the requested access; and adds FILE_SHARE_DELETE to the share mode. 
> _Py_wopen_noraise() can easily keep the required DELETE access. The complication is that you have to be careful not to close the original file descriptor until after you've successfully created the duplicate file descriptor. If it fails, you have to return the original file descriptor from _wopen(). If you close all handles for the kernel File and fail the call, the side effect of deleting the file is unacceptable. 

Good catch! But now I realize that the problem with undoing the side effect applies to O_CREAT and O_TRUNC too: we can create and/or truncate the file, but then fail. Even if we could assume that DuplicateHandle() can't fail, _open_osfhandle() can still fail with EMFILE. And since there is no public function in MSVCRT to replace an underlying handle of an existing fd, we can't preallocate an fd to avoid this. There would be no problem if we could just reduce access rights of an existing handle, but it seems there is no such API.

I don't like the idea of silently dropping the atomic append guarantee in case of a failure, so I'm not sure how to proceed with the current approach.

Moreover, the same issue would apply even in case of direct implementation of os.open()/open() via CreateFile() because we still need to wrap the handle with an fd, and this may fail with EMFILE. For O_CREAT/O_TRUNC, it seems like it could be reasonably avoided:

* Truncation can simply be deferred until we have the fd and then performed manually.

* To undo the file creation, we could use GetLastError() to learn whether CreateFile() with OPEN_ALWAYS actually created the file, and then delete it on failure (still non-atomic, and deletion can fail, but probably better than nothing).
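
The deferred-truncation idea from the first bullet, expressed at the os.open() level rather than with CreateFile(), might look like this. It is a sketch only: it does not cover undoing file creation, and the names `open_with_deferred_truncation` and `demo` are hypothetical.

```python
import os
import tempfile


def open_with_deferred_truncation(path, flags, mode=0o666):
    """Open without O_TRUNC, then truncate once the fd is known to exist."""
    want_trunc = bool(flags & os.O_TRUNC)
    fd = os.open(path, flags & ~os.O_TRUNC, mode)
    if want_trunc:
        try:
            os.ftruncate(fd, 0)
        except OSError:
            os.close(fd)
            raise
    return fd


def demo():
    """Return the file size seen right after a deferred-truncation open."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"old contents")
        os.close(fd)
        fd = open_with_deferred_truncation(path, os.O_WRONLY | os.O_TRUNC)
        try:
            return os.fstat(fd).st_size
        finally:
            os.close(fd)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())  # 0
```

If obtaining the descriptor fails, the existing data is untouched, which is the point of deferring the truncation.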

But I still don't know how to deal with O_TEMPORARY, unless there is a way to unset FILE_FLAG_DELETE_ON_CLOSE on a handle.

As an aside, it's also very surprising to me that O_TEMPORARY is allowed for existing files at all. If not for that, there would be no issue on failure apart from non-atomicity.

Maybe I should forgo the idea of supporting O_APPEND for os.open(), and instead just support it in FileIO via WriteFile()-with-OVERLAPPED approach...
msg385523 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-01-23 04:47
> I've found a catch via ProcessHacker: CreateFile() with 
> GENERIC_WRITE (or FILE_GENERIC_WRITE) additionally grants 
> FILE_READ_ATTRIBUTES for some reason. 

CreateFileW always requests at least SYNCHRONIZE and FILE_READ_ATTRIBUTES access.

The I/O manager requires synchronize access if a file is opened in synchronous mode. CreateFileW goes a step further. It always requests synchronize access, even with asynchronous mode (overlapped). The File object gets signaled when an I/O request completes, but it's not very useful in the context of overlapping requests.

Requesting read-attributes access supports API functions that query certain file information. Here are some of the more common queries that require read-attributes access:

    FileBasicInformation (GetFileInformationByHandleEx, GetFileTime)
    FileAllInformation (GetFileInformationByHandle)
    FileAttributeTagInformation (GetFileInformationByHandleEx)

Thus os.fstat(fd) can succeed even if the file is opened in O_WRONLY mode.

CreateFileW also implicitly requests DELETE access if FILE_FLAG_DELETE_ON_CLOSE is used, instead of letting the call fail with an invalid-parameter error if delete access isn't requested. This behavior isn't documented.

> undoing the side effect applies to O_CREAT and O_TRUNC too: we can create and/or 
> truncate the file, but then fail. 

I think truncation via TRUNCATE_EXISTING (O_TRUNC, with O_WRONLY or O_RDWR) or overwriting with CREATE_ALWAYS (O_CREAT | O_TRUNC) is at least tolerable because the caller doesn't care about the existing data. When overwriting, the caller also wants to remove any alternate data streams and extended attributes in the file. Nothing important is lost. Also, since both cases retain the original file's security descriptor, at least failure after truncation or overwriting isn't a security hole.

Unless we require CREATE_NEW (O_CREAT | O_EXCL) whenever O_TEMPORARY is used (i.e. as the tempfile module uses it), there is a potential for an existing file to be deleted if all handles are closed on failure, as discussed previously. This is unacceptable not only because of potential unrecoverable data loss, but also because the security descriptor is lost. 

With OPEN_ALWAYS (O_CREAT), CREATE_ALWAYS or CREATE_NEW, there's the chance of leaving behind a new empty file or alternate data stream on failure, which is a problem, but at least nothing is lost.

> _open_osfhandle() can still fail with EMFILE. 

The CRT supports 8192 open file descriptors (128 arrays of 64 fds), so failing with EMFILE should be rare, in extreme cases. There's also a remote possibility of memory corruption that causes __acrt_lowio_set_os_handle() to fail with EBADF because the fd value is negative, or its handle value isn't the default INVALID_HANDLE_VALUE, or the CRT _nhandle count is corrupt. These aren't practical concerns, just as DuplicateHandle() failing isn't a practical concern, but failure should be handled conservatively.

> the same issue would apply even in case of direct implementation of 
> os.open()/open() via CreateFile() 

Migrating to CreateFileW() might need to be shelved until Python uses native OS File handles instead of CRT file descriptors. The remaining reliance on the CRT low I/O layer ties our hands for now.

> Truncation can simply be deferred until we have the fd and then performed manually.

What if it fails after overwriting an existing file? Manually overwriting only after getting the new fd is complicated. To match CREATE_ALWAYS (O_CREAT | O_TRUNC), before overwriting it would have to query the existing file attributes and fail the call if FILE_ATTRIBUTE_HIDDEN or FILE_ATTRIBUTE_SYSTEM is set. If the file itself has to be overwritten (i.e. the default, anonymous data stream), as opposed to a named data stream, it would have to delete all named data streams and extended attributes in the file. Normally that's all implemented atomically in the filesystem. 

In contrast, TRUNCATE_EXISTING (O_TRUNC) is simple to emulate, since CreateFileW implements it non-atomically with a subsequent NtSetInformationFile: FileAllocationInformation system call. 

> But I still don't know how to deal with O_TEMPORARY, unless there is a 
> way to unset FILE_FLAG_DELETE_ON_CLOSE on a handle.

For now, that's possible with NTFS and the Windows API in all supported versions of Windows by using a second kernel File with DELETE access, which is opened before the last handle to the first kernel File is closed. After you close the first open, use the second one to call SetFileInformationByHandle: FileDispositionInfo to undelete the file. That said, if NTFS changes the default for delete-on-close to use a POSIX-style delete (immediate unlink), it won't be possible to 'undelete' the file.

Windows 10 supports additional flags with FileDispositionInfoEx (21), or NTAPI FileDispositionInformationEx [1]. This provides a better way to disable or modify the delete-on-close state per kernel File object, if the filesystem supports it. If FILE_DISPOSITION_ON_CLOSE (8) is set with FILE_DISPOSITION_DO_NOT_DELETE (0), the on-close disposition will be disabled. It is not possible, as far as I know, to enable it again. For example:

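    >>> import ctypes, msvcrt, os
    >>> kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
    >>> fd = os.open('spam.txt', os.O_TEMPORARY|os.O_CREAT)
    >>> h = msvcrt.get_osfhandle(fd)
    >>> info = ctypes.c_ulong(8)  # FILE_DISPOSITION_ON_CLOSE, without the delete bit
    >>> kernel32.SetFileInformationByHandle(h, 21, ctypes.byref(info), ctypes.sizeof(info))
    1
    >>> os.close(fd)
    >>> os.path.exists('spam.txt')
    True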

If FILE_DISPOSITION_ON_CLOSE is set with FILE_DISPOSITION_DELETE (1) and FILE_DISPOSITION_POSIX_SEMANTICS (2), the delete-on-close behavior is changed to use POSIX semantics, which immediately unlinks the file even if there are existing opens. For example:

    >>> fd = os.open('spam.txt', os.O_TEMPORARY|os.O_CREAT)
    >>> h = msvcrt.get_osfhandle(fd)
    >>> info = ctypes.c_ulong(8|2|1)
    >>> kernel32.SetFileInformationByHandle(h, 21, ctypes.byref(info), ctypes.sizeof(info))
    1

Add a second open:

    >>> fd2 = os.open('spam.txt', os.O_TEMPORARY)

Normally the second open would keep the file linked in the directory after it's 'deleted', but not with POSIX semantics:

    >>> os.close(fd)
    >>> os.path.exists('spam.txt')
    False
    >>> 'spam.txt' in os.listdir('.')
    False

---

[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/ns-ntddk-_file_disposition_information_ex
msg385706 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2021-01-26 11:20
> I think truncation via TRUNCATE_EXISTING (O_TRUNC, with O_WRONLY or O_RDWR) or overwriting with CREATE_ALWAYS (O_CREAT | O_TRUNC) is at least tolerable because the caller doesn't care about the existing data. 

Yes, I had a thought that creating or truncating a file when asked is in some sense "less wrong" than deleting an existing file on open() failure, but I'm still not comfortable with it. It would be nice, for example, if a failed open() with O_CREAT | O_EXCL never left a newly created file behind.

>> Truncation can simply be deferred until we have the fd and then performed manually.

> If the file itself has to be overwritten (i.e. the default, anonymous data stream), as opposed to a named data stream, it would have to delete all named data streams and extended attributes in the file. Normally that's all implemented atomically in the filesystem. 

> In contrast, TRUNCATE_EXISTING (O_TRUNC) is simple to emulate, since CreateFileW implements it non-atomically with a subsequent NtSetInformationFile: FileAllocationInformation system call. 

Oh. So CREATE_ALWAYS for an existing file has very different semantics than TRUNCATE_EXISTING, which means we can't easily use OPEN_ALWAYS with a deferred manual truncation, I see.

>> But I still don't know how to deal with O_TEMPORARY, unless there is a 
>> way to unset FILE_FLAG_DELETE_ON_CLOSE on a handle.

> For now, that's possible with NTFS and the Windows API in all supported versions of Windows by using a second kernel File with DELETE access, which is opened before the last handle to the first kernel File is closed. After you close the first open, use the second one to call SetFileInformationByHandle: FileDispositionInfo to undelete the file.

So the idea is to delete the file for a brief period, but then undelete it. As I understand it, any attempt to open the file while it's in the deleted state (but still has a directory entry) will fail. This is probably not critical since it could happen only on an unlikely failure of _open_osfhandle(), but is still less than perfect.

> Windows 10 supports additional flags with FileDispositionInfoEx (21), or NTAPI FileDispositionInformationEx [1]. This provides a better way to disable or modify the delete-on-close state per kernel File object, if the filesystem supports it.

This is nice and would reduce our non-atomicity to just the following: if _wopen() would have failed even before CreateFile() (e.g. due to EMFILE), our reimplementation could still create a handle with FILE_FLAG_DELETE_ON_CLOSE, so if our process is terminated before we unset it, we'll still lose the file. But it seems like the best we can get in any situation when we need to wrap a handle with an fd.

===

Regarding using the WriteFile()-with-OVERLAPPED approach in FileIO, I've looked at edge cases of error remapping in MSVCRT write() again. If the underlying WriteFile() fails, it only remaps ERROR_ACCESS_DENIED to EBADF, presumably to deal with write() on an O_RDONLY fd. But if WriteFile() writes zero bytes and does not fail, it gets more interesting:

* If the file type is FILE_TYPE_CHAR and the first byte to write was 26 (Ctrl-Z), it's treated as success. I don't think I understand: it *seems* like it's handling something like writing to the *input* of a console, but I'm not sure it's even possible in this manner.

* Anything else is a failure with ENOSPC. This is mysterious too.

I've checked how java.io.FileOutputStream deals with WriteFile() succeeding with zero size (just to compare with another language runtime) and haven't found any special handling[1]. Moreover, it seems like FileOutputStream.write(bytes[]) will enter an infinite loop if WriteFile() always returns 0 for a non-empty byte array[2]. So it's unclear to me what MSVCRT is up to here and whether FileIO should do the same.

[1] https://github.com/openjdk/jdk/blob/b4ace3e9799dab5970d84094a0fd1b2d64c7f923/src/java.base/windows/native/libjava/io_util_md.c#L522
[2] https://github.com/openjdk/jdk/blob/b4ace3e9799dab5970d84094a0fd1b2d64c7f923/src/java.base/share/native/libjava/io_util.c#L195
msg385740 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-01-26 23:48
> So the idea is to delete the file for a brief period, but then 
> undelete it. 

Currently NTFS defaults to using classic delete semantics for delete-on-close. However, it supports POSIX delete semantics for delete-on-close, so the default could change in a future release. When delete-on-close uses POSIX semantics, closing the last handle to the kernel File immediately unlinks the file. The upside is that a filesystem that uses POSIX delete semantics for delete-on-close should support SetFileInformationByHandle: FileDispositionInfoEx, which allows directly removing the delete-on-close flag.

That said, I can't stomach having to manually overwrite, truncate, delete and undelete files in order to avoid side effects if _open_osfhandle() fails. IMO, if opening a file has serious side effects, then we need a public API to allocate the fd beforehand. Or the idea needs to be put on hold until Python divorces itself from the C runtime's low I/O layer.

> If the file type is FILE_TYPE_CHAR and the first byte to write 
> was 26 (Ctrl-Z), it's treated as success. I don't think I 
> understand: it *seems* like it's handling something like writing 
> to the *input* of a console, but I'm not sure it's even possible 
> in this manner.

Writing to the console input buffer is possible, but not supported with WriteFile() or WriteConsoleW(). It requires WriteConsoleInputW(), which writes low-level input records.

ReadFile() on a console input handle is special cased to return that 0 bytes were read if the string read from the console starts with Ctrl+Z. But I don't know of a device that special cases WriteFile() like this. The documentation of _write() says "[w]hen writing to a device, _write treats a CTRL+Z character in the buffer as an output terminator". Whatever this meant in the past in OS/2 or DOS, I doubt that it's meaningful nowadays.

> Anything else is a failure with ENOSPC. This is mysterious too.

Possibly someone picked an error code that was good enough. Maybe it was selected for the case of a full pipe that's in non-blocking mode. The named pipe device doesn't fail an NtWriteFile() system call for a non-blocking pipe when there isn't enough space. It simply succeeds with 0 bytes written. For example:

    >>> import msvcrt, os, win32pipe  # win32pipe is part of pywin32
    >>> fdr, fdw = os.pipe()
    >>> hw = msvcrt.get_osfhandle(fdw)
    >>> win32pipe.SetNamedPipeHandleState(hw, 1, None, None)  # 1 = PIPE_NOWAIT
    >>> os.write(fdw, b'a'*4096)
    4096
    >>> os.write(fdw, b'a')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OSError: [Errno 28] No space left on device

A POSIX developer would expect this case to fail with EAGAIN and raise BlockingIOError. But support for non-blocking mode is poorly implemented by pipes in Windows. Developers are encouraged to use asynchronous I/O instead.
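
For comparison, this is the behavior a POSIX developer would expect, sketched for a POSIX system only (os.set_blocking() on pipes is assumed to be available, which is not the case on Windows for the Python version discussed here). The function name `fill_nonblocking_pipe` is made up.

```python
import os


def fill_nonblocking_pipe(chunk=b"a" * 4096):
    """Write to a non-blocking pipe until it is full.

    Returns the number of bytes accepted before the kernel reported EAGAIN,
    which Python surfaces as BlockingIOError.
    """
    fdr, fdw = os.pipe()
    try:
        os.set_blocking(fdw, False)
        total = 0
        while True:
            try:
                total += os.write(fdw, chunk)
            except BlockingIOError:
                return total
    finally:
        os.close(fdr)
        os.close(fdw)


if __name__ == "__main__":
    print(fill_nonblocking_pipe())  # pipe capacity; platform dependent
```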
History
Date User Action Args
2021-01-26 23:48:46 eryksun set messages: + msg385740
2021-01-26 11:20:53 izbyshev set messages: + msg385706
2021-01-23 04:47:50 eryksun set messages: + msg385523
2021-01-22 18:51:13 izbyshev set messages: + msg385506
2021-01-22 04:53:11 eryksun set messages: + msg385477
2021-01-21 22:56:20 izbyshev set messages: + msg385456
2021-01-21 22:23:35 eryksun set messages: + msg385454
2021-01-21 17:46:40 izbyshev set messages: + msg385437
2021-01-20 07:39:31 eryksun set messages: + msg385314
2021-01-19 13:07:49 izbyshev set messages: + msg385258
2020-12-09 03:16:13 izbyshev set keywords: + patch; stage: patch review; pull_requests: + pull_request22575
2020-12-09 03:12:46 izbyshev create