Author eryksun
Recipients Cezary.Wagner, eryksun, paul.moore, steve.dower, tim.golden, zach.ware
Date 2020-06-25.22:04:21
Message-id <1593122661.8.0.334087447719.issue41106@roundup.psfhosted.org>
Content
> Does it make the most sense for us to make .flush() also do an 
> implicit .fsync() (when it's actually a file object)?

Standard I/O in the Windows C runtime supports a "c" commit mode that causes fflush to call _commit() on the underlying fd [1]. Perhaps Python should support a similar "c" or "s" mode that makes a flush implicitly call fsync / _commit. 
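Until such a mode exists, the same effect can be approximated in user code. The following is only a minimal sketch: `CommitOnFlush` and `demo` are hypothetical names invented for illustration, not an existing Python API, and the wrapper assumes the wrapped object is a real file with a valid `fileno()`.

```python
import os
import tempfile


class CommitOnFlush:
    """Wrap a file object so that flush() also calls os.fsync().

    Hypothetical helper for illustration; not part of the standard library.
    """

    def __init__(self, file):
        self._file = file

    def flush(self):
        self._file.flush()             # drain Python's buffers to the OS
        os.fsync(self._file.fileno())  # ask the OS to commit the file (_commit on Windows)

    def __getattr__(self, name):
        # Delegate everything else (write, close, fileno, ...) to the wrapped file.
        return getattr(self._file, name)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self._file.close()


def demo():
    """Write one line with print(..., flush=True); return the size os.stat reports."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'spam.txt')
        with CommitOnFlush(open(path, 'w')) as f:
            print('spam', file=f, flush=True)
            return os.stat(path).st_size
```

Because `print(..., flush=True)` calls the `flush` method of its `file` argument, every flushed print through the wrapper is followed by an `os.fsync`. That costs a disk commit per flush, which is presumably why it would be an opt-in mode rather than the default.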

But you may not be in control of flushing the file if it's being written to by a third-party library or application. Calling os.[l]stat works around the problem, but only with NTFS. It doesn't help with FAT32 / exFAT.
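As a sketch of the os.stat workaround, a directory listing can ignore the stat data cached in each `os.DirEntry` and query every path again. `iter_fresh_stats` and `demo` are hypothetical names used only for this example; the approach costs one extra system call per entry and, as noted above, is only expected to help on NTFS.

```python
import os
import tempfile


def iter_fresh_stats(dir_path):
    """Yield (name, stat_result) pairs, calling os.stat on each entry's path
    instead of relying on the data cached in the directory entry."""
    with os.scandir(dir_path) as entries:
        for entry in entries:
            try:
                yield entry.name, os.stat(entry.path)
            except FileNotFoundError:
                # The entry was removed between listing and stat; skip it.
                continue


def demo():
    """List a temporary directory holding one 4-byte file."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, 'spam.txt'), 'w') as f:
            f.write('spam')
        return {name: st.st_size for name, st in iter_fresh_stats(tmp)}
```

Using `os.lstat` in place of `os.stat` would give the same kind of fresh result without following symbolic links.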

FAT filesystems update the last-write time when the file object is flushed or closed. The update depends on the FO_FILE_MODIFIED flag in the file object or the CCB_FLAG_USER_SET_LAST_WRITE flag (from SetFileTime) in the file object's context control block (CCB). But opening, or even flushing, a file doesn't synchronize the context of other opens. Thus one can call os.stat repeatedly on a file (so this is not just a scandir problem) and observe st_size changing while st_mtime remains constant:

    >>> import os
    >>> filepath = 'C:/Mount/TestFat32/test/spam.txt'
    >>> f = open(filepath, 'w')
    >>> s = os.stat(filepath); s.st_size, s.st_mtime
    (0, 1593116028.0)

    >>> print('spam', file=f, flush=True)
    >>> s = os.stat(filepath); s.st_size, s.st_mtime
    (6, 1593116028.0)

The last-write time gets updated by closing or flushing the kernel file object that was used to write to the file. 

    >>> os.fsync(f.fileno())
    >>> s = os.stat(filepath); s.st_size, s.st_mtime
    (6, 1593116044.0)

Another problem is stale entries for NTFS hard links, which can lead to getting a completely incorrect stat result via os.scandir -- wrong timestamps, wrong file size, and wrong file attributes.

An NTFS file's MFT record contains its timestamps, size, and attributes in a $STANDARD_INFORMATION attribute. This reliable information is what os.[l]stat and os.fstat query. But it gets duplicated in per-link $FILE_NAME attributes that directories index. The duplicated info for a link gets synchronized to the standard info when the link is accessed, but other links to the file do not get updated, and their values may be completely wrong. For example (using the scan function from my previous post):

    >>> filepath1 = 'C:/Mount/TestNtfs/test/spam1.txt'
    >>> filepath2 = 'C:/Mount/TestNtfs/test/spam2.txt'
    >>> f = open(filepath1, 'w')
    >>> os.link(filepath1, filepath2)
    >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
    (0, 1593116055.7695396)

    >>> print('spam', file=f, flush=True)
    >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
    (0, 1593116055.7695396)

    >>> os.fsync(f.fileno())
    >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
    (0, 1593116055.7695396)

    >>> f.close()
    >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
    (0, 1593116055.7695396)

As shown, flushing or closing the file object for the "spam1.txt" link is not reflected in the entry for the "spam2.txt" link. The directory entry for the link is only updated when the link is accessed:

    >>> f = open(filepath2)
    >>> s = scan(filepath2).stat(); s.st_size, s.st_mtime
    (6, 1593116062.2080283)
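One way to check for this condition is to compare the stat data cached in a directory entry with a direct os.stat of the same path. This is only a sketch: the `scan` function here is a hypothetical stand-in written for this example (an assumption about what such a helper might look like, not the function from the earlier post), and `entry_is_stale` and `demo` are likewise made-up names.

```python
import os
import tempfile


def scan(filepath):
    """Return the os.DirEntry for filepath from its parent directory.

    Hypothetical stand-in for a scandir-based lookup helper.
    """
    dir_path, name = os.path.split(os.path.abspath(filepath))
    with os.scandir(dir_path) as entries:
        for entry in entries:
            if os.path.normcase(entry.name) == os.path.normcase(name):
                return entry
    raise FileNotFoundError(filepath)


def entry_is_stale(filepath):
    """Return True if the directory entry's size or mtime differs from os.stat."""
    cached = scan(filepath).stat()
    fresh = os.stat(filepath)
    return ((cached.st_size, cached.st_mtime_ns)
            != (fresh.st_size, fresh.st_mtime_ns))


def demo():
    """Check a freshly written and closed file, which should not be stale."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'spam.txt')
        with open(path, 'w') as f:
            f.write('spam')
        return entry_is_stale(path)
```

In the hard-link session above, `entry_is_stale(filepath2)` would be expected to return True until the "spam2.txt" link is opened, since os.stat reads the file's standard information while the directory entry still holds the old duplicated values.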

---

[1] Linking commode.obj should enable commit mode by default, but this is broken by a bug in __acrt_stdio_parse_mode. The function initializes _stdio_mode to the global _commode value, but then clobbers that value when setting the required "r", "w", or "a" open mode.