Author eryksun
Recipients eryksun, mbrijun@gmail.com, paul.moore, steve.dower, tim.golden, zach.ware
Date 2020-03-28.14:57:07
Message-id <1585407427.66.0.678568192152.issue40095@roundup.psfhosted.org>
Content
> C:\Users>fsutil file queryfileid u:\test\test.jpg
> File ID is 0x00000000000029d500000000000004ae

ReFS uses a 128-bit file ID, which I gather consists of a 64-bit directory ID and a 64-bit relative ID. (Take this with a grain of salt. AFAIK, Microsoft hasn't published a spec for ReFS.) The latter is 0 for the directory itself and increments by 1 for each file created in the directory, with no reuse of previous values if a file is deleted or moved. If that's correct, and if "test.jpg" was created in "\test", then the directory ID of "\test" is 0x29d5, and the relative file ID is 0x4ae. 
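
As a rough illustration of that split, here is a minimal sketch that takes the value printed by fsutil and separates it into two 64-bit halves. The 64/64 layout is only my reading of the observed IDs, not a documented format, and the function name split_refs_file_id is made up for this example.

```python
def split_refs_file_id(file_id):
    """Split a 128-bit file ID into (directory_id, relative_id).

    Assumes the upper 64 bits are the directory ID and the lower
    64 bits are the relative ID. This layout is a guess.
    """
    mask64 = (1 << 64) - 1
    directory_id = (file_id >> 64) & mask64
    relative_id = file_id & mask64
    return directory_id, relative_id


# Value reported by "fsutil file queryfileid u:\test\test.jpg"
directory_id, relative_id = split_refs_file_id(0x00000000000029d500000000000004ae)
print(hex(directory_id), hex(relative_id))  # 0x29d5 0x4ae
```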

> >>> from pathlib import Path
> >>> hex(Path('U:/test/test.jpg').stat().st_ino)
> '0x4000000004ae29d5'

os.stat calls the Windows API function GetFileInformationByHandle, which returns only a 64-bit file ID. It appears that ReFS generates this ID by concatenating the relative ID (0x4ae) and the directory ID (0x29d5) in a way that is "not guaranteed to be unique" according to the BY_HANDLE_FILE_INFORMATION [1] docs.
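
For reference, BY_HANDLE_FILE_INFORMATION reports the ID as two 32-bit fields, nFileIndexHigh and nFileIndexLow. The sketch below shows how two such fields combine into a single 64-bit value like the st_ino above. The helper name combine_file_index is hypothetical, and this is an illustration of the arithmetic rather than a copy of what os.stat does internally.

```python
def combine_file_index(n_file_index_high, n_file_index_low):
    """Combine the two 32-bit index fields into one 64-bit file ID."""
    high = n_file_index_high & 0xFFFFFFFF
    low = n_file_index_low & 0xFFFFFFFF
    return (high << 32) | low


# Work backwards from the st_ino value reported for U:/test/test.jpg.
st_ino = 0x4000000004ae29d5
high, low = st_ino >> 32, st_ino & 0xFFFFFFFF
print(hex(high), hex(low))  # 0x40000000 0x4ae29d5
print(combine_file_index(high, low) == st_ino)  # True
```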

I haven't checked whether this 64-bit file ID can even be used successfully with OpenFileById [2]. It could be that ReFS simply fails an open-by-ID request unless it includes the full 128-bit ID (i.e. ExtendedFileIdType).

You can request the 128-bit ID as a FILE_ID_128 record (an array of 16 bytes) by calling GetFileInformationByHandleEx with the FileIdInfo information class [3][4]. Maybe os.stat should try to query the 128-bit ID and use it as st_ino (or st_ino_128) if it's available. However, looking into my crystal ball, I don't see this happening unless someone makes a strong case in its favor.
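
In the meantime, the query can be made from Python with ctypes. The following is an untested sketch, not a vetted implementation: the function name query_file_id_128 is hypothetical, it only runs on Windows, and reading the 16 identifier bytes as a little-endian integer is my assumption about how to match the value fsutil prints.

```python
import ctypes
import sys

FILE_ID_INFO_CLASS = 18  # FileIdInfo in FILE_INFO_BY_HANDLE_CLASS
FILE_READ_ATTRIBUTES = 0x0080
FILE_SHARE_ALL = 0x0007  # read | write | delete
OPEN_EXISTING = 3
FILE_FLAG_BACKUP_SEMANTICS = 0x02000000  # needed to open directories


class FILE_ID_128(ctypes.Structure):
    _fields_ = [("Identifier", ctypes.c_ubyte * 16)]


class FILE_ID_INFO(ctypes.Structure):
    _fields_ = [("VolumeSerialNumber", ctypes.c_ulonglong),
                ("FileId", FILE_ID_128)]


def query_file_id_128(path):
    """Return (volume_serial_number, file_id) for path. Windows only."""
    if sys.platform != "win32":
        raise OSError("GetFileInformationByHandleEx requires Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = ctypes.c_void_p
    kernel32.CreateFileW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p]
    kernel32.GetFileInformationByHandleEx.restype = ctypes.c_int
    kernel32.GetFileInformationByHandleEx.argtypes = [
        ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p, ctypes.c_uint32]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    handle = kernel32.CreateFileW(
        path, FILE_READ_ATTRIBUTES, FILE_SHARE_ALL, None,
        OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, None)
    if handle == ctypes.c_void_p(-1).value:  # INVALID_HANDLE_VALUE
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        info = FILE_ID_INFO()
        if not kernel32.GetFileInformationByHandleEx(
                handle, FILE_ID_INFO_CLASS,
                ctypes.byref(info), ctypes.sizeof(info)):
            raise ctypes.WinError(ctypes.get_last_error())
    finally:
        kernel32.CloseHandle(handle)
    # Assumption: the 16 bytes form a little-endian 128-bit integer.
    file_id = int.from_bytes(bytes(info.FileId.Identifier), "little")
    return info.VolumeSerialNumber, file_id
```

On a ReFS volume, something like hex(query_file_id_128(r"U:\test\test.jpg")[1]) should then be comparable with the fsutil output, provided the byte-order assumption holds.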

> The problem does *not* exist on an NTFS volume:
> 
> C:\Users>fsutil file queryfileid o:\OneDrive\test\test.jpg
> File ID is 0x0000000000000000000300000001be39

NTFS uses a 64-bit file ID, which consists of a 48-bit MFT record number and a 16-bit sequence number. The latter gets incremented when an MFT record is reused in order to detect stale references. In the above case, the 48-bit record number is 0x00000001be39, and the sequence number is 0x0003.
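
The same decomposition in code, as a small sketch based on the 48/16 split described above (the function name split_ntfs_file_id is made up for this example):

```python
def split_ntfs_file_id(file_id):
    """Split a 64-bit NTFS file ID into (record_number, sequence_number)."""
    record_number = file_id & 0xFFFFFFFFFFFF   # low 48 bits: MFT record
    sequence_number = (file_id >> 48) & 0xFFFF  # high 16 bits: reuse count
    return record_number, sequence_number


# Low 64 bits of the ID reported for o:\OneDrive\test\test.jpg
record_number, sequence_number = split_ntfs_file_id(0x000300000001be39)
print(hex(record_number), hex(sequence_number))  # 0x1be39 0x3
```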

[1]: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/ns-fileapi-by_handle_file_information
[2]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-openfilebyid
[3]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getfileinformationbyhandleex
[4]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_id_info