Author eryksun
Recipients craigh, eric.fahlgren, eryksun, jamercee, steve.dower, tim.golden, zach.ware
Date 2021-03-22.09:00:48
Message-id <1616403650.02.0.940345907974.issue23407@roundup.psfhosted.org>
In-reply-to
Content
Python 3.8 introduced some behavior changes to how reparse points are supported, but generalized support for handling name-surrogate reparse points as symlinks was not implemented. Python continues to set S_IFLNK in st_mode only for IO_REPARSE_TAG_SYMLINK reparse points. This ensures that if os.path.islink() is true, the link can be read and copied exactly via os.readlink() and os.symlink(). Otherwise, islink() could be true while readlink() fails, or symlink() could be used to mistakenly copy a mountpoint as a symlink.
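
The islink() / readlink() / symlink() guarantee described above can be sketched roughly as follows. This is only an illustration, not code from CPython; the helper name copy_symlink is hypothetical.

```python
import os

def copy_symlink(src, dst):
    """Recreate the symlink at src as dst, keeping the stored target as-is.

    Hypothetical helper: it relies on the guarantee that when
    os.path.islink() is true, os.readlink() can read the link.
    """
    if not os.path.islink(src):
        raise ValueError(f"{src!r} is not a symlink")
    target = os.readlink(src)
    # target_is_directory only matters on Windows; it is ignored elsewhere.
    os.symlink(target, dst, target_is_directory=os.path.isdir(src))
    return target
```

A mountpoint (junction) would be rejected by the islink() check here rather than silently copied as a symlink.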

A mountpoint is not equivalent to a symlink in a few cases. The major difference is that mountpoints are evaluated on the server side in a remote path, targeting devices on the server, whereas symlinks are evaluated on the client side, targeting devices on the client (e.g. its "C:" drive) and are subject to the client system's L2R (local to remote), L2L, R2L, and R2R symlink policy. Replacing a mountpoint with a symlink means that, at best, the path will no longer work when accessed remotely, and at worst the client will allow resolving the target locally to something that's dangerously wrong.

Another difference is how the kernel handles mountpoints when opening a path. The target of a mountpoint does not replace the previously traversed path components in the opened path, whereas the target path of a symlink does replace the opened path. The previously traversed path matters when the kernel resolves ".." components in the target of a relative symlink. For example, a relative symlink that traverses up the tree with ".." components may have been tested on a traversed directory, which worked fine. Then later the directory was replaced with a mountpoint (junction) for compatibility, which continued to work fine. But after a CopyTree() that naively replaces the mountpoint with a symlink, the copied relative symlink is either broken, or worse, it resolves to a target that's dangerously wrong.
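
A small string-only model of that difference, using ntpath so it runs on any platform. The directory layout and the helper name are made up for illustration, and this only mimics the resolution rule described above; it does not call into the kernel.

```python
import ntpath

def resolve_relative_link(parent_as_opened, target):
    """Resolve a relative symlink target against the directory that the
    opened path contains when the link is reached (hypothetical helper)."""
    return ntpath.normpath(ntpath.join(parent_as_opened, target))

# Assumed layout: C:\app\data redirects to D:\store\data, and data\link
# is a relative symlink whose target is "..\config".
traversed = r"C:\app\data"
redirect_target = r"D:\store\data"
link_target = r"..\config"

# Junction: the traversed components are kept in the opened path.
via_junction = resolve_relative_link(traversed, link_target)

# Symlink: the opened path is replaced by the symlink's target first.
via_symlink = resolve_relative_link(redirect_target, link_target)

print(via_junction)
print(via_symlink)
```

In this model the same relative link ends up under C:\app when "data" is a junction, but under D:\store when "data" is a symlink.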

A generalization of the readlink() and symlink() combination could be implemented to copy any type of name-surrogate reparse point. If Python had something like that, then it could reasonably support any name-surrogate reparse point as a "symlink". That's not without problems, considering the behavior isn't the same and APIs and other applications may only support IO_REPARSE_TAG_SYMLINK in various cases, but sometimes perfect is the enemy of good.

That said, os.walk() can still special case mountpoints and other name-surrogate reparse points. To support cases like this, the lstat() result was extended to include the st_reparse_tag value of name-surrogate reparse points. The stat module has the IO_REPARSE_TAG_SYMLINK and IO_REPARSE_TAG_MOUNT_POINT constants. A simple function that checks for a name-surrogate reparse point could be added as well -- i.e. bool(reparse_tag & 0x20000000).
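
A minimal sketch of such a check, together with one way a walk could prune name-surrogate directories. The names is_name_surrogate and walk_skip_surrogates are hypothetical. The tag values are written as literals because the stat module defines the IO_REPARSE_TAG_* constants only on Windows, and st_reparse_tag is likewise Windows-only, so it is read with getattr().

```python
import os

# Literal values from the Windows SDK headers.
IO_REPARSE_TAG_SYMLINK = 0xA000000C
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003
NAME_SURROGATE_BIT = 0x20000000

def is_name_surrogate(reparse_tag):
    """Return True if the tag has the name-surrogate bit set."""
    return bool(reparse_tag & NAME_SURROGATE_BIT)

def walk_skip_surrogates(top):
    """Like os.walk(top), but do not descend into directories whose
    lstat() result carries a name-surrogate reparse tag."""
    for dirpath, dirnames, filenames in os.walk(top):
        kept = []
        for name in dirnames:
            st = os.lstat(os.path.join(dirpath, name))
            # Off Windows there is no st_reparse_tag; treat that as 0.
            tag = getattr(st, "st_reparse_tag", 0)
            if not is_name_surrogate(tag):
                kept.append(name)
        # Editing dirnames in place tells os.walk what to skip.
        dirnames[:] = kept
        yield dirpath, dirnames, filenames
```

Whether to skip such directories, or to report them without descending, depends on the application; this version simply drops them from the walk.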

---

Using st_reparse_tag to abstract checking the file type is awkward. I wanted to support a keyword-only parameter on Windows to expand the 'symlink' domain to include all name-surrogate reparse points. This parameter would have been added to os.[l]stat(), DirEntry.stat(), DirEntry.is_dir(), and DirEntry.is_file(), as well as os.path.islink() and DirEntry.is_symlink(). By default only IO_REPARSE_TAG_SYMLINK would have been handled as a symlink. But this idea wasn't accepted. Instead, custom checks have to be implemented whenever a problem needs the expanded 'symlink' domain.
History
Date                 User     Action  Args
2021-05-02 08:28:45  eryksun  unlink  issue23407 messages
2021-03-22 09:00:50  eryksun  set     recipients: + eryksun, tim.golden, craigh, zach.ware, steve.dower, jamercee, eric.fahlgren
2021-03-22 09:00:50  eryksun  set     messageid: <1616403650.02.0.940345907974.issue23407@roundup.psfhosted.org>
2021-03-22 09:00:50  eryksun  link    issue23407 messages
2021-03-22 09:00:48  eryksun  create