Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readlink on Windows cannot read app exec links #82015

Closed
zooba opened this issue Aug 12, 2019 · 56 comments
Closed

readlink on Windows cannot read app exec links #82015

zooba opened this issue Aug 12, 2019 · 56 comments
Assignees
Labels
3.8 only security fixes 3.9 only security fixes OS-windows

Comments

@zooba
Copy link
Member

zooba commented Aug 12, 2019

BPO 37834
Nosy @pfmoore, @tjguk, @ned-deily, @ambv, @zware, @eryksun, @zooba, @miss-islington
PRs
  • bpo-37834: Normalise handling of reparse points on Windows #15231
  • [3.8] bpo-37834: Normalise handling of reparse points on Windows (GH-15231) #15370
  • bpo-37834: Fix test on Windows 7 #15377
  • [3.8] bpo-37834: Fix test on Windows 7 (GH-15377) #15378
  • bpo-37834: Prevent shutil.rmtree exception #15602
  • [3.8] bpo-37834: Prevent shutil.rmtree exception (GH-15602) #15603
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/zooba'
    closed_at = <Date 2019-08-30.09:05:01.404>
    created_at = <Date 2019-08-12.18:49:21.984>
    labels = ['3.8', '3.9', 'OS-windows']
    title = 'readlink on Windows cannot read app exec links'
    updated_at = <Date 2019-08-30.09:05:01.403>
    user = 'https://github.com/zooba'

    bugs.python.org fields:

    activity = <Date 2019-08-30.09:05:01.403>
    actor = 'lukasz.langa'
    assignee = 'steve.dower'
    closed = True
    closed_date = <Date 2019-08-30.09:05:01.404>
    closer = 'lukasz.langa'
    components = ['Windows']
    creation = <Date 2019-08-12.18:49:21.984>
    creator = 'steve.dower'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 37834
    keywords = ['patch']
    message_count = 56.0
    messages = ['349486', '349499', '349501', '349502', '349504', '349541', '349591', '349597', '349601', '349611', '349617', '349618', '349622', '349626', '349747', '349749', '349758', '349770', '349779', '349786', '349808', '349812', '349813', '349818', '349827', '349829', '349831', '349834', '349835', '349839', '349841', '349875', '349876', '349888', '349889', '349890', '349969', '349970', '349972', '349975', '349981', '350013', '350106', '350121', '350125', '350141', '350144', '350146', '350148', '350814', '350815', '350818', '350824', '350825', '350831', '350834']
    nosy_count = 8.0
    nosy_names = ['paul.moore', 'tim.golden', 'ned.deily', 'lukasz.langa', 'zach.ware', 'eryksun', 'steve.dower', 'miss-islington']
    pr_nums = ['15231', '15370', '15377', '15378', '15602', '15603']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue37834'
    versions = ['Python 3.8', 'Python 3.9']

    @zooba
    Copy link
    Member Author

    zooba commented Aug 12, 2019

    The IO_REPARSE_TAG_APPEXECLINK was introduced for aliases to installed UWP apps that require activation before they can be executed.

    Currently these are in an unusual place as far as Python support goes - stat() fails where lstat() succeeds, but the lstat() result doesn't have the right flags to make is_link() return True.

    It's not clear whether we *should* treat these as regular symlinks, given there are a number of practical differences, but since we can enable it I'm going to post a PR anyway.

    @zooba zooba added 3.8 only security fixes 3.9 only security fixes OS-windows labels Aug 12, 2019
    @zooba
    Copy link
    Member Author

    zooba commented Aug 12, 2019

    Added an update with a new stat field (st_reparse_tag) so that we can tell them apart, which also means I can add a test for readlink.

    @zooba zooba self-assigned this Aug 12, 2019
    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 12, 2019

    Symlinks are specially interpreted by the file API, I/O manager, and network redirector. So I think they should remain a separate category. readlink() and is_link() should be reserved for POSIX symlinks, i.e. only IO_REPARSE_TAG_SYMLINK.

    These app-exec reparse points are not name surrogates (i.e. 0x2000_0000 set in the tag), and they lack an associated filter driver that handles them in normal file operations. I/O syscalls that try to reparse the path fail with STATUS_IO_REPARSE_TAG_NOT_HANDLED.

    It's also messy to conflate different things in general for the sake of a particular case. For example, if code sees the app-exec link as a symlink because ntpath.is_link() is true, it may try to copy it via os.readlink() and os.symlink(). This creates a different type of reparse point, if the user even has the symlink privilege, which may not behave the same as the original app-exec link in a CreateProcessW call. (I don't know what the end goal is for the Windows team in terms of allowing app executables under "%ProgramFiles%\WindowsApps" to be executed directly.)

    How about adding a separate function such as nt.read_reparse_point() that's able to read reparse points that are documented (most types are opaque) and can be interpreted as a string, or a tuple of strings? The internal implementation could be shared with os.readlink.

    The current behavior of follow_symlinks in stat() is problematic. It should be limited to symlinks. So another idea that I think could prove generally useful is to extend stat() and lstat() with a follow_reparse_points=True parameter. If false, it would imply follow_symlinks is false. Explicitly passing follow_symlinks=True with follow_reparse_points=False would be an error. With follow_symlinks=False and follow_reparse_points=True, it would only open a symlink and follow all other types of reparse points. The stat result would gain an st_reparse_tag field (already added by your PR), which would be non-zero whenever a reparse point is opened.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 12, 2019

    How about adding a separate function such as nt.read_reparse_point()

    If we support reading junctions, this should be using the substitute name (with \??\ replaced by \\?\) instead of the print name. For both symlinks and junctions, [MS-FSCC] 2.1.2 states that the print name SHOULD be informative, not that it MUST be, where SHOULD and MUST are defined by RFC2119. The print name can be non-informative for some reason (e.g. maybe to use the whole 16 KiB buffer for the target path.) For symlinks that's not normally observed, since CreateSymbolicLinkW is conservative. For junctions, applications and frameworks use a low-level IOCTL to set them, and they're not necessarily bothering to set a print name. For example, PowerShell omits it:

    PS C:\Mount> new-item Spam -type junction -value C:\Temp\Spam | out-null
    PS C:\Mount> fsutil.exe reparsepoint query spam
    Reparse Tag Value : 0xa0000003
    Tag value: Microsoft
    Tag value: Name Surrogate
    Tag value: Mount Point
    Substitue Name offset: 0
    Substitue Name length: 32
    Print Name offset:     34
    Print Name Length:     0
    Substitute Name:       \??\C:\Temp\Spam
    

    This mount point works fine despite the lack of a print name.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 12, 2019

    Thanks Eryk for your valuable response :)

    readlink() and is_link() should be reserved for POSIX symlinks, i.e. only IO_REPARSE_TAG_SYMLINK.

    I'm okay with that reasoning. Honestly, the only real problem I've seen is in applications that use a reimplementation of spawn() rather than the UCRT's version (which handles the reparse point jsut fine).

    they lack an associated filter driver that handles them in normal file operations

    Also true, and likely to be a cause of more problems. But not ours to fix :)

    How about adding a separate function such as nt.read_reparse_point() that's able to read reparse points [...]? The internal implementation could be shared with os.readlink.

    Maybe it can just return bytes and let the caller do the parsing?

    The current behavior of follow_symlinks in stat() is problematic. It should be limited to symlinks.

    This sounds like a good option to me, too. So that would suggest that Modules/posixmodule.c in win32_xstat_impl would verify both FILE_ATTRIBUTE_REPARSE_POINT and IO_REPARSE_TAG_SYMLINK? And if the tag is different it'll return information about the reparse point rather than the target?

    The stat result would gain an st_reparse_tag field (already added by your PR), which would be non-zero whenever a reparse point is opened.

    I agree this can stay. We don't need the rest of the logic here - callers who care to follow non-symlink reparse points can use the new nt.read_reparse_point() function to do it.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 13, 2019

    Honestly, the only real problem I've seen is in applications that use
    a reimplementation of spawn() rather than the UCRT's version (which
    handles the reparse point jsut fine).

    I looked into this spawn problem. It's due to Cygwin's spawnve, which calls NtOpenFile to open the file, and then memory-maps it and reads the image header [1]. For an app-exec link, this fails with STATUS_IO_REPARSE_TAG_NOT_HANDLED, which is translated to Windows ERROR_CANT_ACCESS_FILE (1920). In turn, Cygwin translates this error to its to default errno value, EACCES. (The Windows CRT uses EINVAL as the default.)

    [1] https://github.com/cygwin/cygwin/blob/aa55d22cb55d67d7f77ee9d58f9016c42c3ee495/winsup/cygwin/spawn.cc#L1035

    Maybe it can just return bytes and let the caller do the parsing?

    It could parse out as much as possible and return a struct sequence:

    ReparseTag (int)
    ReparseGuid (string, or None for Microsoft tags)
    SymlinkFlags (int, SYMLINK_FLAG_RELATIVE) 
    PrintName (string or None)
    SubstituteName (string or None)
    DataBuffer (bytes)
    

    Only the ReparseTag and DataBuffer fields would always be present. ReparseGuid would be present for non-Microsoft tags (i.e. bit 31 is unset). The result could maybe have one or more fields for the app-exec-link type.

    This sounds like a good option to me, too. So that would suggest that
    Modules/posixmodule.c in win32_xstat_impl would verify both
    FILE_ATTRIBUTE_REPARSE_POINT and IO_REPARSE_TAG_SYMLINK? And if the
    tag is different it'll return information about the reparse point
    rather than the target?

    Maybe have two non-overlapping options, follow_symlinks and follow_nonlinks. The latter might be more sensible as a single option for lstat(), since it never follows symlinks. Then if either follow_symlinks or follow_nonlinks is false, stat has to open the reparse point and get the tag to determine whether it should reopen with reparsing enabled.

    The straight-forward way to get the tag is GetFileInformationByHandleEx: FileAttributeTagInfo. This should succeed if the file system supports reparse points. But still check for FILE_ATTRIBUTE_REPARSE_POINT in the attributes before trusting the tag value. If the call fails, assume it's not a reparse point and proceed to GetFileInformationByHandleW.

    Another tag to look for is IO_REPARSE_TAG_AF_UNIX (0x8000_0023). It's relatively new, so I haven't used it yet. I suppose it should be mapped S_IFSOCK in the stat result.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 13, 2019

    I looked into this spawn problem. It's due to Cygwin's spawnve, which calls NtOpenFile to open the file, and then memory-maps it and reads the image header [1].

    Great, that's roughly what I suspected. Unfortunately, I've been told that looking into Cygwin's (GPL'd) code myself is going to cost me two weeks of dev work ("cooling off" period), so someone else will have to help them fix it.

    It could parse out as much as possible and return a struct sequence

    TBH, I feel like that's more work than is worth us doing for something that will be relatively rarely used, will live in the stdlib, and is obviously something that will become outdated as Microsoft adds new reparse points.

    If we return the parsed data as opaque bytes then someone can write a simple PyPI package to extract the contents. (Presumably the reparse tag will have come from stat() already.)

    Maybe have two non-overlapping options, follow_symlinks and follow_nonlinks

    I read this suggestion the first time and I think it would send the message that we are more capable than we really are :)

    In theory, we can't follow any reparse point that isn't documented as being followable and provides the target name is a stable, documented manner. The appexec links don't do this (I just looked at the returned buffer), so we really should just not follow them. They exist solely so that CreateProcess internally returns a unique error code that can be handled without impacting regular process start, which means we *don't* want to follow them.

    Again, if someone writes the PyPI package to parse all known reparse points, they can do that.

    Now, directory junctions are far more interesting. My gut feel is that we should treat them the same as symlinks (with respect to stat vs. lstat) for consistency, but I'll gladly defer to you (Eryk) for the edge cases that may introduce.

    ---

    My plan for PR 15231 is:

    • make os.stat only follow symlinks and junctions
    • make os.readlink only read symlinks and junctions
    • add st_reparse_tag field to os.stat

    @zooba
    Copy link
    Member Author

    zooba commented Aug 13, 2019

    If we support reading junctions, this should be using the substitute name (with \??\ replaced by \\?\) instead of the print name.

    GetFinalPathName() does this conversion for us, any reason not to use that? (GetFullPathName() doesn't seem to recognize the \??\ prefix and will prepend the current drive)

    @zooba
    Copy link
    Member Author

    zooba commented Aug 13, 2019

    Latest PR update uses GetFinalPathName to resolve SubstituteName, returning it unmodified on failure (e.g. symlink where the target file no longer exists).

    Replacing "\??\" with "\\?\" in place is trivial though, as we start with a mutable buffer. I'm just not clear that it's as simple as that, though.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 13, 2019

    I feel like that's more work than is worth us doing for something that
    will be relatively rarely used, will live in the stdlib, and is
    obviously something that will become outdated as Microsoft adds new
    reparse points.

    Junctions (NT 5) and symlinks (NT 6) are stable. So if os.read_reparse_point only returns the unparsed bytes, maybe add os.read_junction as well. I know other projects overload this on POSIX readlink. They're both name-surrogate reparse points, but they have different constraints and behavior.

    The I/O manager tries to make a junction behave something like a hard link to a directory, with the addition of being able to link across local volumes. This is in turn relates to how it evaluates relative symbolic links. For example, if "C:/Junction" and "C:/Symlink" both target r"\\?\C:\Temp1\Temp2", and there's a relative symlink "C:/Temp1/Temp2/foo_link" that targets r"..\foo", then "C:/Junction/foo_link" references "C:/foo" but "C:/Symlink/foo_link" references "C:/Temp1/foo".

    Another difference is with remote filesystems. SMB special cases symlinks to have the server send the reparse request over the wire to be evaluated on the client side. (Refer to [MS-SMB2] 2.2.2.1 Symbolic Link Error Response, and the subsequent section about client-side handling of this error.) So an absolute symlink on the server that targets r"\\?\C:\Windows" actually references the client's "C:/Windows" directory, whereas the same junction target would reference the server's "C:/Windows" directory. The symlink evaluation will succeed only if the client's R2L (remote to local) policy allows it. Symlinks can also target remote devices, depending on the L2R and R2R policy settings. Junctions are restricted to local devices.

    In theory, we can't follow any reparse point that isn't documented as
    being followable and provides the target name is a stable, documented
    manner.

    To follow a reparse point, we're just calling CreateFileW the normal way, without FILE_FLAG_OPEN_REPARSE_POINT. The Windows API also does this (usually via NtOpenFile, but this has a similar FILE_OPEN_REPARSE_POINT option) for tags it doesn't handle. That's why MoveFileExW (os.rename and os.replace) fails on one of these app-exec links. In some cases, it adds a third open attempt if the reparse point isn't handled. This is important for DeleteFileW (os.remove) and RemoveDirectoryW (os.rmdir) because we should be able to delete a bad reparse point.

    The appexec links don't do this (I just looked at the returned
    buffer), so we really should just not follow them. They exist solely
    so that CreateProcess internally returns a unique error code that can
    be handled without impacting regular process start, which means we
    *don't* want to follow them.

    I know, so a regular stat() will fail. I think for an honest result, stat() should fail for a reparse point that can't be handled. Scripts can use stat(path, follow_nonlinks=False) or stat(path, follow_reparse_points=False), or however this eventually gets parameterized to force opening all reparse points.

    Now, directory junctions are far more interesting. My gut feel is that
    we should treat them the same as symlinks (with respect to stat vs.
    lstat) for consistency

    Junctions are their own thing. They're mount points that behave like Unix volume mounts (in Windows, target the root directory of a volume device named by its "Volume{...}" name) or Unix bind mounts (in Windows, target arbitrary directories on any local volume; in Linux it's a mount created with --bind or FUSE bindfs). Bind-like junctions are also similar to DOS subst drives (e.g. "W:" -> "C:/Windows") and UNC shares. These are all mount points of one sort or another.

    OTOH, the base device names such as "//?/C:" and "//?/Volume{...}", without a specified root directory, are aliases (object symlinks) for an NT device such as r"\Device\HarddiskVolume2". These paths open the volume itself, not the mounted filesystem, so they're not like Unix mount points. They're like Unix '/dev/sda1' device paths, except in Unix, devices don't have their own namespaces, so it would be nonsense to open "/dev/sda1/".

    RemoveDirectoryW for a volume mount is special cased to call DeleteVolumeMountPointW, which notifies the mount-point manager. It won't do this for a junction that targets the same volume root directory via the DOS drive-letter name -- or any other device alias for that matter (e.g. Windows 10 creates "\\?\BootPartition" as an alternative named for the system "C:" drive). So bind-like mounts are different from volume mounts, but both are different from symlinks.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 13, 2019

    Replacing "\??\" with "\\?\" in place is trivial though, as we start
    with a mutable buffer. I'm just not clear that it's as simple as that,
    though.

    If the path starts with "\\??\\" we can just change the first question mark to a backslash. For a symlink, fail at this step if Flags includes SYMLINK_FLAG_RELATIVE, which would be inconsistent data.

    "\\??\\" is the NT prefix for the caller's local (logon session) device directory. This implicitly shadows the global device directory, "\\Global??". We don't use NT's real "\\Device" directory in Windows, but it's available as "//?/GlobalRoot/Device" or "//?/Global/GlobalRoot/Device".

    "\\??\\" shouldn't be used directly in Windows programming, since GetFullPathNameW (often implicitly called) doesn't recognize it, but some API functions will pass it along, even though they should really fail the call. It will work until it doesn't, and by then we could have a right mess.

    If a path starts with exactly "\\\\?\\" (backslash only), Windows simply copies it verbatim, except for changing the prefix to "\\??\\". Other Windows device prefixes where the domain is "." or "?" can use any mix of forward slash and backslash because they get normalized first (e.g. "//?\\" -> "\\\\?\\"). Only an initial "\\\\?\\" bypasses the normalization step.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 13, 2019

    I think we'll want bpo-9949 merged as well, so that ntpath.realpath() does its job. Certainly the tests would benefit from it.

    Until then, I think it makes sense for os.readlink() to handle the prefix and _getfinalpathname() call, but leave nt.readlink() as returning the raw value.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 13, 2019

    Until then, I think it makes sense for os.readlink() to handle the
    prefix and _getfinalpathname() call, but leave nt.readlink() as
    returning the raw value.

    os.readlink() shouldn't resolve the final path or realpath(). It should simply return the link substitute path, which starts with "\\\\?\\". I'm wary of trying to return it without the prefix. We would need a function that's shared with the proposed implementation of realpath() to determine whether the given path (not the final path) is safe to return as a normal DOS or UNC path.

    ---

    BTW, I looked into how CreateFileW is supporting "\\??\\". It turns out at some point they changed RtlpDosPathNameToRelativeNtPathName to look for both "\\\\?\\" and "\\??\\" when determining whether to skip normalizing the path. I think that's a bad idea since the Windows API doesn't consistently support the NT prefix.

    ReactOS is supposed to target NT 5.2 (Server 2003), and it doesn't allow the NT prefix here. Refer to its implementation of RtlpDosPathNameToRelativeNtPathName_Ustr [1]. It only looks for RtlpWin32NtRootSlash ("\\\\?\\"), not RtlpDosDevicesPrefix ("\\??\\"). So I suppose Windows changed this some time between Vista and Windows 10.

    [1] https://github.com/reactos/reactos/blob/d93e516747e3220ba182f77824e8b1a8b548edae/sdk/lib/rtl/path.c#L1034

    @zooba
    Copy link
    Member Author

    zooba commented Aug 14, 2019

    I'm wary of trying to return it without the prefix.

    Me too, but suddenly adding "\\?\" to the paths breaks a lot of assumptions.

    We would need a function that's shared with the proposed implementation of realpath() to determine whether the given path (not the final path) is safe to return as a normal DOS or UNC path.

    My idea was to GetFinalPathName(path[4:])[4:] and if that fails, don't strip the prefix. Which is obviously not be perfect, but since we're not going to add a check for the LongPathsEnabled flag (let alone any of the other edge cases), we can't easily figure out whether it's safe to return manually.

    I really want a fix for this in 3.8, or else os.stat(sys.executable) may fail, but I don't think changing the result of readlink() is okay at this stage. Maybe I'll leave that out for now and just take the st_reparse_tag and stat() changes?

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 14, 2019

    I really want a fix for this in 3.8, or else os.stat(sys.executable)
    may fail

    I agree, but Python can support this without handling junctions as symlinks or limiting the reparse points that we can follow. There are reparse points for offline storage and projected file systems (e.g. VFS for Git) that should normally be traversed, and to that I would add junctions. stat() in Unix always opens a mounted directory, not the underlying directory of a mount point. Windows Python should try to be consistent with Unix where doing so is reasonable.

    Given the only option here is follow_symlinks, then the first CreateFileW call in win32_xstat_impl should only open a reparse point if follow_symlinks is false. In this case, if it happens to open a reparse point that's not a symlink, it should try to reopen with reparsing enabled. In either case, if reparsing fails because a reparse point isn't handled (ERROR_CANT_ACCESS_FILE), try to open the reparse point itself. The latter would be the second open attempt if follow_symlinks is true and the third open attempt if follow_symlinks is false.

    If a reparse point isn't handled, then there's nothing in principle that stat() could ever follow. Given that it's impractical to fail in this case, considering how much code would have to be modified, then the above compromise should suffice.

    I checked RtlNtStatusToDosError over the range 0xC000_0000 - 0xC0ED_FFFF. (In ntstatus.h, FACILITY_MAXIMUM_VALUE is 0xED, so there's no point in checking facilities 0x0EF-0xFFF.) Only STATUS_IO_REPARSE_TAG_NOT_HANDLED maps to ERROR_CANT_ACCESS_FILE, so we don't have to worry about unrelated failures using this error code. This is separate from an invalid reparse point (ERROR_INVALID_REPARSE_DATA) or a reparse point that can't be resolved (ERROR_CANT_RESOLVE_FILENAME), which should still be errors that fail the call.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 14, 2019

    but suddenly adding "\\?\" to the paths breaks a lot of assumptions.

    The unwritten assumption has been that readlink() is reading symlinks that get created by CreateSymbolicLinkW, which sets the print name as the DOS path that's passed to the call. In this case, readlink() can rely on the print name being the intended DOS path. I raised a concern in the case of reading junctions. There's no high-level API to create junctions, so we can't assume the print name is reliable. PowerShell's new-item doesn't even set a print name for junctions. That symlinks also are valid without a print name (in principle; I haven't come across it practice) lends additional weight to always using the substitute name.

    Even if we have the DOS path, resolving paths manually is still complicated if it's a relative symlink with a reserved name (DOS device; trailing dots or spaces) as the final component or if it's a long path. Reparsing a relative symlink in the kernel doesn't reserve such names and there's no MAX_PATH limit in the kernel. So using readlink() is tricky. Fortunately realpath() in Windows doesn't require a resolve loop based on readlink(). The kernel almost always knows the final path of an opened file, and we can walk the components from the end until we find one that exists.

    My idea was to GetFinalPathName(path[4:])[4:] and if that fails

    An existing file named "spam" would be a false positive for a link that targets "spam.". The internal CreateFileW call would open "spam". Also, symlinks allow remote paths, and this doesn't handle "\\\\?\\UNC" paths. More generally, a link target doesn't have to exist, so being able to access it shouldn't be a factor. I see it's also returning the result from _getfinalpathname. readlink() doesn't resolve a final, solid path. It just returns the contents of a link, which can be a relative or absolute path.

    In the proposed implementation of realpath() that I helped on for bpo-14094 (I wasn't aware of the previous work in bpo-9949), there's an _extended_to_normal function that tries to return a normal path if possible. The length of the normal path has to be less than MAX_PATH, and _getfullpathname should return the path unchanged. GetFullPathNameW is just rule-based processing in user mode; it doesn't touch the file system.

    I wish we could remove the MAX_PATH limit in this case. I think at startup we should try to call RtlAreLongPathsEnabled, even though it's not documented, and set a sys flag to indicate whether we can use long paths. Also, support a -X option and an environment variable to override automatic detection.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 14, 2019

    Given the only option here is follow_symlinks, then the first CreateFileW call in win32_xstat_impl should only open a reparse point if follow_symlinks is false. In this case, if it happens to open a reparse point that's not a symlink, it should try to reopen with reparsing enabled. In either case, if reparsing fails because a reparse point isn't handled (ERROR_CANT_ACCESS_FILE), try to open the reparse point itself. The latter would be the second open attempt if follow_symlinks is true and the third open attempt if follow_symlinks is false.

    If a reparse point isn't handled, then there's nothing in principle that stat() could ever follow. Given that it's impractical to fail in this case, considering how much code would have to be modified, then the above compromise should suffice.

    This sounds like reasonable logic. I'll take a look at updating my PR.

    In the proposed implementation of realpath() that I helped on for bpo-14094

    I totally forgot about this issue and the PR (but it looks like the contributor did too). I just posted PR 15287 today with the tests from the patch on bpo-9949 and it's looking okay - certainly an improvement over the current behaviour. But it doesn't have the change to readlink() that would add the \\?\ prefix, which means it doesn't answer that question.

    I wish we could remove the MAX_PATH limit in this case.

    The problem is that we have to remove the limit in any case where the resulting path might be used, which is what we're already trying to encourage by supporting long paths.

    Perhaps the best we can do is an additional test where we GetFinalPathName, strip the prefix, reopen the file, GetFinalPathName again and if they match then return it without the prefix. That should handle the both long path settings as transparently as we can.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 14, 2019

    Perhaps the best we can do is an additional test where we GetFinalPathName, strip the prefix, reopen the file, GetFinalPathName again and if they match then return it without the prefix. That should handle the both long path settings as transparently as we can.

    I tried this and the downside (other than hitting the filesystem a few more times) is that any unresolvable path is going to come back with the prefix - e.g. symlink cycles and dangling links.

    Maybe that's okay? The paths are going to be as valid as they can get (that is, unusable :) ) and it at least means that long paths and deliberately unnormalized paths.

    Posted an update to PR 15287 with that change.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 14, 2019

    And I just posted an update to PR 15231 with essentially a rewrite of stat() on Windows. Should be better than it was :)

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 15, 2019

    I wish we could remove the MAX_PATH limit in this case.

    The problem is that we have to remove the limit in any case where the
    resulting path might be used, which is what we're already trying to
    encourage by supporting long paths.

    Maybe it's better to ignore the MAX_PATH limit and let processes fail hard if they lack long-path support. A known and expected exception is better than unpredictable behavior (see the next paragraph for an example). That leaves the problem of a final component that's a reserved name, i.e. a DOS device name or a name with trailing dots or spaces. We have no choice but to return this case as an extended path.

    The intersection of this problem with SetCurrentDirectoryW (os.chdir) troubles me. Without long-path support, the current-directory buffer in the process parameters is hard limited to MAX_PATH, and passing SetCurrentDirectoryW an extended path can't work around this. Fair enough. But it still accepts a device path as the current directory, even though the docs do not explicitly allow it, and the implementation assumes it's disallowed. The combination is an ugly bug:

        >>> os.chdir('//./C:/Temp')
        >>> os.getcwd()
        '\\\\.\\C:\\Temp'
        >>> os.path._getfullpathname('/spam/eggs')
        '\\\\spam\\eggs'
    
        >>> os.chdir('//?/C:/Temp')
        >>> os.getcwd()
        '\\\\?\\C:\\Temp'
        >>> os.path._getfullpathname('/spam/eggs')
        '\\\\spam\\eggs'

    In order to resolve a rooted path such as "/spam/eggs", the runtime library needs to be able to figure out the current drive from the current directory. It checks for a UNC path and otherwise assumes it's a DOS drive, since it's assuming device paths aren't allowed. It ends up assuming the current directory is a DOS drive and grabs the first two characters as the drive name, which is "\\\\". Then when joining the rooted path to this 'drive', the initial slash or backslash of the rooted path gets collapsed into the preceding backslash. The result is at best a broken path, and at worst an unrelated UNC path that exists.

    I think os.chdir should raise an exception when passed a device path. In explanation, we can point to the documentation of SetCurrentDirectoryW, which explicitly states the following:

    Each process has a single current directory made up of two parts:
    
        * A disk designator that is either a drive letter followed by 
          a colon, or a server name and share name 
          (\\servername\sharename)
        * A directory on the disk designator
    

    Perhaps the best we can do is an additional test where we
    GetFinalPathName, strip the prefix, reopen the file,
    GetFinalPathName again and if they match then return it
    without the prefix. That should handle the both long path
    settings as transparently as we can.

    I assume you're talking about realpath() here, toward the end where we're working with a solid path, or rather where we have at least the beginning part of the path as a solid path, up to the first component that's inaccessible.

    For the problem of reserved names, GetFullPathNameW is all we need. This doesn't address the MAX_PATH issue. But that either works or not. It's a user-mode issue. There's nothing to resolve in the kernel. If the path is too long, then CreateFileW will fail at RtlDosPathNameToRelativeNtPathName_U_WithStatus with STATUS_NAME_TOO_LONG, before making a single system call.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 15, 2019

    I assume you're talking about realpath() here ...

    Yes, and so are you :) Let's move that discussion to bpo-9949 and/or PR 15287.

    I think os.chdir should raise an exception when passed a device path.

    When the OS starts returning an error code for this case, we can start raising an exception. It might be worth reporting these cases though, as you're right that they don't seem to be handled correctly.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 15, 2019

    [Quoting from the PR comments]

    traverse is from follow_symlinks and only applies to symlinks. It does not apply to other types of reparse points.

    I get your argument that junctions are not symlinks, but I disagree that we should try this hard to emulate POSIX semantics rather than letting the OS do its thing. We aren't reparsing anything ourselves (in the PR) - if the OS is configured to do something different because of a reparse point, we're simply going to respect that instead of trying to work around it.

    A user who has created a directory junction likely wants it to behave as if the directory is actually in that location. Similarly, a user who has created a directory symlink likely wants it to behave as if it were in that location. Powershell treats both the same for high-level operations - the LinkType attribute is the only way to tell them apart (mirrored in the st_reparse_tag field).

    If we've opened the reparse point to test for a symlink, we must reopen for all other types

    The premise here is not true - we've opened the reparse point to get the file attributes. The only reason we look at the reparse tag at all is to raise an error if the user requested traversal and despite that, we ended up at a link, and I'm becoming less convinced that should be an error anyway (this is different from nt.readlink() and ntpath.realpath(), of course, where we want to read the link and return where it points).

    nt.stat() is trying to read the file attributes, and if they are not accessible then raising is the correct behaviour, so I don't see why we should try any harder than the OS here.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 15, 2019

    Unless your point is that we should _always_ traverse junctions? In which case we have a traverse 'upgrade' scenario (calls to lstat() become calls to stat() when we find out it's a junction).

    Again, not sure why we'd want to hide the ability to manipulate the junction itself from Python users, except to emulate POSIX. And I'd imagine anyone using lstat() is doing it deliberately to manipulate the link and would prefer we didn't force them to add Windows-specific code that's even more complex.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 15, 2019

    Unless your point is that we should _always_ traverse junctions? In
    which case we have a traverse 'upgrade' scenario (calls to lstat()
    become calls to stat() when we find out it's a junction).

    If we've opened the reparse point to test IO_REPARSE_TAG_SYMLINK, and that's not the case, then we need to reopen with reparsing enabled. This is exactly what Windows API functions do in order to implement particular behavior for just symlinks or just mountpoints.

    For example, if we've opened an HSM reparse point, we must reopen to let the file-system filter driver implement its semantics to replace the reparse point with the real file from auxiliary storage and complete the request. That is the stat() result I want when I say stat(filename, follow_symlinks=False) or lstat(filename), because this file is not a symlink. It's implicitly just the file to end users -- despite whatever backend tricks are being played in the kernel to implement other behavior such as HSM. Conflating this with a symlink is not right. Lies catch up with us. We can't copy it as link via os.symlink and os.readlink, and it doesn't get treated like a symlink in API functions.

    If you want to add an "open reparse point" parameter, that would make sense. It's of some use to get the tag and implement particular behavior for types of reparse points, and particularly for name surrogates, which includes mount points (junctions).

    As to mount points, yes, I do think we should always traverse them. Please see my extended comment and the follow-up example on GitHub.

    Again, not sure why we'd want to hide the ability to manipulate the
    junction itself from Python users, except to emulate POSIX. And I'd
    imagine anyone using lstat() is doing it deliberately to manipulate
    the link and would prefer we didn't force them to add Windows-
    specific code that's even more complex.

    A mount point is not a link. ismount() and islink() can never both be true. Also, a POSIX symlink can never be a directory, which is why we make stat() pretend directory symlinks aren't directories. If the user wants a link, they can use a symlink that's created by os.symlink, mklink, new-item -type SymbolicLink, etc.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 15, 2019

    For example, if we've opened an HSM reparse point, we must reopen to let
    the file-system filter driver implement its semantics to replace the
    reparse point with the real file from auxiliary storage and complete the
    request. That is the stat() result I want when I say stat(filename,
    follow_symlinks=False) or lstat(filename), because this file is not a
    symlink. It's implicitly just the file to end users -- despite whatever
    backend tricks are being played in the kernel to implement other
    behavior such as HSM. Conflating this with a symlink is not right. Lies
    catch up with us. We can't copy it as link via os.symlink and
    os.readlink, and it doesn't get treated like a symlink in API functions.

    Okay, I get it now. So we _do_ want to "upgrade" lstat() to stat() when it's not a symlink.

    If you want to add an "open reparse point" parameter ...

    I don't want to add any parameters - I want to have predictable and reasonable default behaviour. os.readlink() already exists for "open reparse point" behaviour.

    The discussion is only about what os.lstat() returns when you pass in a path to a junction.

    As to mount points, yes, I do think we should always traverse them.
    Please see my extended comment and the follow-up example on GitHub.

    I'm still not convinced that this is what we want to do. I don't have a true Linux machine handy to try it out (Python 3.6 and 3.7 on WSL behave exactly like the semantics I'm proposing, but that may just be because it's the Windows kernel below it).

    A mount point is not a link. ismount() and islink() can never both be
    true. Also, a POSIX symlink can never be a directory, which is why we
    make stat() pretend directory symlinks aren't directories. If the user
    wants a link, they can use a symlink that's created by os.symlink,
    mklink, new-item -type SymbolicLink, etc.

    ismount() is currently not true for junctions. And I can't find any reference that says that POSIX symlinks can't point to directories, nor any evidence that we suppress symlink-to-directory creation or resolution in Python (also tested on WSL)..

    @zooba
    Copy link
    Member Author

    zooba commented Aug 15, 2019

    So we _do_ want to "upgrade" lstat() to stat() when it's not a symlink.

    Except this bug came about because we want to _downgrade_ stat() to lstat() when it's an appexeclink, because the whole point of those is to use them without following them (and yeah, most operations are going to fail, but they'd fail against the target file too).

    So we have this logic:

    def xstat(path, traverse):
        f = open(path, flags | (0 if traverse else OPEN_REPARSE_POINT))
        if !f:
            # Special case for appexeclink
            if traverse and ERROR_CANT_OPEN_FILE:
                st = xstat(path, !traverse)
                if st.reparse_tag == APPEXECLINC:
                    return st
                raise ERROR_CANT_OPEN_FILE
            # Handle "likely" errors
            if ERROR_ACCESS_DENIED or SHARING_VIOLATION:
                st = read_from_dir(os.path.split(path))
        else:
            st = read_from_file(f)
    
        # Always make the OS resolve "unknown" reparse points
        ALLOWED_TO_TRAVERSE = {SYMLINK, MOUNT_POINT}
        if !traverse and st.reparse_tag not in ALLOWED_TO_TRAVERSE:
            return xstat(path, !traverse)
    
        return st

    And the open question is just whether MOUNT_POINT should be in that set near the end. I believe it should, since the alternative is to force all Python developers to write special Windows-only code to handle directory junctions.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 16, 2019

    > I don't want to add any parameters - I want to have predictable and
    > reasonable default behaviour. os.readlink() already exists for
    > "open reparse point" behaviour.

    I'd appreciate a parameter to always open reparse points, even if a
    filter-driver or the I/O manager handles them.

    But that's only required if lstat() traverses junctions, which I'm not proposing to do.

    I'm no longer a big fan of mapping "follow_symlinks" to name surrogates
    ... But it's not up to me. If follow_symlinks means name surrogates, at
    least then lstat can open any reparse point that claims to link to
    another path and thus *should* have link-like behavior (hard link or
    soft link).

    Yeah, ultimately it looks like it'll be up to me, which is why I want as much of your input as I can get before having to make a call :)

    I'm leaning towards "if the OS follows it by default, stat() will follow it", and (given some of your later comments), "when ERROR_CANT_ACCESS_FILE occurs on a reparse point (is that the only scenario?) behave like lstat()".

    (And I carefully phrased that to not imply that Python is going to partially follow a link chain for you if the eventual end result is not valid. e.g. stat() on a symlink to an appexeclink will return the symlink, because ERROR_CANT_ACCESS_FILE was the result. Then the suggestion will be "if stat() returns a link, use os.path.realpath() if you want to resolve it as far as possible".)

    The questions for me are whether os.readlink() should also read
    junctions and exactly what follow_symlinks means in Windows. We have a
    complicated story to tell if follow_symlinks=False (lstat) opens any
    reparse point or opens just name-surrogate reparse points, and islink()
    is made consistent with this, but then readlink() doesn't work.

    I don't think it's that complicated:

    • os.lstat() will not traverse any links or reparse points
    • os.stat() will traverse any links and reparse points supported by the OS, or else return the same as os.lstat()
    • os.readlink() can read the target of a symlink or (on Windows) junction point

    If junctions are handled as symlinks, then islink(), readlink(),
    symlink() would be used to copy a junction 'link' while copying a tree
    (e.g. shutil.copytree with symlinks=True). This would transform
    junctions into directory symlinks. In this case, we potentially have a
    problem that relative symlinks in the tree no longer target the same
    files when accessed via a directory symlink instead of a junction. No
    one thinks about this problem on the POSIX side because it would be
    weird to copy a mountpoint as a symlink. In POSIX, a mountpoint is
    always seen as just a directory and always traversed.

    This isn't difficult to handle specifically if we want to, though, since the stat result now includes the reparse tag. And you're right, we probably have to handle it.

    > I'm still not convinced that this is what we want to do. I don't
    > have a true Linux machine handy to try it out (Python 3.6 and 3.7 on
    > WSL behave exactly like the semantics I'm proposing, but that may
    > just be because it's the Windows kernel below it).

    ... This is a decision in WSL's drvfs file-system driver, and I have to
    assume it's intentional.

    I assumed it was intentional as well, though it'll almost certainly be based on pragmatically getting things to work (e.g. the '/mnt/c/Document and Settings' junction... though now I try it that those don't actually work...)

    > ismount() is currently not true for junctions. And I can't find any
    > reference that says that POSIX symlinks can't point to directories,

    Our current implementation for junctions is based on GetVolumePathNameW,
    which will be true for junctions that use the "Volume{...}" name to
    mount the file-system root directory. That's a volume mount point.

    I don't know why someone decided that's the sum total of "mount point"
    in Windows. DOS drives and UNC drives can refer to arbitrary file system
    directories. They don't have to refer to file-system root directory. We
    can have "W:" -> "\\??\\C:\\Windows", etc.

    Per the docs, a mount point for ismount() is a "point in a file system
    where a different file system has been mounted". The mounted directory
    doesn't have to be the root directory of the file system. I'd relax this
    definition to include all "hard" name grafting links to other
    directories, even within the same file system. What matter to me is the
    semantics of how this differs from the soft name grafting of a symlink.

    I think the intent is that it's mounted the root of another file system. There was discussion of just using the reparse tag in bpo-9035, but it looks like from msg138197 that returning True for regular directory links was not the intent (despite the tests in that message being insufficient).

    If follow_symlinks=False applies to name surrogates, then a junction
    would be detectable via os.lstat(filename).st_reparse_tag, which is not
    only much cheaper than GetVolumePathNameW, but also more generally
    correct and consistent with DOS and UNC drive mount points.

    Ah, the future possibilities of getting that one field added in this PR :) Let's save other enhancements for another PR - especially when they won't be impacted by this one.

    S_IFDIR is suppressed for directory symlinks in the stat result. But
    os.path.isdir() is supposed to be based on os.stat, and thus follows
    symlinks. To that end, our nt._isdir is broken because it assumes
    GetFileAttributesW is sufficient. Since we're supposed to follow links,
    it's not working right for link targets that don't exist. It should
    return False in that case.

    That's easily fixed.

    I assumed stat() would return the reparse point for all tags that fail
    to reparse with ERROR_CANT_ACCESS_FILE since it's not an invalid reparse
    point, just an unhandled one that has no other significance at the file
    system level. To stat(), it can never be anything other than a reparse
    point. Whatever relevance it has depends on some other context (e.g. a
    CreateProcessW call).

    This is easy - just remove the "call again with traverse=TRUE" - and provided we don't set S_IFLNK in this case then we are being "correct", right? (Nobody has yet defined whether S_IFLNK is required for st_reparse_tag to be set.)

    That doesn't mean, however, that I wouldn't like the ability to detect
    "name surrogate" reparse points in general to implement safer behavior
    for shutil.rmtree and anything else that walks a directory tree.

    nt.lstat().st_reparse_tag and the constants in the stat module are sufficient for this, right?

    If islink() is true, then st_mode has S_IFLNK and not S_IFDIR. So we
    have a mount point that's a symlink, which is not possible in POSIX, and
    it's not a directory, which is unusual in POSIX to say the least.

    This entire discussion is because POSIX can't fully describe Windows filesystems :) The two goals need to be making the "most consistent" behaviour the default, while enabling special cases to be specialised.

    So I care less about the correct information being reported than it being acted upon in the correct way - if that means lying about junctions and mount points being links and not directories (since POSIX apparently doesn't support setting both flags?), then let's lie about it. lstat().st_reparse_tag will reveal whether it's a junction, and ismount() whether it's linking to a different device.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 16, 2019

    That was a long set of replies, so here's my summary proposed behaviour:

    (Where "links" are the generic term for the set that includes "reparse point", "symlink", "mount point", "junction", etc.)

    os.lstat(path): returns attributes for path without traversing any kind of links

    os.stat(path): traverses any links supported by the OS and returns attributes for the final file

    • on Windows, if the OS reports that a link is not followable (e.g. appexeclink), then the original link is reported. So if S_IFLNK is set in the result, the caller should, use os.path.realpath() to traverse as many links as possible to reach the unfollowable link

    os.exists(path): now returns True for links that exist but the OS reports are not followable - previously returned False (links that are followable but the target does not exist continue to return False)

    os.path.is_dir(path): now returns False for directory links where the target does not exist

    os.unlink(path): unchanged (still removes the junction, not the contents)

    os.scandir(path): DirEntry.is_symlink() and DirEntry.is_dir() will now both be True for junctions (this was always the case for symlinks to directories). DirEntry.stat(follow_symlinks=False).st_reparse_tag == stat.IO_REPARSE_TAG_MOUNT_POINT is how to specifically detect a junction.

    shutil.copytree(path): Unchanged. (requires a minor fix to continue to recursively copy through junctions (using above test), but not symlinks.)

    shutil.rmtree(path): Will now remove a junction rather than recursively deleting its contents (net improvement, IMHO)

    And as I said at the end of the long post, my main goals are:

    • the result of _using_ the returned information should be consistent across OS
    • there are ways of getting more specific information when needed

    The most impactful change would seem to be the one to DirEntry, in that users may now treat junction points more like symlinks than real directories depending on how they've set up their checks. But for the other benefits - particularly the more reliable exists() checks - I think this is worth it overall.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 16, 2019

    Where "links" are the generic term for the set that includes
    "reparse point", "symlink", "mount point", "junction", etc.)

    Why group all reparse points under the banner of 'link'? If we have a typical HSM reparse point (the vast majority of allocated tags), then all operations, such as delete and rename, act on the file itself, not simply the reparse point. We should be able to delete or rename a link without affecting the target.

    In this case, there's also no chance that the reparse point is a surrogate for another path on the system, so code that walks paths doesn't have to worry about loops with regard to these reparse points. The only practical use case I can think of for detecting/opening this type of reparse point is backup software that should avoid triggering an HSM recall. For example:

    https://www.ibm.com/support/knowledgecenter/en/SSEQVQ_8.1.0/client/r_opt_hsmreparsetag.html

    As I've previously suggested (and this is the last time because I'm becoming a broken record), lstat() should at least be restricted to opening only name-surrogate reparse points that are supposed to be like links in that they target another path in the system. Plus it also has to open unhandled reparse points.

    Personally, I'm only comfortable with opening it up to name surrogates if islink() and readlink() are still limited to just Unix-like symlinks that we can create via symlink(). Nothing changes there. It's just a restriction of how lstat() currently works. The addition of the reparse tag in the stat result enables special handling of non-symlink surrogates.

    shutil.copytree(path): Unchanged. (requires a minor fix to
    continue to recursively copy through junctions (using above test),
    but not symlinks.)

    Everyone else who relies on islink(), readlink(), and symlink() to copy symlinks isn't special casing their code to look for junctions or anything else we lump under the banner of islink(). They could code defensively if readlink() fails for a 'link' that we can't read. But that leaves the problem of readlink() succeeding for a junction. That can causes problems if the target is passed to os.symlink(), which changes the link from a hard name grafting to a soft name grafting.

    Why would we need to read the target of a junction? It's not needed for realpath() in Windows. We should only have to resolve symlinks. For example:

    C:/Mount/junction/spam/eggs
    
             junction -> Z:/bar/baz
    

    We don't have to resolve this as "Z:/bar/baz/spam/eggs", and doing so may even be wrong for someone using it to manually resolve a relative symlink. "C:/Mount/junction/spam/eggs" is a solid path. In Unix it would not be resolved by realpath(). A solid path is needed to figure out how to create a relative symlink, or how to manually resolve one for a given path.

    For example, if "foo_link" in "C:/Mount/junction/spam/eggs" targets "../../../foo", this refers to "C:/Mount/foo". On the other hand, if the junction mount point were replaced by a soft symlink, then "C:/Mount/symlink/spam/eggs" is not a solid path. "foo_link" is instead evaluated over the target path: "Z:/bar/baz/spam/eggs/foo_link", so the link resolves to "Z:/bar/foo".

    IMO, S_IFLNK need not be set for anything other than Unix-like symbolic links. We would just need to document that on Windows, lstat opens any link-like reparse point that indicates it targets another path on the system, plus any reparse point that's not handled, but that islink() is only true for actual Unix symlinks that can be created via os.symlink() and read via os.readlink().

    This preserves how islink() and readlink() currently work, while still leaving the door open to fix misbehavior in particular cases. Code, including our own code, that needs to look for the broader Windows category of "name surrogate" can examine the reparse tag. For convenience we can provide issurrogate() that checks lstat(filename).st_reparse_tag & 0x2000_0000. This can be true for directories. Also, a surrogate doesn't have to behave like a Unix "soft" symlink, i.e. it applies to "hard" mount points. In Unix, issurrogate() could just be an alias for islink() since Unix provides only one type of name surrogate.

    Currently the name surrogate category includes the following tags:

    Microsoft name surrogate (bits 31 and 29)
    
    IO_REPARSE_TAG_MOUNT_POINT                  0xA0000003
    IO_REPARSE_TAG_SYMLINK                      0xA000000C
    IO_REPARSE_TAG_IIS_CACHE                    0xA0000010
    IO_REPARSE_TAG_GLOBAL_REPARSE               0xA0000019
    IO_REPARSE_TAG_LX_SYMLINK                   0xA000001D
    IO_REPARSE_TAG_WCI_TOMBSTONE                0xA000001F
    IO_REPARSE_TAG_PROJFS_TOMBSTONE             0xA0000022
    
    Non-Microsoft name surrogate (bit 29)
    
    IO_REPARSE_TAG_SOLUTIONSOFT                 0x2000000D
    IO_REPARSE_TAG_OSR_SAMPLE                   0x20000017
    IO_REPARSE_TAG_QI_TECH_HSM                  0x2000002F
    IO_REPARSE_TAG_MAXISCALE_HSM                0x20000035
    IO_REPARSE_TAG_ALERTBOOT                    0x2000004C
    IO_REPARSE_TAG_NVIDIA_UNIONFS               0x20000054
    

    IO_REPARSE_TAG_OSR_SAMPLE is used by OSR sample code in their Windows driver curriculum, so that one is unlikely to be seen in practice. I don't know anything about the other non-Microsoft tags. NVidia's UnionFS looks interesting. Using reparse points to merge file systems is probably not the most efficient way to handle that problem, but I'm sure the devil is in the details there.

    os.unlink(path): unchanged (still removes the junction, not the
    contents)

    Whatever we're calling a link should be capable of being deleted via os.unlink. If we apply S_IFLNK, then it won't have S_IFDIR (at least how POSIX code expects it), and unlink should work on it. The current state of affairs in which unlink/remove works on a junction, which is reported by stat() as a directory, is inconsistent. It's not specified to remove directories, so nothing that it can remove should be a directory.

    shutil.rmtree(path): Will now remove a junction rather than
    recursively deleting its contents (net improvement, IMHO)

    I'd like for it to remove all name-surrogate directories like CMD's rmdir /s does. In contrast, Unix shutil.rmtree traverses into a mount point, deletes everything, and then fails because the directory is mounted and can't be removed. That's hideous, IMO.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 16, 2019

    the '/mnt/c/Document and Settings' junction... though now I try
    it that those don't actually work...)

    The security on compatibility junctions denies everyone read-data (list) access, but in Windows they can still be traversed (e.g. "C:/Documents and Settings/Public") because execute (traverse) access isn't denied, and even if it were denied, standard Windows users have SeChangeNotifyPrivilege to bypass traverse checking.

    I created a test junction named "eggs" that targets a directory named "spam" that has "spam/foo" subdirectory. I set the junction's security to match that of "Documents and Settings". In WSL, trying to traverse it to stat the "foo" subdirectory failed with EACCES, just as with "Documents and Settings/Public". After removing the entry that denies read-data access, it worked fine. There's no problem traversing "spam" directly if I set the same security on it that denies everyone read-data access; it only prevents listing the directory.

    It seems that in order to evaluate an NT junction, drvfs tries to open it with read-data access. I don't see why it would have to do that.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 16, 2019

    Why group all reparse points under the banner of 'link'?

    Because for the purposes of the list of changes beneath it, there wasn't any difference (e.g. "traverses any links supported by the OS" is more meaningful to most people, even though both of us would understand it to mean "traverses any traversable reparse points supported by the OS"). I'm not redefining them all to be the same thing, just establishing the terminology for what immediately followed.

    As I've previously suggested (and this is the last time because I'm
    becoming a broken record), lstat() should at least be restricted to
    opening only name-surrogate reparse points that are supposed to be like
    links in that they target another path in the system. Plus it also has
    to open unhandled reparse points.

    Apologies for causing the repetition. Let me summarise what I believe you're suggesting as an alternate flow (bearing in mind that only os.stat() and os.lstat() are affected here):

    os.lstat:

    • open without following reparse points
    • check the reparse tag
    • if it's a genuine symlink, return attributes of the link itself and marked ST_IFLNK
    • if it's a directory junction, call os.stat instead and return that (???)
    • if it's any name-surrogate reparse point, return attributes of the link itself but not marked ST_IFLNK
    • if it's any other reparse point, call os.stat instead and return that
    • otherwise regular handling (using hFile or FindFirstFile, etc.)

    os.stat:

    • open following reparse points
    • if the open fails with ERROR_CANT_ACCESS_FILE, try opening without following reparse points:
    • if it's a genuine symlink, ???
    • if it's a directory junction, ???
    • if it's any name-surrogate reparse point, ???
    • if it's any other reparse point, return attributes of the link itself
    • otherwise regular handling

    If you can fill in the gaps, that will help me understand exactly what you're proposing.

    > shutil.copytree(path): Unchanged. (requires a minor fix to
    > continue to recursively copy through junctions (using above test),
    > but not symlinks.)

    Everyone else who relies on islink(), readlink(), and symlink() to copy
    symlinks isn't special casing their code to look for junctions or
    anything else we lump under the banner of islink(). They could code
    defensively if readlink() fails for a 'link' that we can't read. But
    that leaves the problem of readlink() succeeding for a junction. That
    can causes problems if the target is passed to os.symlink(), which
    changes the link from a hard name grafting to a soft name grafting.

    Right, but is that because they deliberately want the junction to be treated like a file? Or because they want it to be treated like the directory is really right there?

    os.rmdir() already does special things to behave like a junction rather than the real directory, and the islink/readlink/symlink process is going to be problematic on Windows since most users can't create symlinks. That code simply isn't going to be portable. But code that is using stat() and expecting to get the real directory ought to work, just as code using lstat() and expecting to get the link if it's been linked somehow ought to work.

    Why would we need to read the target of a junction? It's not needed for
    realpath() in Windows. We should only have to resolve symlinks. For
    example:

    ...

    IMO, S_IFLNK need not be set for anything other than Unix-like symbolic
    links. We would just need to document that on Windows, lstat opens any
    link-like reparse point that indicates it targets another path on the
    system, plus any reparse point that's not handled, but that islink() is
    only true for actual Unix symlinks that can be created via os.symlink()
    and read via os.readlink().

    I think I understand your reasoning here now, sorry for it taking so long.

    This preserves how islink() and readlink() currently work, while still
    leaving the door open to fix misbehavior in particular cases. Code,
    including our own code, that needs to look for the broader Windows
    category of "name surrogate" can examine the reparse tag. For
    convenience we can provide issurrogate() that checks
    lstat(filename).st_reparse_tag & 0x2000_0000.

    I'm not adding new API, even for internal use. This is edge case enough that os.lstat() is fine for it.

    > os.unlink(path): unchanged (still removes the junction, not the
    > contents)

    Whatever we're calling a link should be capable of being deleted via
    os.unlink. If we apply S_IFLNK, then it won't have S_IFDIR (at least how
    POSIX code expects it), and unlink should work on it. The current state
    of affairs in which unlink/remove works on a junction, which is reported
    by stat() as a directory, is inconsistent. It's not specified to remove
    directories, so nothing that it can remove should be a directory.

    I'm proposing to fix the inconsistency by fixing the flags. Your proposal is to fix the inconsistency by generating a new error in unlink()? (Just clarifying.)

    shutil.rmtree(path): Will now remove a junction rather than
    recursively deleting its contents (net improvement, IMHO)

    I'd like for it to remove all name-surrogate directories like CMD's
    rmdir /s does. In contrast, Unix shutil.rmtree traverses into a mount
    point, deletes everything, and then fails because the directory is
    mounted and can't be removed. That's hideous, IMO.

    Currently Windows shutil.rmtree traverses into junctions and deletes everything, though it then succeeds to delete the junction. With my change, rmtree() directly on a junction now raises (could be fixed?) but rmtree on a directory containing a junction will remove the junction without touching the target directory. So I think we're both happy about this one.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 19, 2019

    Here's the requested overview for the case where name-surrogate reparse points are handled as winlinks (Windows links), but S_IFLNK is reserved for IO_REPARSE_TAG_SYMLINK. I took the time this afternoon to write it up in C, which hopefully is clearer than my prose. It handles all CreateFileW failures inline, but uses a recursive call to traverse a non-link. No reparse tag values are hard coded in win32_xstat_impl.

    I extended it to support devices that aren't file systems, such as "con", disk volumes, and raw disks. This enhancement was requested on another issue, but it may as well get updated in this issue if win32_xstat_impl is getting overhauled. Some of these devices are already supported by fstat(). The latter can be extended similarly to support volume devices and raw disk devices.

    I opted to fail the call in the unhandled-tag case if it opens a link that should be traversed. Either it's an unhandled link, which is an unacceptable condition (i.e. it should be a sink, not a link), or it's a link or sequence of links to an unhandled reparse point. Returning the link reparse-point data is not what the caller wants from a successful stat().

    ---

    static int
    win32_xstat_impl(const wchar_t *path, struct _Py_stat_struct *result,
                     BOOL traverse)
    {
        HANDLE hFile;
        BY_HANDLE_FILE_INFORMATION fileInfo;
        FILE_ATTRIBUTE_TAG_INFO tagInfo = { 0 };
        DWORD fileType, error;
        BOOL isUnhandledTag = FALSE;
        int retval = 0;
    
        DWORD access = FILE_READ_ATTRIBUTES;
        DWORD flags = FILE_FLAG_BACKUP_SEMANTICS; /* Allow opening directories. */
        if (!traverse) {
            flags |= FILE_FLAG_OPEN_REPARSE_POINT;
        }
    
        hFile = CreateFileW(path, access, 0, NULL, OPEN_EXISTING, flags, NULL);
        if (hFile == INVALID_HANDLE_VALUE) {
            /* Either the path doesn't exist, or the caller lacks access. */
            error = GetLastError();
            switch (error) {
            case ERROR_ACCESS_DENIED:     /* Cannot sync or read attributes. */
            case ERROR_SHARING_VIOLATION: /* It's a paging file. */
                /* Try reading the parent directory. */
                if (!attributes_from_dir(path, &fileInfo, &tagInfo.ReparseTag)) {
                    /* Cannot read the parent directory. */
                    SetLastError(error);
                    return -1;
                }
                if (fileInfo.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) {
                    if (traverse ||
                        !IsReparseTagNameSurrogate(tagInfo.ReparseTag)) {
                        /* The stat call has to traverse but cannot, so fail. */
                        SetLastError(error);
                        return -1;
                    }
                }
                break;
    
            case ERROR_INVALID_PARAMETER:
                /* \\.\con requires read or write access. */
                hFile = CreateFileW(path, access | GENERIC_READ,
                            FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                            OPEN_EXISTING, flags, NULL);
                if (hFile == INVALID_HANDLE_VALUE) {
                    SetLastError(error);
                    return -1;
                }
                break;
    
            case ERROR_CANT_ACCESS_FILE:
                /* bpo37834: open unhandled reparse points if traverse fails. */
                if (traverse) {
                    traverse = FALSE;
                    isUnhandledTag = TRUE;
                    hFile = CreateFileW(path, access, 0, NULL, OPEN_EXISTING,
                                flags | FILE_FLAG_OPEN_REPARSE_POINT, NULL);
                }
                if (hFile == INVALID_HANDLE_VALUE) {
                    SetLastError(error);
                    return -1;
                }
                break;
    
            default:
                return -1;
            }
        }
    
        if (hFile != INVALID_HANDLE_VALUE) {
    
            /* Query the reparse tag, and traverse a non-link. */
            if (!traverse) {
                if (!GetFileInformationByHandleEx(hFile, FileAttributeTagInfo,
                        &tagInfo, sizeof(tagInfo))) {
                    /* Allow devices that do no support FileAttributeTagInfo. */
                    error = GetLastError() ;
                    if (error == ERROR_INVALID_PARAMETER ||
                        error == ERROR_INVALID_FUNCTION ||
                        error == ERROR_NOT_SUPPORTED) {
                        tagInfo.FileAttributes = FILE_ATTRIBUTE_NORMAL;
                        tagInfo.ReparseTag = 0;
                    } else {
                        retval = -1;
                        goto cleanup;
                    }
                } else if (tagInfo.FileAttributes &
                             FILE_ATTRIBUTE_REPARSE_POINT) {
                    if (IsReparseTagNameSurrogate(tagInfo.ReparseTag)) {
                        if (isUnhandledTag) {
                            /* Traversing previously failed for either this link
                               or its target. */
                            SetLastError(ERROR_CANT_ACCESS_FILE);
                            retval = -1;
                            goto cleanup;
                        }
                    /* Traverse a non-link, but not if traversing already failed
                       for an unhandled tag. */
                    } else if (!isUnhandledTag) {
                        CloseHandle(hFile);
                        return win32_xstat_impl(path, result, TRUE);
                    }
                }
            }
    
            fileType = GetFileType(hFile);
            if (fileType != FILE_TYPE_DISK) {
                if (fileType == FILE_TYPE_UNKNOWN && GetLastError() != 0) {
                    retval = -1;
                    goto cleanup;
                }
                DWORD fileAttributes = GetFileAttributesW(path);
                memset(result, 0, sizeof(*result));
                if (fileAttributes != INVALID_FILE_ATTRIBUTES &&
                    fileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
                    /* \\.\pipe\ or \\.\mailslot\ */
                    result->st_mode = _S_IFDIR;
                } else if (fileType == FILE_TYPE_CHAR) {
                    /* \\.\nul */
                    result->st_mode = _S_IFCHR;
                } else if (fileType == FILE_TYPE_PIPE) {
                    /* \\.\pipe\spam */
                    result->st_mode = _S_IFIFO;
                }
                /* FILE_TYPE_UNKNOWN, e.g. \\.\mailslot\waitfor.exe\spam */
                goto cleanup;
            }
    
            if (!GetFileInformationByHandle(hFile, &fileInfo)) {
                error = GetLastError();
                if (error != ERROR_INVALID_PARAMETER &&
                    error != ERROR_INVALID_FUNCTION &&
                    error != ERROR_NOT_SUPPORTED) {
                    retval = -1;
                    goto cleanup;
                }
                /* Volumes and physical disks are block devices, e.g.
                   \\.\C: and \\.\PhysicalDrive0. */
                memset(result, 0, sizeof(*result));
                result->st_mode = 0x6000; /* S_IFBLK */
                goto cleanup;
            }
        }
    _Py_attribute_data_to_stat(&fileInfo, tagInfo.ReparseTag, result);
    
        if (!(fileInfo.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)) {
            /* Fix the file execute permissions. This hack sets S_IEXEC if
               the filename has an extension that is commonly used by files
               that CreateProcessW can execute. A real implementation calls
               GetSecurityInfo, OpenThreadToken/OpenProcessToken, and
               AccessCheck to check for generic read, write, and execute
               access. */
            const wchar_t *fileExtension = wcsrchr(path, '.');
            if (fileExtension) {
                if (_wcsicmp(fileExtension, L".exe") == 0 ||
                    _wcsicmp(fileExtension, L".bat") == 0 ||
                    _wcsicmp(fileExtension, L".cmd") == 0 ||
                    _wcsicmp(fileExtension, L".com") == 0) {
                    result->st_mode |= 0111;
                }
            }
        }
    
    cleanup:
        if (hFile != INVALID_HANDLE_VALUE) {
            CloseHandle(hFile);
        }
    
        return retval;
    }

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 19, 2019

    Here are two additional differences between mount points and symlinks:

    (1) A mount point in a remote path is always evaluated on the server and restricted to devices that are local to the server. So if we handle a mount point as if it's a POSIX symlink that works with readlink(), then what are we to do with the server's drive "Z:"? Genuine symlinks are evaluated on the client, so readlink() always makes sense. (Though if we resolve a symlink manually, then we're bypassing the system's R2L symlink policy.)

    (2) A mount point has its own security that's checked in addition to the security on the target directory when it's reparsed. In contrast, security set on a symlink is not checked when the link is reparsed, which is why icacls.exe implicitly resolves a symlink when setting and viewing security unless the /L option is used.

    • if it's a directory junction, call os.stat instead and return that > (???)

    I wanted lstat in Windows to traverse mount points by default (but I gave up on this), as it does in Unix, because a mount point behaves like a hard name grafting in a path. This is important for relative symlinks that use ".." components to traverse above their parent directory. The result is different from a directory symlink that targets the same path.

    A counter-argument (in favor of winlinks) is that a mount point is still ultimately a name-surrogate reparse point, so, unlike a hard link, its existence doesn't prevent the directory from being deleted. It's left in place as a dangling link if the target is deleted or the device is removed from the system. Trying to follow it fails with ERROR_PATH_NOT_FOUND or ERROR_FILE_NOT_FOUND.

    Also, handling a mount point as a directory by default would require an additional parameter because in some cases we need to be able to open a junction instead of traversing it, such as to implement shutil.rmtree to behave like CMD's rmdir /s.

    Another place identifying a mount point is required, unfortunately, is in realpath(). Ideally we would be able to handle mount points as just directories. The problem is that NT allows a mount point to target a symlink, something that's not allowed in Unix. Traversing the mount point is effectively the same as traversing the symlink. So we have to read the mount-point target, and if it's a symlink, we have to read and evaluate it. (Consequently it seems that getting the real path for a remote path is an intractable problem when mount points are involved. We can only get the final path.)

    ---

    Even without the addition of a new parameter, we may still want to limit the definition of 'link' in Windows lstat to name-surrogate reparse points, i.e. winlinks. Reparse points that aren't name surrogates don't behave like links. They behave like the file itself, and reparsing may automatically replace the reparse point with the real file. Some of them are even directories that have the directory bit (28) set in the tag value, which means they're allowed to contain other files. (Without the directory tag bit, setting a reparse point on a non-empty directory should fail.)

    The counter-argument to changing lstat to only open winlinks is that changing the meaning of 'link' in lstat is too disruptive to existing software that may depend on the old behavior, i.e. opening any reparse point. I think the use cases for opening non-links are rare enough that it's not beyond the pale to change this behavior in 3.8 or 3.9.

    Right, but is that because they deliberately want the junction
    to be treated like a file? Or because they want it to be treated
    like the directory is really right there?

    For copytree it makes sense to traverse a mount point as a directory. We can't reliably copy a mount point. In Unix, even when a volume mount or bind mount can be detected, there's no standard way to clone it to a new mount point, and even if there were, that would require super-user access. In Windows, we could wrap CreateDirectorExW, which can copy a mount point, but it requires administrator access to copy a volume mount point (i.e. "\\\\?\\Volume{...}\\"), for which it calls SetVolumeMountPointW in order to update the mount-point manager in the kernel.

    We also have a limited ability to create mount points via _winapi.CreateJunction, but it's buggy in corner cases and incomplete. It suffices for the reason it was added -- testing the ability to delete a junction via os.remove().

    os.rmdir() already does special things to behave like a junction
    rather than the real directory,

    This is similar in spirit to Unix, except Unix refuses to delete a mount point. For example, if we have a Unix bind mount to a non-empty directory, rmdir() fails with EBUSY. On the other hand, rmdir() on the real directory fails with ENOTEMPTY. If Unix handled the mount point as if it's just the mounted directory, I'd expect the error to be the same.

    It's not particularly special in Windows unless it's a volume mount point. Then RemoveDirectoryW tries to call DeleteVolumeMountPointW. This could be a case where it would fail to remove a mount point, just like Unix. But the internal DeleteVolumeMountPointW call is allowed to fail if the caller doesn't have access to update the mount-point manager, in which case it removes the junction anyway.

    The consequence of failing to update the mount-point manager is that GetFinalPathNameByHandleW calls will subsequently return a non-existing path for a volume that was mounted only in the deleted folder (i.e. the volume isn't also assigned a drive letter). Thus we can't assume the result from GetFinalPathNameByHandleW exists. This just pertains to volume mount points, which are special to the mount-point manager because it uses them to translate a native device path into a canonical DOS path. Bind mount points have no special significance to the mount-point manager.

    the islink/readlink/symlink process is going to be problematic on
    Windows since most users can't create symlinks.

    Then copying the symlink fails, which I think is better than silently transforming the behavior from a mount point to a symlink. Defensive code can fall back on physically copying the target file or directory.

    The latter is the default behavior for copytree. It's only an issue if code calls copytree(src, dst, symlinks=True).

    However, it's always a concern with shutil.move(), which attempts to move a file via os.rename. This fails for a cross-volume rename. Then if islink() is true, it falls back on os.symlink(os.readlink(src), real_dst) and os.unlink(src).

    (On my own systems, I grant the symlink privilege to the Authenticated Users group, which allows symlink creation by standard users and administrators -- elevated or not. But in general, a fear of symlinks is warranted, even in Unix.)

    I'm proposing to fix the inconsistency by fixing the flags. Your
    proposal is to fix the inconsistency by generating a new error in
    unlink()? (Just clarifying.)

    unlink() didn't used to remove junctions prior to 3.5 (see bpo-18314). Instead of rolling back the change, or conflating the meaning of S_IFLNK, a counter-proposal is to harmonize unlink with the proposed change to lstat, i.e. to allow removing all name-surrogate directories. A name-surrogate directory cannot have children in the directory itself, so allowing it for os.unlink is in the spirit of the function, even if doing so is inconsistent with the literal specification.

    This is documented in ntifs.h:

    D [bit 28] is the directory bit. When set to 1, indicates that any
    directory with this reparse tag can have children. Has no special
    meaning when used on a non-directory file. Not compatible with the
    name surrogate bit [bit 29].
    

    Regarding the directory bit, the registered tags with this bit are IO_REPARSE_TAG_CLOUD*, IO_REPARSE_TAG_WCI_1, and IO_REPARSE_TAG_PROJFS (for projected file systems).

    Currently Windows shutil.rmtree traverses into junctions and deletes
    everything, though it then succeeds to delete the junction.

    That's like Unix mount-point behavior, except Windows allows a volume mount point to be deleted (not just a bind mount point), despite negative consequences to API functions such as GetFinalPathNameByHandleW if the user isn't allowed to update the system database of volume mount points.

    An issue here, and with all code that walks a tree (especially destructively), is the link behavior of mount points. Bind mount points have the same problem in both Unix and Windows. For example, shutil.rmtree will fail to remove a mount point that targets a directory that it already removed. It's a different OSError in Unix vs Windows (EBUSY vs ENOENT or ERROR_PATH_NOT_FOUND), but an error all the same. That in itself is not an argument to handle a junction as a symlink, because it's still a mount point that behaves as such, even if someone is using it as a symlink. However, it is an argument for special handling of winlinks, which would allow the Windows implementation to behave better than Unix, IMO, in addition to helping Windows users that are forced to use mount points instead of symlinks.

    With my change, rmtree() directly on a junction now raises (could be
    fixed?) but rmtree on a directory containing a junction will remove
    the junction without touching the target directory. So I think we're
    both happy about this one.

    Changing rmtree to work on a target directory that claims to be a symlink would require special casing Windows in shutil.rmtree. But in general this is a problem that affects all code that looks for symlinks, not just code in the standard library.

    If the meaning of S_IFLNK remains the same, then existing code has the option of being upgraded to delete directory winlinks without traversing them, but nothing is forced on them. In this case, for example, we could wrap the os.scandir call:

        if not _WINDOWS:
            _rmtree_unsafe_scandir = os.scandir
        else:
            import contextlib
    
            def _rmtree_unsafe_scandir(path):
                try:
                    st = os.lstat(path)
                    attr, tag = st.st_file_attributes, st.st_reparse_tag
                except OSError:
                    attr = tag = 0
                if (attr & stat.FILE_ATTRIBUTE_DIRECTORY
                      and attr & stat.FILE_ATTRIBUTE_REPARSE_POINT
                      and tag & 0x2000_0000): # IsReparseTagNameSurrogate
                    return contextlib.nullcontext([])
                else:
                    return os.scandir(path)

    For a directory winlink, the above _rmtree_unsafe_scandir function returns a context manager that yields an empty list, so _rmtree_unsafe skips to os.rmdir(path). This reproduces the behavior of CMD's rmdir /s, which will not traverse any name-surrogate reparse point (it checks the tag for the name-surrogate bit) even if the reparse point is the target directory.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 19, 2019

    Thanks for the code snippet, that helped me a lot (and since you went to the trouble of fixing other bugs, I guess I'll have to merge it into my PR now).

    Any particular reason you did GetFileAttributesW(path) in the non-FILE_TYPE_DISK case when we have the hFile around?

    I'm trying to get one more opinion from a colleague on setting S_IFLNK for all name surrogate reparse points vs. only symlinks by default (the Python/fileutils.c change, and implicitly the fixes to Lib/shutil.py). I might try and get some broader opinions as well on whether "is_dir() is true, do you suspect it could be a junction" vs "is_link() is true, do you suspect it could be a junction", since that is what it really comes down to. (We need to make changes to shutil to match Explorer anyway - rmtree should not recurse, and copytree should.)

    However, the minimal change is to leave S_IFLNK only for symlinks, so unless I get a strong case for treating all name surrogates as links, I'll revert to that.

    @eryksun
    Copy link
    Contributor

    eryksun commented Aug 19, 2019

    Any particular reason you did GetFileAttributesW(path) in the
    non-FILE_TYPE_DISK case when we have the hFile around?

    The point of getting the file attributes is to identify the root directory of the namedpipe and mailslot file systems. For example, os.listdir('//./pipe') works because we append "\\*.*" to the path.

    GetFileInformationByHandle[Ex] forbids a pipe handle, for reasons that may no longer be relevant in Windows 10 (?). I remembered the restriction in the above case, but it seems I forgot about it when querying the tag. So the order of the GetFileInformationByHandleEx and GetFileType blocks actually needs to be flipped. That would be a net improvement anyway since there's no point in querying a reparse tag from a device that's not a file system (namedpipe and mailslot are 'file systems', but only at the most basic level).

    I can't imagine there being a problem with querying FileBasicInfo to get the file attributes. I just checked that it works fine with "//./pipe/" and "//./mailslot/" -- at least in Windows 10. Anyway, GetFileAttributesW uses a query-only open that doesn't create a real file object or even require an IRP usually, so it's not adding much cost compared to querying FileBasicInfo using the handle.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 19, 2019

    So the order of the GetFileInformationByHandleEx and GetFileType blocks actually needs to be flipped.

    I've done that now.

    And thanks for confirming. That was my suspicion, but I wasn't sure if you knew something here that I didn't (v. likely!).

    @zooba
    Copy link
    Member Author

    zooba commented Aug 20, 2019

    The latest PR also fixes bpo-1311 and bpo-20541 properly (os.path.exists("NUL") now returns True).

    @zooba
    Copy link
    Member Author

    zooba commented Aug 21, 2019

    So my colleagues confirmed that they deliberately represent junction points as symlinks within WSL, including translating the target to the mounted location (assuming it is mounted) and letting the Linux code traverse it normally. They also said they haven't heard any feedback suggesting it causes any trouble.

    However, the more I've thought about the implications of islink() returning true for a junction, the more I've come around to Eryk's point of view, so I'm going to merge this as is (the PR currently only sets S_IFLNK for actual symlinks).

    That said, nt.readlink() will still be able to read the target of a junction, and code like I have in PR 15287 that uses readlink() to follow links will resolve them (noting that that implementation of realpath() attempts to let the OS follow it first). I think that covers the intended use cases best.

    The only updates left are a couple of docs here, but I'll finish up bpo-9949 first and rebase on those changes as well.

    @zooba
    Copy link
    Member Author

    zooba commented Aug 21, 2019

    New changeset df2d4a6 by Steve Dower in branch 'master':
    bpo-37834: Normalise handling of reparse points on Windows (GH-15231)
    df2d4a6

    @zooba
    Copy link
    Member Author

    zooba commented Aug 21, 2019

    New changeset 9eb3d54 by Steve Dower in branch '3.8':
    bpo-37834: Normalise handling of reparse points on Windows (GH-15370)
    9eb3d54

    @zooba
    Copy link
    Member Author

    zooba commented Aug 22, 2019

    Adding a small fix for the Win7 buildbots in PR 15377

    @zooba
    Copy link
    Member Author

    zooba commented Aug 22, 2019

    New changeset 374be59 by Steve Dower in branch 'master':
    bpo-37834: Fix test on Windows 7 (GH-15377)
    374be59

    @miss-islington
    Copy link
    Contributor

    New changeset 967d625 by Miss Islington (bot) in branch '3.8':
    bpo-37834: Fix test on Windows 7 (GH-15377)
    967d625

    @zooba
    Copy link
    Member Author

    zooba commented Aug 22, 2019

    That should be all the buildbot issues fixes, so I'm marking this resolved and will wait for the inevitable 3.8.0b4 feedback!

    @zooba zooba closed this as completed Aug 22, 2019
    @ambv
    Copy link
    Contributor

    ambv commented Aug 29, 2019

    There's a bug on macOS that is blocking the release regarding stat.FILE_ATTRIBUTE_REPARSE_POINT being used to check whether os.stat_result objects have the st_file_attributes attribute.

    @ambv ambv reopened this Aug 29, 2019
    @ambv
    Copy link
    Contributor

    ambv commented Aug 29, 2019

    Unit tests didn't catch it since it fails on older macOS releases.

    @ned-deily
    Copy link
    Member

    One problem seems to be that the code added for this issue assumes that the documentation is correct in implying that the stat.FILE_ATTRIBUTE_* constants (like stat.FILE_ATTRIBUTE_REPARSE_POINT) are only present on Windows. But besides being conditionally created in _stat.c, they are also undconditionally defined in stat.py on all platforms. That makes some of the tests in shutil.py, like:
    if hasattr(stat, 'FILE_ATTRIBUTE_REPARSE_POINT'):
    to determine which versions of _rmtree_islink and _rmtree_isdir to define problematic.

    @ned-deily
    Copy link
    Member

    ... and the other important difference is that older versions of macOS do not support fd functions so _use_fd_functions is false and the alternate path is taken in shutil.rmtree, the path that calls _rm_tree_islink which fails.

    >>> shutil.rmtree('a')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/shutil.py", line 718, in rmtree
        if _rmtree_islink(path):
      File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/shutil.py", line 564, in _rmtree_islink
        (st.st_file_attributes & stat.FILE_ATTRIBUTE_REPARSE_POINT
    AttributeError: 'os.stat_result' object has no attribute 'st_file_attributes'

    @zooba
    Copy link
    Member Author

    zooba commented Aug 29, 2019

    Huh, didn't realise those were always defined in stat.py.

    Changing the test to "hasattr(... "st_file_attributes")" should be fine. I can get to it in a couple of hours if nobody else gets there first.

    @ambv
    Copy link
    Contributor

    ambv commented Aug 29, 2019

    New changeset 7fcc208 by Łukasz Langa (Ned Deily) in branch 'master':
    bpo-37834: Prevent shutil.rmtree exception (GH-15602)
    7fcc208

    @ambv
    Copy link
    Contributor

    ambv commented Aug 29, 2019

    New changeset 25a044e by Łukasz Langa in branch '3.8':
    [3.8] bpo-37834: Prevent shutil.rmtree exception (GH-15602) (bpo-15603)
    25a044e

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes OS-windows
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants