Author eryksun
Recipients eryksun, pablogsal, ronaldoussoren
Date 2020-07-28.03:37:20
Content
I'm trying to give os.link() and follow_symlinks the benefit of the doubt, but the implementation just seems buggy to me. 

POSIX says that "[i]f path1 names a symbolic link, it is implementation-defined whether link() follows the symbolic link, or creates a new link to the symbolic link itself" [1]. In Linux, link() does not follow symlinks. One has to call linkat() with AT_SYMLINK_FOLLOW:

    AT_SYMLINK_FOLLOW (since Linux 2.6.18)
        By default, linkat(), does not dereference oldpath if it is a 
        symbolic link (like link()). The flag AT_SYMLINK_FOLLOW can be
        specified in flags to cause oldpath to be dereferenced if it is
        a symbolic link. 

The behavior is apparently the same in FreeBSD [2]. 

Thus the following implementation in os.link() seems buggy.

#ifdef HAVE_LINKAT
    if ((src_dir_fd != DEFAULT_DIR_FD) ||
        (dst_dir_fd != DEFAULT_DIR_FD) ||
        (!follow_symlinks))
        result = linkat(src_dir_fd, src->narrow,
            dst_dir_fd, dst->narrow,
            follow_symlinks ? AT_SYMLINK_FOLLOW : 0);
    else
#endif /* HAVE_LINKAT */
        result = link(src->narrow, dst->narrow);
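
To make the branch in the snippet above easier to read, here is a small Python model of it. This is only a sketch: link_call is a hypothetical helper, None stands in for DEFAULT_DIR_FD, and it assumes the fallback branch is a plain link() call, as the later discussion of link() implies. It only reports which C call would be made; it does not create any links.

```python
def link_call(src_dir_fd=None, dst_dir_fd=None, follow_symlinks=True):
    """Describe the C call chosen by the quoted branch (illustrative model only)."""
    # None plays the role of DEFAULT_DIR_FD in this model.
    if src_dir_fd is not None or dst_dir_fd is not None or not follow_symlinks:
        flags = "AT_SYMLINK_FOLLOW" if follow_symlinks else "0"
        return f"linkat(flags={flags})"
    # With all defaults, linkat() is skipped and follow_symlinks is never consulted.
    return "link()"


for kwargs in (
    {},
    {"follow_symlinks": False},
    {"src_dir_fd": 3},
    {"src_dir_fd": 3, "follow_symlinks": False},
):
    print(kwargs, "->", link_call(**kwargs))
```

In this model, the all-defaults case is the only one that reaches link(), which is exactly the case where follow_symlinks=True is supposed to apply.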

The only way that the value of follow_symlinks matters in Linux is if src_dir_fd or dst_dir_fd is used with a real file descriptor (i.e. not DEFAULT_DIR_FD, which is AT_FDCWD). Otherwise, the default True value of follow_symlinks is an outright lie. For example:

    >>> os.link in os.supports_follow_symlinks
    True
    >>> open('spam', 'w').close()
    >>> os.symlink('spam', 'spamlink1')
    >>> os.link('spamlink1', 'spamlink2')

spamlink2 was created as a hardlink to spamlink1, not its target, i.e. it's a symlink:
 
    >>> os.lstat('spamlink1').st_ino == os.lstat('spamlink2').st_ino
    True
    >>> os.readlink('spamlink2')
    'spam'

In contrast, if src_dir_fd is passed, then follow_symlinks=True is implemented as advertised (via AT_SYMLINK_FOLLOW):

    >>> fd = os.open('.', os.O_RDONLY)
    >>> os.link('spamlink1', 'spamlink3', src_dir_fd=fd)

spamlink3 was created as a hardlink to spam, the target of spamlink1:
  
    >>> os.lstat('spam').st_ino == os.lstat('spamlink3').st_ino
    True

That the value of an unrelated parameter -- src_dir_fd -- changes the behavior of the follow_symlinks parameter is obviously a bug that should be addressed.
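
The two interactive sessions above can be folded into one script that runs in a throwaway directory. This is a sketch, assuming a POSIX system where os.link supports src_dir_fd; linked_to is a hypothetical helper name. It reports whether the new name ended up as a link to the symlink itself or to its target, and the result of the default call may depend on the platform and Python version.

```python
import os
import tempfile


def linked_to(use_src_dir_fd):
    """Hard-link a symlink with follow_symlinks left at its default.

    Returns 'symlink' if the new name is a link to the symlink itself,
    or 'target' if it is a link to the file the symlink points to.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "spam")
        symlink = os.path.join(tmp, "spamlink1")
        new = os.path.join(tmp, "spamlink2")
        open(target, "w").close()
        os.symlink("spam", symlink)
        if use_src_dir_fd:
            # Passing a real directory descriptor forces the linkat() path.
            fd = os.open(tmp, os.O_RDONLY)
            try:
                os.link("spamlink1", new, src_dir_fd=fd)
            finally:
                os.close(fd)
        else:
            os.link(symlink, new)
        return "symlink" if os.path.islink(new) else "target"


print("default call:    ", linked_to(False))
print("with src_dir_fd: ", linked_to(True))
```

If the two lines of output differ, the interpreter in use has the inconsistency described here.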

POSIX mandates that "[i]f both fd1 and fd2 have value AT_FDCWD, the behavior shall be identical to a call to link(), except that symbolic links shall be handled as specified by the value of flag". It's already using AT_FDCWD as a default value, so the implementation of os.link() should just unconditionally call linkat() if it's available. Then the value of follow_symlinks, true or false, will be honored, with or without passing src_dir_fd or dst_dir_fd.

That said, since os.link() hasn't been working as advertised, this change needs to be accompanied by changing the default value of follow_symlinks to False. That will retain the status quo behavior for most systems, except in the rare case that src_dir_fd or dst_dir_fd is used. If it isn't changed to False, then suddenly os.link() calls will start following symlinks, whereas prior to the change they did not because link() was being called instead of linkat(). 
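
Until the behavior is consistent, a caller who cares about the distinction can spell it out. The following is a minimal sketch, not a proposed patch: hard_link is a hypothetical wrapper, it assumes a platform where linkat() is available so that follow_symlinks=False is accepted, and it resolves the source with os.path.realpath() instead of relying on AT_SYMLINK_FOLLOW, which is not atomic with respect to concurrent changes to the symlink.

```python
import os
import tempfile


def hard_link(src, dst, *, follow_symlinks):
    """Create a hard link with explicit symlink handling (illustrative wrapper)."""
    if follow_symlinks:
        # Resolve the symlink ourselves, so the result does not depend on
        # whether the underlying call follows symlinks.
        os.link(os.path.realpath(src), dst)
    else:
        # follow_symlinks=False takes the linkat() path with flags=0.
        os.link(src, dst, follow_symlinks=False)


with tempfile.TemporaryDirectory() as tmp:
    spam = os.path.join(tmp, "spam")
    spamlink1 = os.path.join(tmp, "spamlink1")
    open(spam, "w").close()
    os.symlink("spam", spamlink1)

    followed = os.path.join(tmp, "followed")
    not_followed = os.path.join(tmp, "not_followed")
    hard_link(spamlink1, followed, follow_symlinks=True)
    hard_link(spamlink1, not_followed, follow_symlinks=False)

    print(os.path.islink(followed), os.path.islink(not_followed))
```

The keyword-only follow_symlinks argument has no default here on purpose, so every call site has to state which behavior it wants.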

--- 

In Windows, CreateHardLinkW [3] is incorrectly documented as following symlinks (i.e. "[i]f the path points to a symbolic link, the function creates a hard link to the target"). Actually, it opens the file to be hard-linked with the NTAPI option FILE_OPEN_REPARSE_POINT (same as WinAPI FILE_FLAG_OPEN_REPARSE_POINT). Thus no type of reparse point is followed, including symlinks.

---

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html
[2]: https://www.unix.com/man-page/FreeBSD/2/link
[3]: https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createhardlinkw