classification
Title: Please add race-free os.link and os.symlink wrapper / helper
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Michael.Felt, Tom Hale, a.badger, eryksun, rbcollins, serhiy.storchaka, taleinat
Priority: normal Keywords: patch

Created on 2019-04-18 08:44 by Tom Hale, last changed 2019-06-29 13:53 by Tom Hale.

Pull Requests
URL Status Linked Edit
PR 14464 open python-dev, 2019-06-29 13:43
Messages (14)
msg340474 - (view) Author: Tom Hale (Tom Hale) * Date: 2019-04-18 08:44
I cannot find a race-condition-free way to force overwrite an existing symlink.

os.symlink() requires that the target does not exist, meaning that it could be created via race condition the two workaround solutions that I've seen:

1. Unlink existing symlink (could be recreated, causing following symlink to fail)

2. Create a new temporary symlink, then overwrite target (temp could be changed between creation and replace.

The additional gotcha with the safer (because the attack filename is unknown) option (2) is that replace() may fail if the two files are on separate filesystems.

I suggest an additional `force=` argument to os.symlink(), defaulting to `False` for backward compatibility, but allowing atomic overwriting of a symlink when set to `True`.

I would be willing to look into a PR for this.

Prior art:  https://stackoverflow.com/a/55742015/5353461
msg340526 - (view) Author: Tom Hale (Tom Hale) * Date: 2019-04-19 07:40
The most correct work-around I believe exists is:

(updates at: https://stackoverflow.com/a/55742015/5353461)

    def symlink_force(target, link_name):
        '''
        Create a symbolic link pointing to target named link_name.
        Overwrite target if it exists.
        '''
    
        # os.replace may fail if files are on different filesystems.
        # Therefore, use the directory of target
        link_dir = os.path.dirname(target)
    
        # os.symlink requires that the target does NOT exist.
        # Avoid race condition of file creation between mktemp and symlink:
        while True:
            temp_pathname = tempfile.mktemp(suffix='.tmp', \
                            prefix='symlink_force_tmp-', dir=link_dir)
            try:
                os.symlink(target, temp_pathname)
                break  # Success, exit loop
            except FileExistsError:
                time.sleep(0.001)  # Prevent high load in pathological conditions
            except:
                raise
        os.replace(temp_pathname, link_name)

An unlikely race condition still remains: the symlink created at the randomly-named `temp_path` could be modified between creation and rename/replacing the specified link name.

Suggestions for improvement welcome.
msg340546 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-04-19 14:37
If the symlink can be recreated, it can also be changed after creation.
msg341373 - (view) Author: Tom Hale (Tom Hale) * Date: 2019-05-04 05:45
Yes, but by default (because of difficulty) people won't check for this case:

1. I delete existing symlink in order to recreate it
2. Attacker watching symlink finds it deleted and recreates it
3. I try to create symlink, and an unexpected exception is raised
msg341376 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-05-04 06:44
So what? Detected problem is better than non-detected problem. If and unexpected exception causes troubles in your code, it is up to you what to do with it: silence it, terminate an application, try to recreate a symlink in other place, etc. In any case this will not solve bigger problem that you have: attacker is able to change your symlinks.
msg341753 - (view) Author: Toshio Kuratomi (a.badger) * Date: 2019-05-07 15:32
Additionally, the os module is supposed to closely follow the behaviour of the underlying operating system functions: https://docs.python.org/3/library/os.html  

> The design of all built-in operating system dependent modules of Python is such that as long as the same functionality is available, it uses the same interface; [..]

The POSIX symlimk function on which this is based has made the decision not to overwrite an existing symlink (See the EEXIST error in https://pubs.opengroup.org/onlinepubs/009695399/functions/symlink.html or man pages on symlink from one of the Linux distros: http://man7.org/linux/man-pages/man2/symlink.2.html )   As with many other POSIX-derived filesystem functions, the technique you propose, relying on atomic filesystem renames) would seem to be the standard method of writing race-resistant code.  Due to the mandate for the os module, it feels like that belongs in a utility function in custom code or another module rather than something for the os module.

A couple of thoughts on what you could do instead:

* A collection of utility functions that fixed race-conditions in filesystem handling could make a nice third party module on pypi.

* The stdlib shutil module provides an API that's supposed to be easier to implement common use cases than the os.* functions.  Perhaps you could propose your idea to the python-ideas mailing list as a new function in that module and see what people think of that?
msg342434 - (view) Author: Tom Hale (Tom Hale) * Date: 2019-05-14 07:01
Thanks Toshio Kuratomi, I raised it on the mailing list at:

https://code.activestate.com/lists/python-ideas/55992/
msg344408 - (view) Author: Tom Hale (Tom Hale) * Date: 2019-06-03 10:36
Serhiy wrote

> Detected problem is better than non-detected problem.

I agree. I also assert that no problem (via a shutil wrapper) is better than a detected problem which may not be handled.

While it's up to the programmer to handle exceptions, it's only systems programmers who will realise that there is a sometimes-occurring exception which should be handled.

So most people won't handle it. They don't know that they need to.

The shutil os.link and os.symlink wrappers under discussion on python-ideas prevent non-system-programmers from needing to know about the race-condition case (and correctly handling it).

If something is difficult to get right, then let's put it in shutil and save everyone reinventing the wheel.

3 specific cases in which an atomic link replacement would be useful are listed here:

https://code.activestate.com/lists/python-ideas/56195/
msg344501 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2019-06-04 00:13
I'd like to add a few notes; please do consider Windows interactions here - Windows does not have the same model for inode replacement that Posix has.

Secondly, I note that the thread model discussed here hasn't been particular well articulated. In particular the vagaries of directories with the sticky bit set are quite different to those where attacker and victim share group permissions : TOCTTOU attacks do imply that post-atomic operation attacks will be possible, and I disagree with the analysis by Serhiy suggesting that they are necessarily so.

However I also agree with Toshio that the os module is not the right place to provide a more secure API: we have to have the basic primitive exposed to Python code somewhere, otherwise the higher level APIs such as you'd like to use are not creatable.

My suggestion is that you put a module up on PyPI that provides the secure behaviour necessary, get some feedback on that, get some cross-platform experience, and then, if desired, propose it for inclusion in shutil or similar in the stdlib.

I'm also going to retitle this - because actually, os.link and os.symlink aren't racy *per se* on Posix - its how they are used that is racy: and the very fact that a secure variant can be written (however ugly) is demonstration of that.
msg344575 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2019-06-04 14:14
> I'd like to add a few notes; please do consider Windows interactions 
> here - Windows does not have the same model for inode replacement that 
> Posix has.

Certainly Windows doesn't have the same model for inode replacement since it doesn't have inodes. ;-)

Like POSIX, we can't create a symlink and replace it in one step in Windows. A reparse point can only be set on an existing file. WINAPI CreateSymbolicLink thus requires two system calls: NtCreateFile (file or directory) and NtFsControlFile: FSCTL_SET_REPARSE_POINT. Sensibly, CreateSymbolicLink does not support a flag to allow replacing an existing file or empty directory, since overall the operation isn't atomic.

Like POSIX, we can do an atomic rename within a volume device. This supports replacing an existing file, but, for backward compatibility, we can't implement the replace semantics in os.rename in Windows. Instead we have os.replace. This is a single NtSetInformatonFile: FileRenameInformation system call, with ReplaceIfExists enabled. An important caveat is that this request cannot replace an open file, even if it was opened with delete-access sharing, since that would leave the replaced file nameless. To partially address this, NTFS (but not FAT32) in Windows 10 1709 supports a new FileRenameInformationEx request. This request can combine the flags FILE_RENAME_REPLACE_IF_EXISTS and FILE_RENAME_POSIX_SEMANTICS to allow replacing an open file, if and only if it's opened with delete-access sharing. The replaced file is still accessible for read and write data operations until it's closed. The requirement for delete-access sharing means this is still not quite POSIX semantics. (Most Windows programs do not open files with delete sharing, except for temp files.) But at least it's possible for aware programs and environments to implement POSIX semantics. 

Regarding os.link, WINAPI CreateHardLink is implemented via NtSetInformationFile: FileLinkInformation. This system call also supports ReplaceIfExists, but CreateHardLink doesn't expose an option or flag for this capability. Even if we supported the system call directly (unprecedented in Windows Python), the same problem exists with open files. NTFS in Windows 10 1809 supports a new FileLinkInformationEx request. The flags FILE_LINK_REPLACE_IF_EXISTS and FILE_LINK_POSIX_SEMANTICS behave as the corresponding flags for FileRenameInformationEx.

It's unrelated, but just to round this discussion out, NTFS in Windows 10 1709 supports a new FileDispositionInformationEx request to set a file's delete disposition. With the flag FILE_DISPOSITION_POSIX_SEMANTICS, the deleted file is unlinked as soon as the requesting file reference is closed, even if the file has other references. This is in contrast to the classic FileDispositionInformation request for which the file is only unlinked once all references are closed. Again, it's not quit POSIX semantics overall since the shared delete access model still applies. But this new delete operation at least addresses the problem with malware scanners and similar programs that open files with shared delete access.
msg344911 - (view) Author: Robert Collins (rbcollins) * (Python committer) Date: 2019-06-07 09:09
Thank you @Eryk
msg344915 - (view) Author: Michael Felt (Michael.Felt) * Date: 2019-06-07 09:39
I started a reply on the python-mentoring maillist - and promised to come back here.

a) imho - the discussion about an "attacker" is only misleading for the general case. For active attacks - where an attacker has active acces to the system is not something that is the responsibility of "stdlib". On POSIX, one of the first things I would consider is adding the "STX" bit so that only root and the file owner can remove an existing "link" to an inode (a symbolic link is a "special" file whose contents is the PATH to yet another filesystem "name" regardless of it's type.) For reference, a hard-link "merely" increases the number of links to an inode - and so only works for non-directory objects within the same filesystem). Puf!

b) the "real", again imho, security issue is one of system integrity. Compare this "race" condition with a multi-user database (think of directories as the "index" to a data-record. A user that trusts the file-system index to refer to the correct file-system object (file, directory, other special file) - especially when there is no way for the user to verify the accuracy of the symbolic link. Any application, especially if it runs as "root" needs to be extremely cautious with "overwriting" existing "records". It is asif as database admin just changes where one database record references - as is, I cannot change the data it references - so I just change what it goes to - where I can change the data. -- As I think about it - an application that is dependent on symbolic links has an inherent weakness.

In short - I would be against this - if security is the primary argument for developing it. I would rather be notified that there is an existing symbolic link - where there should not be - and consider later action.

a) an application, running as root, can override any security measure I can come up with. The is the core "weakness" of UNIX/POSIX access control. When not running as root - there are mechanisms that can block the removal of any directory entry (the unlink() call).

b) so, if you want to consider adding a "user-friendly" force option, just as POSIX ln does (whether for hard or soft links) - then that is, imho - a different discussion. I would tend to be "against" making it too easy, because an application coded with "Force=true" would never get the feedback that something already exists. If the currect implementation returns the a "fatal" error - that needs to be fixed.

In short, I do not think it is the task of Python to rewrite the behavior of the system calls being used - and/because I fear lazy programmers (being there, done that!).

"ls -sf xxx yyy" and/or "ln -f xxx yyy" is a convenience. Using it as a default is "bad-practice" (imho) - and I fear programmers would use this new "feature" and move to "bad practice"

Michael
msg346717 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2019-06-27 08:35
I think there is a general flaw with the suggested implementation approach.

An infinite loop as currently suggested seems dangerous to me.  For example, it could cause a program to stall indefinitely in some unforeseen edge-case.  I don't think this would be acceptable for something in the stdlib.

On the other hand, limiting this loop to a finite number of attempts would re-introduce the necessity of handling the case where creating the symlink fails. Users of the function would still need to deal with this case (or risk unhandled exceptions).

In either case, this wouldn't give users the simplicity and peace of mind that a truly atomic operation would.  ISTM that only using actually atomic operations, as provided by the underlying file system and/or OS, would allow us to provide that.

Given the above, I must say that I don't see this providing enough utility to justify inclusion in the stdlib, unless a better implementation approach can be found.
msg346882 - (view) Author: Tom Hale (Tom Hale) * Date: 2019-06-29 13:53
I've created a PR here:

https://github.com/python/cpython/pull/14464

Only shutil.symlink is currently implemented.

Feedback sought from Windows users.

@Michael.Felt please note that `overwrite=False` is the default.

@taleinat I hope that the new implementation addresses the infinite loop concern.

Please be both thorough and gentle - this is my first PR.
History
Date User Action Args
2019-06-29 13:53:53Tom Halesetmessages: + msg346882
2019-06-29 13:43:58python-devsetkeywords: + patch
stage: patch review
pull_requests: + pull_request14280
2019-06-27 08:35:41taleinatsetnosy: + taleinat
messages: + msg346717
2019-06-27 08:26:48taleinatsetcomponents: + Library (Lib)
versions: + Python 3.9, - Python 3.7
2019-06-07 09:39:45Michael.Feltsetmessages: + msg344915
2019-06-07 09:09:29rbcollinssetmessages: + msg344911
2019-06-07 09:04:19Michael.Feltsetnosy: + Michael.Felt
2019-06-04 14:14:42eryksunsetnosy: + eryksun
messages: + msg344575
2019-06-04 00:13:26rbcollinssetnosy: + rbcollins

messages: + msg344501
title: Race conditions due to os.link and os.symlink POSIX specification -> Please add race-free os.link and os.symlink wrapper / helper
2019-06-03 10:36:04Tom Halesetmessages: + msg344408
2019-06-03 10:02:11Tom Halesettitle: Allow os.symlink(src, target, force=True) to prevent race conditions -> Race conditions due to os.link and os.symlink POSIX specification
2019-05-14 10:04:06Tom Halesettype: security -> enhancement
2019-05-14 07:01:37Tom Halesetmessages: + msg342434
2019-05-07 15:32:39a.badgersetnosy: + a.badger
messages: + msg341753
2019-05-04 06:44:36serhiy.storchakasetmessages: + msg341376
2019-05-04 05:45:38Tom Halesetmessages: + msg341373
2019-04-19 14:37:26serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg340546
2019-04-19 07:40:27Tom Halesettype: security
messages: + msg340526
2019-04-18 08:44:47Tom Halecreate