This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

classification
Title: pathlib.WindowsPath.resolve(strict=False) returns absolute path only if at least one component exists
Type: behavior Stage: resolved
Components: Library (Lib), Windows Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: duplicate
Dependencies: Superseder: pathlib.Path.resolve(strict=False) returns relative path on Windows if the entry does not exist
View: 38671
Assigned To: Nosy List: eryksun, mliska, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal Keywords:

Created on 2017-12-27 20:13 by mliska, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (4)
msg309102 - (view) Author: Martin Liska (mliska) Date: 2017-12-27 20:13
The documentation for pathlib.Path.resolve says: "Make the path absolute, resolving any symlinks."

On Windows, the behavior doesn't always match the first part of the statement.

Example:
Consider a system with an existing but empty directory C:\Test.
When the interpreter is running in C:\, resolve behaves like so:

>>> os.path.realpath(r'Test\file')
'C:\\Test\\file'
>>> WindowsPath(r'Test\file').resolve(strict=False)
WindowsPath('C:/Test/file')

When running the interpreter inside C:\Test it instead behaves in the following manner:

>>> os.path.realpath('file')
'C:\\Test\\file'
>>> WindowsPath('file').resolve(strict=False)
WindowsPath('file')

Resolving a path object specifying a non-existent relative path results in an identical (relative) path object.
This is also inconsistent with the behavior of os.path.realpath as demonstrated.

The root of the issue is in the pathlib._WindowsFlavour.resolve method at lines 193, 199 and 201.
If at least one component of the path gets resolved at line 193 by the expression self._ext_to_normal(_getfinalpathname(s)), the path returned at line 201 will be joined from the absolute, resolved part and the unresolved remainder.
If none of the components get resolved then the path will be returned at line 199 as passed into the function.
msg309164 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-12-29 08:39
If none of the components of a relative path exist, then splitting off the head returns an empty string in the second-to-last pass through the while loop. In the last pass, _getfinalpathname("") raises FileNotFoundError, and ultimately resolve() ends up returning `path`. To avoid this, the empty string needs to be replaced with a ".". Then the last pass can resolve the working directory.
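The effect described above can be reproduced without a Windows machine. The following is a minimal sketch, not the pathlib source: `resolve_nonstrict`, `fake_final_path`, and the `empty_as_dot` flag are hypothetical names introduced here, and `fake_final_path` stands in for `_getfinalpathname` on an assumed filesystem where only the working directory C:\Test exists.

```python
import ntpath

def resolve_nonstrict(path, final_path, empty_as_dot):
    """Simplified model of a head/tail splitting loop (illustrative only).

    final_path is a hypothetical stand-in for _getfinalpathname: it returns
    a resolved absolute path or raises FileNotFoundError.
    """
    s = path
    tail_parts = []
    while True:
        try:
            # With empty_as_dot, an empty head is resolved as "." instead.
            s = final_path(s or '.') if empty_as_dot else final_path(s)
        except FileNotFoundError:
            head, tail = ntpath.split(s)
            if head == s:
                # Nothing could be resolved: give the input back unchanged.
                return path
            s = head
            tail_parts.append(tail)
        else:
            return ntpath.join(s, *reversed(tail_parts))

def fake_final_path(s):
    # Assumed filesystem: the working directory is C:\Test and it is empty.
    if s == '.':
        return 'C:\\Test'
    raise FileNotFoundError(s)

print(resolve_nonstrict('file', fake_final_path, empty_as_dot=False))
print(resolve_nonstrict('file', fake_final_path, empty_as_dot=True))
```

In this model the first call hands back the relative input because the empty head cannot be resolved, while the second call resolves "." and joins the unresolved tail onto it.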
msg309165 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2017-12-29 08:43
resolve() has additional problems, which possibly could be addressed all at once because it's a small method and the problems are closely related.

For an empty path it returns os.getcwd(). I don't think this case is possible in the normal way a Path gets constructed. Anyway, returning the unresolved working directory is incorrect. Windows is not a POSIX OS. WinAPI [Set,Get]CurrentDirectory does not ensure the working directory is a resolved path. An empty path should be replaced with "." and processed normally.

It fails when access is denied. _getfinalpathname needs to open a handle, and CreateFile requires at least the right to read file attributes and synchronize. For example, resolve() fails for a path in another user's profile, which grants access only to the user, system, and administrators.

It doesn't keep the "\\\\?\\" extended-path prefix when the source path already has it. It should only strip the prefix if the source path doesn't have it. That said, the resolved path may actually require it (i.e. long paths, DOS device names, trailing spaces), so maybe it should never be removed.

Its behavior is inconsistent for invalid paths. For example, if "C:/Temp" exists and "C:/Spam" does not, then resolving "C:/Temp/bad?" raises whereas "C:/Spam/bad?" does not. IMO, it's simpler if neither raises in non-strict mode, at least not in _WindowsFlavour.resolve. Malformed paths will slip through, but raising an exception in those cases should be the job of the constructor. 

It fails to handle device paths. For example, "C:/Temp/nul" resolves to the "NUL" device if "C:/Temp" exists. In this case _getfinalpathname will typically fail, either due to an invalid function or an invalid parameter. These errors, along with the error for an invalid filename, get lumped into the CRT's default EINVAL error code.

Also, GetFinalPathNameByHandle was added in Vista, so _getfinalpathname is always available in 3.5+. There's no need to use it conditionally.

Here's a prototype that addresses these issues:

    import nt
    import os
    import errno

    def _is_extended(path):
        return path.startswith('\\\\?\\')

    def _extended_to_normal(path):
        if _is_extended(path):
            path = path[4:]
            if path.startswith('UNC\\'):
                path = '\\' + path[3:]
        return path

    def _getfinalpathname(path):
        if not path:
            path = '.'
        elif _is_extended(path):
            return nt._getfinalpathname(path)
        return _extended_to_normal(nt._getfinalpathname(path))

    def resolve(path, strict=False):
        s = str(path)
        if strict:
            return _getfinalpathname(s)
        # Non-strict mode resolves as much as possible while retaining
        # tail components that cannot be resolved if they're missing,
        # inaccessible, or invalid.
        tail_parts = []
        while True:
            try:
                s = _getfinalpathname(s)
                break
            except OSError as e:
                if not (isinstance(e, (FileNotFoundError, PermissionError)) or
                        e.errno == errno.EINVAL):
                    raise
            head, tail = os.path.split(s)
            if head == s:
                return path.absolute()
            s = head
            tail_parts.append(tail)
        if tail_parts:
            s = os.path.join(s, *reversed(tail_parts))
        return s
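The two prefix helpers in the prototype are pure string functions, so their behavior can be checked on any platform. The sketch below copies them under the names `is_extended` and `extended_to_normal` (renamed here only to keep the block self-contained) and runs them on a few sample paths chosen for illustration.

```python
def is_extended(path):
    # True when the path carries the "\\?\" extended-path prefix.
    return path.startswith('\\\\?\\')

def extended_to_normal(path):
    # Strip "\\?\" and map "\\?\UNC\server\share" back to "\\server\share".
    if is_extended(path):
        path = path[4:]
        if path.startswith('UNC\\'):
            path = '\\' + path[3:]
    return path

samples = [
    '\\\\?\\C:\\Temp',               # extended drive path
    '\\\\?\\UNC\\server\\share',     # extended UNC path
    'C:\\Temp',                      # already a normal path
]
for p in samples:
    print(p, '->', extended_to_normal(p))
```

A path without the prefix passes through unchanged, which is what lets the prototype's _getfinalpathname decide whether to strip based on the source path alone.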
msg388799 - (view) Author: Eryk Sun (eryksun) * (Python triager) Date: 2021-03-16 02:27
bpo-38671 has PR 17716 pending approval, which addresses the problem in msg309102 by ensuring that a non-strict resolve begins by getting the absolute path via nt._getfullpathname().
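The idea of making the path absolute before resolving can be sketched as follows. This is an illustration only and not the code in the PR: `absolute_first` is a hypothetical helper, the working directory is passed in explicitly so the example runs on any platform, and joining with ntpath is a rough stand-in for what nt._getfullpathname() would return.

```python
import ntpath

def absolute_first(path, cwd):
    """Hypothetical stand-in for nt._getfullpathname().

    Joins the path onto an explicitly supplied working directory and
    normalizes it; the real function asks Windows for the full path.
    """
    return ntpath.normpath(ntpath.join(cwd, path))

# A relative path with no existing component becomes absolute up front,
# so a later head/tail splitting loop starts from an absolute string.
print(absolute_first('file', 'C:\\Test'))

# An already absolute path is left as it is.
print(absolute_first('C:\\Spam\\bad', 'C:\\Test'))
```

Starting from an absolute string means that even if no component can be resolved, the value that falls out of the non-strict path is absolute rather than the relative input.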
History
Date                 User     Action  Args
2022-04-11 14:58:56  admin    set     github: 76615
2021-03-16 02:27:08  eryksun  set     status: open -> closed
                                      superseder: pathlib.Path.resolve(strict=False) returns relative path on Windows if the entry does not exist
                                      messages: + msg388799
                                      resolution: duplicate
                                      stage: needs patch -> resolved
2017-12-29 08:43:19  eryksun  set     messages: + msg309165
2017-12-29 08:39:07  eryksun  set     versions: + Python 3.7
                                      nosy: + paul.moore, tim.golden, eryksun, zach.ware, steve.dower
                                      messages: + msg309164
                                      components: + Windows
                                      stage: needs patch
2017-12-27 20:13:36  mliska   create