classification
Title: resolve() fails when the path doesn't exist
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: 19887 Superseder:
Assigned To: steve.dower Nosy List: Philip Ridout, brett.cannon, davide.rizzo, july, mwagner, neologix, pitrou, python-dev, serhiy.storchaka, steve.dower, vajrasky
Priority: deferred blocker Keywords: patch

Created on 2013-11-22 17:45 by pitrou, last changed 2017-06-06 15:44 by gvanrossum. This issue is now closed.

Files
File name Uploaded Description
add_non_strict_resolve_pathlib.patch vajrasky, 2013-12-03 07:55
add_non_strict_resolve_pathlib_v2.patch vajrasky, 2013-12-10 08:58
add_non_strict_resolve_pathlib_v3.patch vajrasky, 2013-12-11 08:23
add_non_strict_resolve_pathlib_v4.patch vajrasky, 2013-12-17 02:54
Pull Requests
URL Status Linked
PR 552 closed dstufft, 2017-03-31 16:36
PR 1649 Dormouse759, 2017-05-28 13:44
Messages (25)
msg203819 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-11-22 17:45
Currently Path.resolve() raises FileNotFoundError when the path does not exist. Guido pointed out that it may be more useful to resolve the path components until one doesn't exist, and then return the rest unchanged.

e.g. if /home/ points to /var/home/, Path("/home/antoine/toto") should resolve to Path("/var/home/antoine/toto") even if toto doesn't actually exist.

However, this makes the function less safe. Perhaps with a "strict" flag?
msg205065 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-02 23:01
(note that POSIX realpath() fails with ENOENT if the file doesn't exist: http://pubs.opengroup.org/onlinepubs/9699919799/functions/realpath.html )
msg205066 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-02 23:19
Hm, so we can choose to be more like POSIX realpath() or more like os.path.realpath().  I guess your original intuition was right.  Close with no action is fine.  If I need a partial real path I can always crawl up parents() until I find a path that does exist.
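
The "crawl up the parents" workaround mentioned above could look roughly like the following. This is a minimal sketch, not part of pathlib: the helper name resolve_existing_prefix is hypothetical, and it assumes that any ".." left in the non-existing tail should simply be returned unchanged.

```python
from pathlib import Path


def resolve_existing_prefix(path):
    """Resolve the deepest existing ancestor of `path`, then re-append the rest.

    Hypothetical helper for illustration; pathlib has no function of this name.
    """
    # Make the path absolute without touching symlinks.
    path = Path.cwd() / path
    # Walk from the full path up towards the root.
    for candidate in [path, *path.parents]:
        if candidate.exists():
            # The existing prefix can be resolved safely.
            real = candidate.resolve()
            # Whatever follows the prefix is returned unchanged.
            return real / path.relative_to(candidate)
    # The root always exists, so this is only a fallback.
    return path
```

With /home pointing to /var/home, resolve_existing_prefix("/home/antoine/toto") would be expected to give /var/home/antoine/toto even when toto is missing.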
msg205067 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-02 23:23
I think there's value in allowing the "less strict" behaviour with an optional arg, though. i.e.:

>>> Path('toto').resolve()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/cpython/default/Lib/pathlib.py", line 1024, in resolve
    s = self._flavour.resolve(self)
  File "/home/antoine/cpython/default/Lib/pathlib.py", line 282, in resolve
    target = accessor.readlink(cur)
  File "/home/antoine/cpython/default/Lib/pathlib.py", line 372, in readlink
    return os.readlink(path)
FileNotFoundError: [Errno 2] No such file or directory: '/home/antoine/cpython/default/toto'
>>> Path('toto').resolve(strict=False)
PosixPath('/home/antoine/cpython/default/toto')
msg205068 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-12-02 23:39
Sure.
msg205079 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-12-03 07:55
Here is the preliminary patch. Only tested on Linux. Later I'll test it on Windows.
msg205095 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-03 09:27
The readlink utility has different modes for canonicalization:

       -f, --canonicalize
              canonicalize by following every symlink in every component of the given name recursively; all but the last component must exist

       -e, --canonicalize-existing
              canonicalize by following every symlink in every component of the given name recursively, all components must exist

       -m, --canonicalize-missing
              canonicalize by following every symlink in every component of the given name recursively, without requirements on components existence

I think every mode has use cases.
msg205101 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-03 11:21
> I think every mode has use cases.

Probably, but which ones are the most likely? A ternary flag leads to a
clumsier API than a simple binary flag.
msg205779 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-12-10 08:58
Thanks for the review, Antoine! Here is the updated patch. I haven't tested it on Windows yet because I want to clarify one thing.

Let's say we have this valid directory:

/tmp/@test123 <= directory

And this directory only has one valid file:

/tmp/@test123/cutecat <= file

We agree that pathlib.Path('/tmp/@test123/foo').resolve(False) => '/tmp/@test123/foo'.

But what about this case: pathlib.Path('/tmp/@test123/cutecat/foo').resolve(False)?

Should it resolve to "/tmp/@test123/cutecat" or "/tmp/@test123/cutecat/foo"?
msg205892 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-12-11 08:22
Here is the patch with Windows support. I noticed a difference between POSIX and Windows in how a symbolic link followed by a parent dir component (linkA/..) is resolved.

On Windows, if linkY points to dirB, 'dirA\linkY\..' resolves to 'dirA' without resolving linkY first. In other words, Windows resolves the parent dir before the symbolic link.

C:\Users\vajrasky\Code\playplay\pycode>mkdir dirA

C:\Users\vajrasky\Code\playplay\pycode>mkdir dirB

C:\Users\vajrasky\Code\playplay\pycode>cd dirA

C:\Users\vajrasky\Code\playplay\pycode\dirA>mklink /D linkC ..\dirB
symbolic link created for linkC <<===>> ..\dirB

C:\Users\vajrasky\Code\playplay\pycode\dirA>cd ..\

C:\Users\vajrasky\Code\playplay\pycode>cd dirA\linkC\..

C:\Users\vajrasky\Code\playplay\pycode\dirA>

But on POSIX, if linkY points to dirB, 'dirA/linkY/..' resolves to 'dirB/..' and then to the parent dir of dirB. In other words, POSIX resolves the symbolic link before the parent dir.

$ mkdir dirA
$ mkdir dirB
$ cd dirA
$ ln -s ../dirB linkC
$ cd ..
$ ls dirA/linkC/..
dirA    dirB
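
The POSIX behaviour shown above can also be reproduced from Python. The following is a minimal sketch, assuming a POSIX system where symlinks can be created; the function name compare_parent_handling is made up for illustration. It contrasts os.path.normpath(), which collapses "linkC/.." textually, with os.path.realpath(), which follows linkC before applying "..".

```python
import os
import tempfile


def compare_parent_handling():
    """Return (lexical, resolved) for dirA/linkC/.., relative to the base dir."""
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        os.mkdir(os.path.join(base, "dirA"))
        os.mkdir(os.path.join(base, "dirB"))
        # Same layout as the shell session: dirA/linkC -> ../dirB
        os.symlink(os.path.join("..", "dirB"), os.path.join(base, "dirA", "linkC"))

        path = os.path.join(base, "dirA", "linkC", "..")
        lexical = os.path.normpath(path)    # drops "linkC/.." without looking at the disk
        resolved = os.path.realpath(path)   # follows linkC to dirB, then applies ".."
        return os.path.relpath(lexical, base), os.path.relpath(resolved, base)


print(compare_parent_handling())  # on POSIX: ('dirA', '.')
```

The textual result stays inside dirA, while the resolved result is the parent of dirB, which matches the `ls dirA/linkC/..` listing.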
msg205897 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-11 09:03
I think the patch in issue19887 should be committed first. It totally rewrites resolve().
msg206397 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-12-17 02:54
Updated the patch to tip. Later I'll refactor the Windows code to make sure it does not loop forever.
msg206408 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-17 07:09
Vajrasky's patch implements a fourth strategy, which conforms to neither --canonicalize nor --canonicalize-missing. Path(BASE, 'foo', 'in', 'spam') is resolved to Path(BASE, 'foo'). I doubt that this is the most expected behavior.
msg206413 - (view) Author: Vajrasky Kok (vajrasky) * Date: 2013-12-17 08:20
I based my implementation on this statement:

"Guido pointed out that it may more useful to resolve the path components until one doesn't exist"

"until one doesn't exist" in this case means P(BASE, 'foo', 'in', 'spam') resolves until BASE / 'foo'.

If a different spec is better, I'll change the implementation.
msg206415 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-17 09:05
I believe Guido meant one of the standard strategies. The current posixpath.realpath() implementation conforms to --canonicalize-missing.
It is not clear which two of the three strategies should be used in Path.resolve().
msg206702 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-12-20 20:35
Well, given the diversity of possible behaviours, it starts to seem like it should maybe be discussed on python-dev.
msg265336 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-11 18:58
I have written a patch that implements all three modes for posixpath.realpath() (issue27002). With an implementation in hand, we can test and study it. After discussion, this solution can be adopted for Path.resolve().
msg275173 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-09-08 21:38
Guido has kept his opinion that it should resolve until the path no longer exists and then just stop: https://mail.python.org/pipermail/python-ideas/2016-September/042203.html
msg280294 - (view) Author: Martin Wagner (mwagner) Date: 2016-11-08 11:43
i have a use-case that requires the behavior referred to above as --canonicalize-missing. essentially i need to get rid of relative parts in a path representation regardless of the actual filesystem.

my conclusion was that PurePath could provide that with a 'normpath' method, reflecting os.path.normpath's functionality.

(that's from a user perspective, i haven't looked at any implementation details of the pathlib module.)
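
A purely lexical normalization of this kind can be sketched by round-tripping through posixpath.normpath(). This is only an illustration: pure_normpath is a hypothetical helper, not an existing PurePath method, and because it never consults the filesystem it can give a different answer than resolve() when a symlink precedes a "..".

```python
import posixpath
from pathlib import PurePosixPath


def pure_normpath(path):
    """Hypothetical helper: collapse '.' and '..' textually, without disk access."""
    return PurePosixPath(posixpath.normpath(str(path)))


print(pure_normpath(PurePosixPath("/home/antoine/../guido/./toto")))
# /home/guido/toto
```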
msg280313 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-11-08 16:09
Please make sure this lands in beta 4!

--Guido (mobile)
msg280429 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2016-11-09 18:35
Anyone have any major concerns with add_non_strict_resolve_pathlib_v4.patch?

I'd be quite happy without adding the extra parameter to Path.resolve(), but I'm not strongly offended.

From Guido's email we should default to strict=False (i.e. don't throw if the file doesn't exist) rather than True.
msg280432 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-11-09 18:49
I'd be very happy if that landed in 3.6 with the default strict=False.
msg280450 - (view) Author: Roundup Robot (python-dev) Date: 2016-11-09 20:59
New changeset 03bbee2b0d28 by Steve Dower in branch '3.6':
Issue #19717: Makes Path.resolve() succeed on paths that do not exist (patch by Vajrasky Kok)
https://hg.python.org/cpython/rev/03bbee2b0d28

New changeset 445415e402be by Steve Dower in branch 'default':
Issue #19717: Makes Path.resolve() succeed on paths that do not exist (patch by Vajrasky Kok)
https://hg.python.org/cpython/rev/445415e402be
msg280451 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2016-11-09 21:00
Applied. I changed the default for the parameter and updated the docs, but the rest is as in the patch.
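
The committed behaviour can be checked with a small script. This is a sketch assuming Python 3.6 or later on a system where symlinks can be created; demo_resolve and the directory names are made up for the example.

```python
import os
import tempfile
from pathlib import Path


def demo_resolve():
    """Return (lenient_ok, strict_raised) for a path whose last component is missing."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp).resolve()
        (base / "real").mkdir()
        os.symlink(base / "real", base / "link")
        missing = base / "link" / "toto"  # "toto" does not exist

        # Default (strict=False): the symlink is followed, the missing tail is kept.
        lenient = missing.resolve()

        # strict=True restores the original behaviour and raises.
        try:
            missing.resolve(strict=True)
            strict_raised = False
        except FileNotFoundError:
            strict_raised = True

        return lenient == base / "real" / "toto", strict_raised


print(demo_resolve())  # (True, True)
```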
msg280452 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2016-11-09 21:09
Thanks!
History
Date User Action Args
2017-06-06 15:44:47 gvanrossum set nosy: - gvanrossum
2017-06-06 11:15:49 Philip Ridout set nosy: + Philip Ridout
2017-05-28 13:44:07 Dormouse759 set pull_requests: + pull_request1930
2017-03-31 16:36:07 dstufft set pull_requests: + pull_request830
2016-11-14 19:59:00 steve.dower set status: open -> closed
2016-11-14 19:58:52 steve.dower set stage: commit review -> resolved
2016-11-09 21:09:09 gvanrossum set messages: + msg280452
2016-11-09 21:00:13 steve.dower set assignee: steve.dower; resolution: fixed; messages: + msg280451; stage: commit review
2016-11-09 20:59:37 python-dev set nosy: + python-dev; messages: + msg280450
2016-11-09 18:49:31 gvanrossum set messages: + msg280432
2016-11-09 18:35:54 steve.dower set messages: + msg280429; versions: + Python 3.7
2016-11-08 16:09:05 gvanrossum set messages: + msg280313
2016-11-08 11:43:56 mwagner set nosy: + mwagner; messages: + msg280294
2016-09-09 04:32:15 steve.dower set nosy: + steve.dower
2016-09-08 21:39:40 brett.cannon set priority: normal -> deferred blocker
2016-09-08 21:39:32 brett.cannon link issue28031 superseder
2016-09-08 21:38:57 brett.cannon set nosy: + brett.cannon; messages: + msg275173
2016-05-11 18:58:58 serhiy.storchaka set messages: + msg265336; versions: + Python 3.6, - Python 3.4
2016-05-08 18:13:10 davide.rizzo set nosy: + davide.rizzo
2016-05-08 12:37:54 serhiy.storchaka link issue26976 superseder
2014-09-25 10:28:23 july set nosy: + july
2013-12-20 20:35:36 pitrou set messages: + msg206702
2013-12-17 09:05:05 serhiy.storchaka set messages: + msg206415
2013-12-17 08:20:33 vajrasky set messages: + msg206413
2013-12-17 07:09:37 serhiy.storchaka set messages: + msg206408
2013-12-17 02:54:31 vajrasky set files: + add_non_strict_resolve_pathlib_v4.patch; messages: + msg206397
2013-12-11 09:03:43 serhiy.storchaka set dependencies: + Path.resolve() fails on complex symlinks; messages: + msg205897
2013-12-11 08:23:44 vajrasky set files: + add_non_strict_resolve_pathlib_v3.patch
2013-12-11 08:23:36 vajrasky set files: - add_non_strict_resolve_pathlib_v3.patch
2013-12-11 08:22:56 vajrasky set files: + add_non_strict_resolve_pathlib_v3.patch; messages: + msg205892
2013-12-10 08:58:46 vajrasky set files: + add_non_strict_resolve_pathlib_v2.patch; messages: + msg205779
2013-12-03 11:21:38 pitrou set messages: + msg205101
2013-12-03 09:27:52 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg205095
2013-12-03 07:55:46 vajrasky set files: + add_non_strict_resolve_pathlib.patch; nosy: + vajrasky; messages: + msg205079; keywords: + patch
2013-12-02 23:39:08 gvanrossum set messages: + msg205068
2013-12-02 23:23:37 pitrou set messages: + msg205067
2013-12-02 23:19:44 gvanrossum set messages: + msg205066
2013-12-02 23:01:34 pitrou set messages: + msg205065
2013-11-22 17:45:53 pitrou create