Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolve() fails when the path doesn't exist #63916

Closed
pitrou opened this issue Nov 22, 2013 · 25 comments
Closed

resolve() fails when the path doesn't exist #63916

pitrou opened this issue Nov 22, 2013 · 25 comments
Assignees
Labels
3.7 (EOL) end of life deferred-blocker stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@pitrou
Copy link
Member

pitrou commented Nov 22, 2013

BPO 19717
Nosy @brettcannon, @pitrou, @tifv, @sorcio, @serhiy-storchaka, @zooba, @vajrasky
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • bpo-30177: Fix misleading description of path.resolve() #1649
  • Dependencies
  • bpo-19887: Path.resolve() fails on complex symlinks
  • Files
  • add_non_strict_resolve_pathlib.patch
  • add_non_strict_resolve_pathlib_v2.patch
  • add_non_strict_resolve_pathlib_v3.patch
  • add_non_strict_resolve_pathlib_v4.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/zooba'
    closed_at = <Date 2016-11-14.19:59:00.455>
    created_at = <Date 2013-11-22.17:45:53.337>
    labels = ['3.7', 'deferred-blocker', 'type-feature', 'library']
    title = "resolve() fails when the path doesn't exist"
    updated_at = <Date 2017-06-06.15:44:47.453>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2017-06-06.15:44:47.453>
    actor = 'gvanrossum'
    assignee = 'steve.dower'
    closed = True
    closed_date = <Date 2016-11-14.19:59:00.455>
    closer = 'steve.dower'
    components = ['Library (Lib)']
    creation = <Date 2013-11-22.17:45:53.337>
    creator = 'pitrou'
    dependencies = ['19887']
    files = ['32947', '33075', '33090', '33171']
    hgrepos = []
    issue_num = 19717
    keywords = ['patch']
    message_count = 25.0
    messages = ['203819', '205065', '205066', '205067', '205068', '205079', '205095', '205101', '205779', '205892', '205897', '206397', '206408', '206413', '206415', '206702', '265336', '275173', '280294', '280313', '280429', '280432', '280450', '280451', '280452']
    nosy_count = 11.0
    nosy_names = ['brett.cannon', 'pitrou', 'july', 'neologix', 'davide.rizzo', 'python-dev', 'serhiy.storchaka', 'steve.dower', 'vajrasky', 'mwagner', 'Philip Ridout']
    pr_nums = ['552', '1649']
    priority = 'deferred blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue19717'
    versions = ['Python 3.6', 'Python 3.7']

    @pitrou
    Copy link
    Member Author

    pitrou commented Nov 22, 2013

    Currently Path.resolve() raises FileNotFoundError when the path does not exist. Guido pointed out that it may more useful to resolve the path components until one doesn't exist, and then return the rest unchanged.

    e.g. if /home/ points to /var/home/, Path("/home/antoine/toto") should resolve to Path("/var/home/antoine/toto") even if toto doesn't actually exist.

    However, this makes the function less safe. Perhaps with a "strict" flag?

    @pitrou pitrou added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Nov 22, 2013
    @pitrou
    Copy link
    Member Author

    pitrou commented Dec 2, 2013

    (note that POSIX realpath() fails with ENOENT if the file doesn't exist: http://pubs.opengroup.org/onlinepubs/9699919799/functions/realpath.html )

    @gvanrossum
    Copy link
    Member

    Hm, so we can choose to be more like POSIX realpath() or more like os.path.realpath(). I guess your original intuition was right. Close with no action is fine. If I need a partial real path I can always crawl up parents() until I find a path that does exist.

    @pitrou
    Copy link
    Member Author

    pitrou commented Dec 2, 2013

    I think there's value in allowing the "less strict" behaviour with an optional arg, though. i.e.:

    >>> Path('toto').resolve()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/antoine/cpython/default/Lib/pathlib.py", line 1024, in resolve
        s = self._flavour.resolve(self)
      File "/home/antoine/cpython/default/Lib/pathlib.py", line 282, in resolve
        target = accessor.readlink(cur)
      File "/home/antoine/cpython/default/Lib/pathlib.py", line 372, in readlink
        return os.readlink(path)
    FileNotFoundError: [Errno 2] No such file or directory: '/home/antoine/cpython/default/toto'
    >>> Path('toto').resolve(strict=False)
    PosixPath('/home/antoine/cpython/default/toto')

    @gvanrossum
    Copy link
    Member

    Sure.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Dec 3, 2013

    Here is the preliminary patch. Only tested on Linux. Later I'll test it on Windows.

    @serhiy-storchaka
    Copy link
    Member

    The readlink utility has different modes for canonization:

       -f, --canonicalize
              canonicalize by following every symlink in every component of the given name recursively; all but the last component must exist
    
       -e, --canonicalize-existing
              canonicalize by following every symlink in every component of the given name recursively, all components must exist
    
       -m, --canonicalize-missing
              canonicalize by following every symlink in every component of the given name recursively, without requirements on components existence
    

    I think every mode has use cases.

    @pitrou
    Copy link
    Member Author

    pitrou commented Dec 3, 2013

    I think every mode has use cases.

    Probably, but which ones are the most likely? A ternary flag leads to a
    clumsier API than a simple binary flag.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Dec 10, 2013

    Thanks for the review, Antoine! Here is the updated patch. I haven't tested it on Windows yet because I want to clarify one thing.

    Let's say we have this valid directory:

    /tmp/@test123 <= directory

    And this directory only has one valid file:

    /tmp/@test123/cutecat <= file

    We agree that pathlib.Path('/tmp/@test123/foo').resolve(False) => '/tmp/@test123/foo'.

    But what about this case: pathlib.Path('/tmp/@test123/cutecat/foo').resolve(False)?

    It should be "/tmp/@test123/cutecat" or "/tmp/@test123/cutecat/foo"?

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Dec 11, 2013

    Here is the patch with Windows support. I notice there is difference regarding resolving symbolic link with parent dir (linkA/..) between Posix and Windows.

    On Windows, if linkY points to dirB, 'dirA\linkY\..' resolves to 'dirA' without resolving linkY first. It means, Windows resolves parent dir first before symbolic link.

    C:\Users\vajrasky\Code\playplay\pycode>mkdir dirA

    C:\Users\vajrasky\Code\playplay\pycode>mkdir dirB

    C:\Users\vajrasky\Code\playplay\pycode>cd dirA

    C:\Users\vajrasky\Code\playplay\pycode\dirA>mklink /D linkC ..\dirB
    symbolic link created for linkC <<===>> ..\dirB

    C:\Users\vajrasky\Code\playplay\pycode\dirA>cd ..\

    C:\Users\vajrasky\Code\playplay\pycode>cd dirA\linkC\..

    C:\Users\vajrasky\Code\playplay\pycode\dirA>

    But on Posix, if linkY points to dirB, 'dirA\linkY\..' resolves to 'dirB\..' then to the parent dir of dirB. It means, Posix resolves symbolic link first before parent dir.

    $ mkdir dirA
    $ mkdir dirB
    $ cd dirA
    $ ln -s ../dirB linkC
    $ cd ..
    $ ls dirA/linkC/..
    dirA    dirB

    @serhiy-storchaka
    Copy link
    Member

    I think a patch in bpo-19887 first should be committed. It totally rewrites resolve().

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Dec 17, 2013

    Updated patch to tip. Later, I refactor Windows code to make sure it does not loop forever.

    @serhiy-storchaka
    Copy link
    Member

    Vajrasky's patch implements fourth strategy, which is not conform neither --canonicalize nor --canonicalize-missing. Path(BASE, 'foo', 'in', 'spam') is resolved to Path(BASE, 'foo'). I doubt that this is most expected behavior.

    @vajrasky
    Copy link
    Mannequin

    vajrasky mannequin commented Dec 17, 2013

    I based my implementation on this statement:

    "Guido pointed out that it may more useful to resolve the path components until one doesn't exist"

    "until one doesn't exist" in this case means P(BASE, 'foo', 'in', 'spam') resolves until BASE / 'foo'.

    If different spec is better, I'll change the implementation.

    @serhiy-storchaka
    Copy link
    Member

    I believe Guido meant one of standard strategies. Current posixpath.realpath() implementation conforms --canonicalize-missing.
    It is not clear which two strategies (from three) should be used in Path.resolve().

    @pitrou
    Copy link
    Member Author

    pitrou commented Dec 20, 2013

    Well, given the diversity of possible behaviours, it starts to seem like it should maybe be discussed on python-dev.

    @serhiy-storchaka
    Copy link
    Member

    I have written a patch that implements all three mode for posixpath.realpath() (bpo-27002). Having the implementation we can test and research it. After discussion this solution can be adopted for Path.resolve().

    @brettcannon
    Copy link
    Member

    Guido has kept his opinion that it should resolve until the path no longer exists and then just stop: https://mail.python.org/pipermail/python-ideas/2016-September/042203.html

    @mwagner
    Copy link
    Mannequin

    mwagner mannequin commented Nov 8, 2016

    i have a use-case that requires a behavior that is referenced above as --canonicalize-missing. essentially i need to get rid of relative parts in a path representation regardless the actual filesystem.

    my conclusion was that PurePath could provide that with a 'normpath' method, reflecting os.path.normpath's functionality.

    (that's from a user perspective, i haven't looked at any implementation details of the pathlib module.)

    @gvanrossum
    Copy link
    Member

    Please make sure this lands in beta 4!

    --Guido (mobile)

    @zooba
    Copy link
    Member

    zooba commented Nov 9, 2016

    Anyone have any major concerns with add_non_strict_resolve_pathlib_v4.patch?

    I'd be quite happy without adding the extra parameter to Path.resolve(), but I'm not strongly offended.

    From Guido's email we should default to strict=False (i.e. don't throw if the file doesn't exist) rather than True.

    @zooba zooba added the 3.7 (EOL) end of life label Nov 9, 2016
    @gvanrossum
    Copy link
    Member

    I'd be very happy if that landed in 3.6 with the default strict=False.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 9, 2016

    New changeset 03bbee2b0d28 by Steve Dower in branch '3.6':
    Issue bpo-19717: Makes Path.resolve() succeed on paths that do not exist (patch by Vajrasky Kok)
    https://hg.python.org/cpython/rev/03bbee2b0d28

    New changeset 445415e402be by Steve Dower in branch 'default':
    Issue bpo-19717: Makes Path.resolve() succeed on paths that do not exist (patch by Vajrasky Kok)
    https://hg.python.org/cpython/rev/445415e402be

    @zooba
    Copy link
    Member

    zooba commented Nov 9, 2016

    Applied. I changed the default for the parameter and updated the docs, but the rest is as in the patch.

    @zooba zooba self-assigned this Nov 9, 2016
    @gvanrossum
    Copy link
    Member

    Thanks!

    @zooba zooba closed this as completed Nov 14, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life deferred-blocker stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants