Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ntpath doesn't join paths correctly when a drive is present #63655

Closed
gvanrossum opened this issue Oct 30, 2013 · 18 comments
Closed

ntpath doesn't join paths correctly when a drive is present #63655

gvanrossum opened this issue Oct 30, 2013 · 18 comments
Assignees
Labels
easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@gvanrossum
Copy link
Member

BPO 19456
Nosy @gvanrossum, @pitrou, @tjguk, @bitdancer, @berkerpeksag, @vadmium, @zware, @serhiy-storchaka
Files
  • fix_ntpath_join.patch
  • ntpath_join.patch
  • ntpath_join_2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2014-01-27.21:21:39.040>
    created_at = <Date 2013-10-30.23:12:37.836>
    labels = ['easy', 'type-bug', 'library']
    title = "ntpath doesn't join paths correctly when a drive is present"
    updated_at = <Date 2014-01-28.05:48:16.316>
    user = 'https://github.com/gvanrossum'

    bugs.python.org fields:

    activity = <Date 2014-01-28.05:48:16.316>
    actor = 'berker.peksag'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2014-01-27.21:21:39.040>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2013-10-30.23:12:37.836>
    creator = 'gvanrossum'
    dependencies = []
    files = ['32486', '33008', '33417']
    hgrepos = []
    issue_num = 19456
    keywords = ['patch', 'easy']
    message_count = 18.0
    messages = ['201784', '202054', '202058', '202059', '202065', '202071', '202073', '205391', '207851', '207907', '207909', '207910', '209362', '209365', '209474', '209482', '209484', '209503']
    nosy_count = 12.0
    nosy_names = ['gvanrossum', 'pitrou', 'tim.golden', 'r.david.murray', 'valhallasw', 'python-dev', 'berker.peksag', 'martin.panter', 'zach.ware', 'serhiy.storchaka', 'elixir', 'Bruce.Leban']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue19456'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @gvanrossum
    Copy link
    Member Author

    (Bruce Leban, on python-ideas:)

    """
    ntpath still gets drive-relative paths wrong on Windows:

    >>> ntpath.join(r'\\a\b\c\d', r'\e\f')
    '\\e\\f'  
    # should be r'\\a\b\e\f'
    
    >>> ntpath.join(r'C:\a\b\c\d', r'\e\f')
    '\\e\\f'
    # should be r'C:\e\f'

    (same behavior in Python 2.7 and 3.3)
    """

    (Let's also make sure PEP-428 / pathlib fixes this.)

    @gvanrossum gvanrossum added stdlib Python modules in the Lib dir easy type-bug An unexpected behavior, bug, or error labels Oct 30, 2013
    @gvanrossum
    Copy link
    Member Author

    Looking at this some more, I think one of the reasons is that isabs() does not consider paths consisting of *just* a drive (either c: or \\host\share) to be absolute, but it considers a path without a drive but starting with a \ as absolute. So perhaps it's all internally inconsistent. I'm hoping Bruce has something to say to this.

    @elixir
    Copy link
    Mannequin

    elixir mannequin commented Nov 3, 2013

    I'm willing to fix this. ntpath.join behaves weird in other situations too:

    >>> ntpath.join('C:/a/b', 'D:x/y')
    'C:/a/b\\D:x/y'

    In fact, I don't know what the above should return.

    @gvanrossum
    Copy link
    Member Author

    PEP-428 offers a reasonable view. Search http://www.python.org/dev/peps/pep-0428/ for "anchored" and read on.

    @elixir
    Copy link
    Mannequin

    elixir mannequin commented Nov 3, 2013

    Added a possible fix for ntpath.join. Didn't touch isabs yet.

    @BruceLeban
    Copy link
    Mannequin

    BruceLeban mannequin commented Nov 4, 2013

    A non-UNC windows path consists of two parts: a drive and a conventional path. If the drive is left out, it's relative to the current drive. If the path part does not have a leading \ then it's relative to the current path on that drive. Note that Windows has a different working dir for every drive.

    x\y.txt # in dir x in current dir on current drive
    \x\y.txt # in dir x at root of current drive
    E:x\y.txt # in dir in current dir on drive E
    E:\x\y.txt # in dir x at root of drive E

    UNC paths are similar except \\server\share is used instead of X: and there are no relative paths, since the part after share always starts with a \.

    Thus when joining paths, if the second path specifies a drive, then the result should include that drive, otherwise the drive from the first path should be used. The path parts should be combined with the standard logic.

    Some additional test cases

    tester("ntpath.join(r'C:/a/b/c/d', '/e/f')", 'C:\e\f')
    tester("ntpath.join('//a/b/c/d', '/e/f')", '//a/b/e/f')
    tester("ntpath.join('C:x/y', r'z')", r'C:x/y/z')
    tester("ntpath.join('C:x/y', r'/z')", r'C:/z')

    Andrei notes that the following is wrong but wonders what the correct answer is:

    >>> ntpath.join('C:/a/b', 'D:x/y')
    'C:/a/b\\D:x/y'

    The /a/b part of the path is an absolute path on drive C and isn't "transferable" to another drive. So a reasonable result is simply 'D:x/y'. This matches Windows behavior. If on Windows you did

    $ cd /D C:\a\b
    $ cat D:x\y

    it would ignore the current drive on C set by the first command and use the current drive on D.

    tester("ntpath.join('C:/a/b', 'D:x/y')", r'D:x/y')
    tester("ntpath.join('//c/a/b', 'D:x/y')", r'D:x/y')

    @gvanrossum
    Copy link
    Member Author

    Do we even have a way to get the current directory for a given drive? (I guess this is only needed for C: style drives.)

    @serhiy-storchaka
    Copy link
    Member

    With previous patch:

    >>> ntpath.join('C:a/b', 'D:y/z')
    'D:y/z\\y/z'

    Should be 'D:y/z'.

    Here is other patch which implements same algorithm as in pathlib (bpo-19908). Added new tests, removed duplicated tests.

    @serhiy-storchaka serhiy-storchaka self-assigned this Dec 6, 2013
    @serhiy-storchaka
    Copy link
    Member

    If there are no objections, I'll commit this patch tomorrow.

    @serhiy-storchaka
    Copy link
    Member

    I just discovered that perhaps ntpath.join should be even more clever. Windows supports current directories for every drive separately, so perhaps ntpath.join('c:/x', 'd:/y', 'c:z') should return 'c:/x\\z', not 'c:/z'.

    Could anyone please check it? Create directory x/z on drive c: and directory y on drive d:, then execute following commands:

    cd c:/x
    cd d:/y
    cd c:z

    What is resulting current working directory?

    Here is a patch which implements this algorithm.

    @valhallasw
    Copy link
    Mannequin

    valhallasw mannequin commented Jan 11, 2014

    so perhaps ntpath.join('c:/x', 'd:/y', 'c:z') should return 'c:/x\\z', not 'c:/z'.

    'c:z' is consistent with what .NET's System.IO.Path.Combine does:

    via http://ironpython.net/try/ :
    import System.IO.Path; print System.IO.Path.Combine('c:/x', 'd:/y', 'c:z')

    returns 'c:z'

    Could anyone please check it? Create directory x/z on drive c: and directory y on drive d:, then execute following commands:
    cd c:/x
    cd d:/y
    cd c:z

    What is resulting current working directory?

    c:\>cd c:/x

    c:\x>cd e:\y

    c:\x>cd c:z
    Het systeem kan het opgegeven pad niet vinden. # file not found, in Dutch

    c:\x>cd c:\z

    Yes, there is a seperate current directory for each drive, but cd does not switch drives. (cd e:\f does not mean you actually go to e:\f - it just changes the current directory on the e:-drive). I don't think those semantics are sensible for joining paths...

    @valhallasw
    Copy link
    Mannequin

    valhallasw mannequin commented Jan 11, 2014

    Sorry, I was a bit too quick - I forgot to create c:\x\z. Now this is the result:

    c:\x\z>cd c:/x
    c:\x>cd e:/y
    c:\x>cd c:z
    c:\x\z>

    However, the behavior does not work in, for example, a 'Save as...' window, where c:z will always return "illegal filename"

    @serhiy-storchaka
    Copy link
    Member

    Thank you Merlijn for your information.

    So which patch is more preferable?

    @valhallasw
    Copy link
    Mannequin

    valhallasw mannequin commented Jan 26, 2014

    I'm not sure whether that question was aimed at me -- I think both options have their merits, but I'd suggest to adopt the .NET semantics. The semantics are also explicitly defined [1] and the behavior seems to be acceptable for the .NET world.

    [1] http://msdn.microsoft.com/en-us/library/fyy7a5kt(v=vs.110).aspx

    @bitdancer
    Copy link
    Member

    I think a python programmer is going to expect that

    join(a, b, c) == join(join(a, b), c)

    so the answer to Serhiy's example should be 'c:z'.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 27, 2014

    New changeset 6b314f5c9404 by Serhiy Storchaka in branch '2.7':
    Issue bpo-19456: ntpath.join() now joins relative paths correctly when a drive
    http://hg.python.org/cpython/rev/6b314f5c9404

    New changeset f4377699fd47 by Serhiy Storchaka in branch '3.3':
    Issue bpo-19456: ntpath.join() now joins relative paths correctly when a drive
    http://hg.python.org/cpython/rev/f4377699fd47

    New changeset 7ce464ba615a by Serhiy Storchaka in branch 'default':
    Issue bpo-19456: ntpath.join() now joins relative paths correctly when a drive
    http://hg.python.org/cpython/rev/7ce464ba615a

    @serhiy-storchaka
    Copy link
    Member

    Committed first patch (with small change, ntpath.join('c:', 'C:') now returns 'C:').

    There is yet one argument for first option: it is almost impossible (with current design) to implement second option in pathlib.

    @berkerpeksag
    Copy link
    Member

    Hi Serhiy, there are commented-out lines in the 2.7 version of the patch. Are they intentionally there?:

    + #tester("ntpath.join('//computer/share', 'x/y')", '//computer/share\\x/y')
    + #tester("ntpath.join('//computer/share/', 'x/y')", '//computer/share/x/y')
    + #tester("ntpath.join('//computer/share/a/b', 'x/y')", '//computer/share/a/b\\x/y')

    + #tester("ntpath.join('//computer/share', '/x/y')", '//computer/share/x/y')
    + #tester("ntpath.join('//computer/share/', '/x/y')", '//computer/share/x/y')
    + #tester("ntpath.join('//computer/share/a', '/x/y')", '//computer/share/x/y')

    http://hg.python.org/cpython/rev/6b314f5c9404

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants