classification
Title: ntpath doesn't join paths correctly when a drive is present
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.4, Python 3.3, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Bruce.Leban, andrei.duma, berker.peksag, gvanrossum, martin.panter, pitrou, python-dev, r.david.murray, serhiy.storchaka, tim.golden, valhallasw, zach.ware
Priority: normal
Keywords: easy, patch

Created on 2013-10-30 23:12 by gvanrossum, last changed 2014-01-28 05:48 by berker.peksag. This issue is now closed.

Files
File name Uploaded
fix_ntpath_join.patch andrei.duma, 2013-11-03 22:54
ntpath_join.patch serhiy.storchaka, 2013-12-06 17:59
ntpath_join_2.patch serhiy.storchaka, 2014-01-11 16:17
Messages (18)
msg201784 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-10-30 23:12
(Bruce Leban, on python-ideas:)

"""
ntpath still gets drive-relative paths wrong on Windows:

>>> ntpath.join(r'\\a\b\c\d', r'\e\f')
'\\e\\f'  
# should be r'\\a\b\e\f'

>>> ntpath.join(r'C:\a\b\c\d', r'\e\f')
'\\e\\f'
# should be r'C:\e\f'

(same behavior in Python 2.7 and 3.3)
"""

(Let's also make sure PEP 428 / pathlib fixes this.)
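
For reference, the two reported cases can be checked directly, since ntpath is importable on any platform. This is a minimal sketch, assuming an interpreter that already contains the fix from this issue; the variable names are only illustrative.

```python
import ntpath

# The UNC share is kept; the rooted second path replaces only the directory part.
unc = ntpath.join(r'\\a\b\c\d', r'\e\f')
# The drive letter is kept for the same reason.
drv = ntpath.join(r'C:\a\b\c\d', r'\e\f')

print(unc)  # \\a\b\e\f
print(drv)  # C:\e\f
```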
msg202054 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-11-03 20:32
Looking at this some more, I think one of the reasons is that isabs() does not consider paths consisting of *just* a drive (either c: or \\host\share) to be absolute, but it considers a path without a drive but starting with a \ as absolute. So perhaps it's all internally inconsistent. I'm hoping Bruce has something to say to this.
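
A small sketch for inspecting the inconsistency described above: it lists how ntpath splits each path and what isabs() reports for it. The helper name show_isabs is hypothetical. The isabs() results for these inputs have changed between Python releases, so no expected output is shown.

```python
import ntpath

def show_isabs(paths):
    """Return (path, drive, rest, isabs) tuples for inspection."""
    rows = []
    for p in paths:
        drive, rest = ntpath.splitdrive(p)
        rows.append((p, drive, rest, ntpath.isabs(p)))
    return rows

# Drive only, UNC share only, rooted without a drive, and fully qualified.
for row in show_isabs(['c:', r'\\host\share', r'\dir', r'c:\dir']):
    print(row)
```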
msg202058 - (view) Author: Andrei Dorian Duma (andrei.duma) * Date: 2013-11-03 20:51
I'm willing to fix this. ntpath.join behaves weirdly in other situations too:

>>> ntpath.join('C:/a/b', 'D:x/y')
'C:/a/b\\D:x/y'

In fact, I don't know what the above should return.
msg202059 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-11-03 21:02
PEP 428 offers a reasonable view. Search http://www.python.org/dev/peps/pep-0428/ for "anchored" and read on.
msg202065 - (view) Author: Andrei Dorian Duma (andrei.duma) * Date: 2013-11-03 22:54
Added a possible fix for ntpath.join. Didn't touch isabs yet.
msg202071 - (view) Author: Bruce Leban (Bruce.Leban) Date: 2013-11-04 00:27
A non-UNC windows path consists of two parts: a drive and a conventional path. If the drive is left out, it's relative to the current drive. If the path part does not have a leading \ then it's relative to the current path on that drive. Note that Windows has a different working dir for every drive.

x\y.txt    # in dir x in current dir on current drive
\x\y.txt   # in dir x at root of current drive
E:x\y.txt  # in dir x in current dir on drive E
E:\x\y.txt # in dir x at root of drive E

UNC paths are similar except \\server\share is used instead of X: and there are no relative paths, since the part after share always starts with a \.
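
The four forms above, plus the UNC case, can be told apart with ntpath.splitdrive. This is only an illustrative sketch; classify and its labels are hypothetical names, not part of ntpath.

```python
import ntpath

def classify(path):
    """Label a Windows path by whether it names a drive and whether it is rooted."""
    drive, rest = ntpath.splitdrive(path)
    rooted = rest[:1] in ('\\', '/')
    if drive:
        return 'root of given drive' if rooted else 'current dir on given drive'
    return 'root of current drive' if rooted else 'current dir on current drive'

for p in [r'x\y.txt', r'\x\y.txt', r'E:x\y.txt', r'E:\x\y.txt',
          r'\\server\share\x\y.txt']:
    print(p, '->', classify(p))
```

For a UNC path, splitdrive returns the \\server\share prefix as the "drive", so it lands in the same category as E:\x\y.txt.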

Thus when joining paths, if the second path specifies a drive, then the result should include that drive, otherwise the drive from the first path should be used. The path parts should be combined with the standard logic.
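
The rule in the paragraph above can be sketched for two arguments as follows. This is a simplified illustration, not the patch that was committed for this issue; sketch_join and SEPS are hypothetical names, and the drive comparison is assumed to be case-insensitive.

```python
import ntpath

SEPS = ('\\', '/')

def sketch_join(first, second):
    """Join two Windows paths following the drive rule described above."""
    drive1, rest1 = ntpath.splitdrive(first)
    drive2, rest2 = ntpath.splitdrive(second)
    if rest2[:1] in SEPS:
        # Rooted second path: keep its drive if it has one, else the first drive.
        return (drive2 or drive1) + rest2
    if drive2 and drive2.lower() != drive1.lower():
        # Different drive: the first path is not transferable, so drop it.
        return drive2 + rest2
    drive = drive2 or drive1
    if rest1 and rest1[-1] not in SEPS:
        rest1 += '\\'
    elif not rest1 and rest2 and drive and drive[-1] != ':':
        # A bare UNC share needs a separator before a relative part.
        rest1 = '\\'
    return drive + rest1 + rest2

for a, b in [('C:/a/b/c/d', '/e/f'), ('//a/b/c/d', '/e/f'),
             ('C:x/y', 'z'), ('C:/a/b', 'D:x/y')]:
    print(a, b, '->', sketch_join(a, b))
```

Like ntpath.join, the sketch does not normalize separators; it only inserts a backslash where a separator is missing.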

Some additional test cases

tester("ntpath.join(r'C:/a/b/c/d', '/e/f')", r'C:/e/f')
tester("ntpath.join('//a/b/c/d', '/e/f')", '//a/b/e/f')
tester("ntpath.join('C:x/y', r'z')", r'C:x/y\z')
tester("ntpath.join('C:x/y', r'/z')", r'C:/z')

Andrei notes that the following is wrong but wonders what the correct answer is:

>>> ntpath.join('C:/a/b', 'D:x/y')
'C:/a/b\\D:x/y'

The /a/b part of the path is an absolute path on drive C and isn't "transferable" to another drive. So a reasonable result is simply 'D:x/y'. This matches Windows behavior. If on Windows you did

$ cd /D C:\a\b
$ cat D:x\y

it would ignore the current directory on C set by the first command and use the current directory on D.

tester("ntpath.join('C:/a/b', 'D:x/y')", r'D:x/y')
tester("ntpath.join('//c/a/b', 'D:x/y')", r'D:x/y')
msg202073 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-11-04 00:38
Do we even have a way to get the current directory for a given drive? (I guess this is only needed for C: style drives.)
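
One possible answer, offered as a sketch under an assumption: on Windows, os.path.abspath() applied to a bare drive such as 'D:' should resolve it against the current directory that Windows tracks for that drive. The helper name drive_cwd is hypothetical, and the function deliberately returns None on other platforms, where the concept does not apply.

```python
import os

def drive_cwd(drive):
    """Return the current directory Windows tracks for `drive` (e.g. 'D:').

    Assumption: abspath() on a bare drive resolves against the per-drive
    current directory. Returns None when not running on Windows.
    """
    if os.name != 'nt':
        return None
    return os.path.abspath(drive)

print(drive_cwd('C:'))
```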
msg205391 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-12-06 17:58
With previous patch:

>>> ntpath.join('C:a/b', 'D:y/z')
'D:y/z\\y/z'

Should be 'D:y/z'.

Here is another patch which implements the same algorithm as pathlib (issue19908). Added new tests, removed duplicated tests.
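
To illustrate the claim that the algorithm matches pathlib, the sketch below compares ntpath.join with PureWindowsPath on a few inputs. It assumes an interpreter that contains the fix; the helper name agrees is hypothetical. The comparison is done on PureWindowsPath objects because pathlib normalizes separators while ntpath.join does not.

```python
import ntpath
from pathlib import PureWindowsPath

def agrees(first, second):
    """True if ntpath.join and PureWindowsPath combine the two parts the same way."""
    joined = ntpath.join(first, second)
    return PureWindowsPath(joined) == PureWindowsPath(first, second)

for first, second in [('C:a/b', 'D:y/z'), ('C:/a/b', '/e/f'), ('C:x/y', 'z')]:
    print(first, second, '->', ntpath.join(first, second), agrees(first, second))
```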
msg207851 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-10 11:41
If there are no objections, I'll commit this patch tomorrow.
msg207907 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-11 16:17
I just discovered that perhaps ntpath.join should be even more clever. Windows keeps a separate current directory for every drive, so perhaps ntpath.join('c:/x', 'd:/y', 'c:z') should return 'c:/x\\z', not 'c:/z'.

Could anyone please check this? Create directory x/z on drive c: and directory y on drive d:, then execute the following commands:

cd c:/x
cd d:/y
cd c:z

What is the resulting current working directory?

Here is a patch which implements this algorithm.
msg207909 - (view) Author: Merlijn van Deen (valhallasw) * Date: 2014-01-11 19:42
> so perhaps ntpath.join('c:/x', 'd:/y', 'c:z') should return 'c:/x\\z', not 'c:/z'.

'c:z' is consistent with what .NET's System.IO.Path.Combine does:

via  http://ironpython.net/try/ :
import System.IO.Path; print System.IO.Path.Combine('c:/x', 'd:/y', 'c:z')

returns 'c:z'

> Could anyone please check it? Create directory x/z on drive c: and directory y on drive d:, then execute following commands:
> cd c:/x
> cd d:/y
> cd c:z
>
> What is resulting current working directory?

c:\>cd c:/x

c:\x>cd e:\y

c:\x>cd c:z
The system cannot find the path specified. # translated from the Dutch console message

c:\x>cd c:\z



Yes, there is a separate current directory for each drive, but cd does not switch drives. (cd e:\f does not mean you actually go to e:\f - it just changes the current directory on the e: drive.) I don't think those semantics are sensible for joining paths...
msg207910 - (view) Author: Merlijn van Deen (valhallasw) * Date: 2014-01-11 20:03
Sorry, I was a bit too quick - I forgot to create c:\x\z. Now this is the result:

c:\x\z>cd c:/x
c:\x>cd e:/y
c:\x>cd c:z
c:\x\z>

However, this does not work in, for example, a 'Save as...' window, where c:z always yields "illegal filename".
msg209362 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-26 22:46
Thank you Merlijn for your information.

So which patch is preferable?
msg209365 - (view) Author: Merlijn van Deen (valhallasw) * Date: 2014-01-26 23:24
I'm not sure whether that question was aimed at me. I think both options have their merits, but I'd suggest adopting the .NET semantics. The semantics are also explicitly defined [1] and the behavior seems to be acceptable for the .NET world.

[1] http://msdn.microsoft.com/en-us/library/fyy7a5kt(v=vs.110).aspx
msg209474 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-01-27 20:31
I think a Python programmer is going to expect that

  join(a, b, c) == join(join(a, b), c)

so the answer to Serhiy's example should be 'c:z'.
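
The expectation above can be checked for Serhiy's example. This is a small sketch, assuming an interpreter that contains the committed fix.

```python
import ntpath

a, b, c = 'c:/x', 'd:/y', 'c:z'

all_at_once = ntpath.join(a, b, c)
pairwise = ntpath.join(ntpath.join(a, b), c)

print(all_at_once)  # c:z
print(pairwise)     # c:z
```

Joining pairwise forgets 'c:/x' as soon as 'd:/y' is applied, so the three-argument form can only agree with it if it forgets 'c:/x' as well.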
msg209482 - (view) Author: Roundup Robot (python-dev) Date: 2014-01-27 21:16
New changeset 6b314f5c9404 by Serhiy Storchaka in branch '2.7':
Issue #19456: ntpath.join() now joins relative paths correctly when a drive
http://hg.python.org/cpython/rev/6b314f5c9404

New changeset f4377699fd47 by Serhiy Storchaka in branch '3.3':
Issue #19456: ntpath.join() now joins relative paths correctly when a drive
http://hg.python.org/cpython/rev/f4377699fd47

New changeset 7ce464ba615a by Serhiy Storchaka in branch 'default':
Issue #19456: ntpath.join() now joins relative paths correctly when a drive
http://hg.python.org/cpython/rev/7ce464ba615a
msg209484 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014-01-27 21:21
Committed the first patch (with a small change: ntpath.join('c:', 'C:') now returns 'C:').

There is one more argument for the first option: it is almost impossible (with the current design) to implement the second option in pathlib.
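
The small change mentioned above concerns drives that differ only in case. A short sketch of the resulting behavior, assuming an interpreter that contains the committed fix:

```python
import ntpath

same_drive_only = ntpath.join('c:', 'C:')
same_drive_rel = ntpath.join('c:a', 'C:b')
other_drive = ntpath.join('c:a', 'D:b')

print(same_drive_only)  # C:
print(same_drive_rel)   # C:a\b
print(other_drive)      # D:b
```

Drives are compared case-insensitively, the spelling from the later argument wins, and only a genuinely different drive discards the earlier path.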
msg209503 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2014-01-28 05:48
Hi Serhiy, there are commented-out lines in the 2.7 version of the patch. Are they there intentionally?

+        #tester("ntpath.join('//computer/share', 'x/y')", '//computer/share\\x/y')
+        #tester("ntpath.join('//computer/share/', 'x/y')", '//computer/share/x/y')
+        #tester("ntpath.join('//computer/share/a/b', 'x/y')", '//computer/share/a/b\\x/y')

+        #tester("ntpath.join('//computer/share', '/x/y')", '//computer/share/x/y')
+        #tester("ntpath.join('//computer/share/', '/x/y')", '//computer/share/x/y')
+        #tester("ntpath.join('//computer/share/a', '/x/y')", '//computer/share/x/y')

http://hg.python.org/cpython/rev/6b314f5c9404
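
For comparison, the commented-out UNC cases above can be run against ntpath.join on a Python 3 interpreter that contains the fix. This sketch only reports mismatches and makes no claim about why the lines were commented out in the 2.7 patch; unc_cases and mismatches are illustrative names.

```python
import ntpath

# (arguments, expected result) pairs taken from the commented-out tests.
unc_cases = [
    (('//computer/share', 'x/y'), '//computer/share\\x/y'),
    (('//computer/share/', 'x/y'), '//computer/share/x/y'),
    (('//computer/share/a/b', 'x/y'), '//computer/share/a/b\\x/y'),
    (('//computer/share', '/x/y'), '//computer/share/x/y'),
    (('//computer/share/', '/x/y'), '//computer/share/x/y'),
    (('//computer/share/a', '/x/y'), '//computer/share/x/y'),
]

mismatches = [
    (args, ntpath.join(*args), expected)
    for args, expected in unc_cases
    if ntpath.join(*args) != expected
]
print(mismatches)
```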
History
Date User Action Args
2014-01-28 05:48:16 berker.peksag set nosy: + berker.peksag; messages: + msg209503
2014-01-27 21:21:39 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg209484; stage: patch review -> resolved
2014-01-27 21:16:44 python-dev set nosy: + python-dev; messages: + msg209482
2014-01-27 20:33:17 brian.curtin set nosy: + zach.ware, - brian.curtin
2014-01-27 20:31:08 r.david.murray set nosy: + r.david.murray; messages: + msg209474
2014-01-27 20:26:35 serhiy.storchaka set nosy: + tim.golden, brian.curtin
2014-01-26 23:24:52 valhallasw set messages: + msg209365
2014-01-26 22:46:48 serhiy.storchaka set messages: + msg209362
2014-01-11 20:03:56 valhallasw set messages: + msg207910
2014-01-11 19:42:08 valhallasw set nosy: + valhallasw; messages: + msg207909
2014-01-11 16:17:45 serhiy.storchaka set files: + ntpath_join_2.patch; messages: + msg207907
2014-01-10 11:41:58 serhiy.storchaka set messages: + msg207851
2013-12-06 17:59:09 serhiy.storchaka set files: + ntpath_join.patch
2013-12-06 17:58:50 serhiy.storchaka set versions: - Python 3.1, Python 3.2; nosy: + serhiy.storchaka; messages: + msg205391; assignee: serhiy.storchaka; stage: patch review
2013-11-04 00:38:43 gvanrossum set messages: + msg202073
2013-11-04 00:27:10 Bruce.Leban set nosy: + Bruce.Leban; messages: + msg202071
2013-11-03 22:54:15 andrei.duma set files: + fix_ntpath_join.patch; keywords: + patch; messages: + msg202065
2013-11-03 21:02:40 gvanrossum set messages: + msg202059
2013-11-03 20:51:27 andrei.duma set messages: + msg202058
2013-11-03 20:32:55 gvanrossum set messages: + msg202054
2013-11-03 19:12:39 andrei.duma set nosy: + andrei.duma
2013-11-02 00:05:25 martin.panter set nosy: + martin.panter
2013-10-30 23:12:37 gvanrossum create