classification
Title: UNC path normalisation issues on Windows
Type: behavior
Stage: test needed
Components: Windows
Versions: Python 3.11
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eryksun, neonene, paul.moore, steve.dower, tim.golden, zach.ware
Priority: normal
Keywords:

Created on 2022-01-06 19:36 by steve.dower, last changed 2022-01-14 16:57 by eryksun.

Messages (4)
msg409903 - Author: Steve Dower (steve.dower) (Python committer) Date: 2022-01-06 19:36
Taken from https://github.com/python/cpython/pull/30362#issuecomment-1006840632

For Windows, should there be tests for invalid UNC paths such as "//", "//..", "//../..", "//../../..", "//server", "//server/..", and "//server/../.."? This will help to ensure that future changes never allow an invalid path to be normalized as a valid path.
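One way to express the property asked for above ("an invalid path is never normalized as a valid path") is sketched below. It is not taken from the CPython test suite. The helper names `has_complete_unc_root` and `paths_made_valid` are hypothetical, and the definition of "valid" is an assumption: a path counts as having a complete UNC root only if both the server and the share component are non-empty and are neither "." nor "..". Device prefixes such as `\\?\` are out of scope.

```python
import ntpath

# Inputs quoted in the message above.
INVALID_UNC_PATHS = [
    "//", "//..", "//../..", "//../../..",
    "//server", "//server/..", "//server/../..",
]


def has_complete_unc_root(path):
    """Return True if path starts with a //server/share style root.

    Assumption: "", "." and ".." are not acceptable server or share names.
    """
    p = path.replace("/", "\\")
    # Exactly two leading separators are required.
    if not p.startswith("\\\\") or p.startswith("\\\\\\"):
        return False
    parts = p[2:].split("\\")
    if len(parts) < 2:
        return False  # no share component at all
    server, share = parts[0], parts[1]
    return all(name not in ("", ".", "..") for name in (server, share))


def paths_made_valid(paths, normalize=ntpath.normpath):
    """Return (input, output) pairs where an invalid input gained a UNC root."""
    offenders = []
    for p in paths:
        result = normalize(p)
        if not has_complete_unc_root(p) and has_complete_unc_root(result):
            offenders.append((p, result))
    return offenders
```

A test could then assert that `paths_made_valid(INVALID_UNC_PATHS)` is empty. Whether that holds depends on the Python version under test, so no result is claimed here.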

Also, it's not a major problem that should prevent merging, but the way repeated slashes are handled prior to the second component of a UNC path is less than ideal:

>>> os.path.normpath('//spam///eggs')
'\\\\spam\\\\eggs'
>>> os.path.normpath('//spam///eggs/..')
'\\\\spam\\\\'
This case isn't a valid UNC share, since it's just "//spam", without a share component. However, the repeated slashes start the filepath part and should be reduced to a single backslash. That's what the GetFullPathNameW() call does in abspath():

>>> os.path.abspath('//spam///eggs')
'\\\\spam\\eggs'
>>> os.path.abspath('//spam///eggs/..')
'\\\\spam\\'
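The separator rule described above (keep the leading pair, reduce every later run of slashes to a single backslash) can be sketched in pure Python. This is only an illustration of that one rule. It is not how GetFullPathNameW or normpath is implemented, it does not resolve ".." components, and the function name `collapse_unc_separators` is hypothetical.

```python
import re


def collapse_unc_separators(path):
    """Convert slashes to backslashes and collapse repeated separators.

    A leading pair of separators (the UNC marker) is preserved as is.
    """
    p = path.replace("/", "\\")
    if p.startswith("\\\\"):
        lead, rest = p[:2], p[2:]
    else:
        lead, rest = "", p
    # Replace each run of two or more backslashes with a single one.
    return lead + re.sub(r"\\{2,}", lambda m: "\\", rest)
```

With this sketch, `collapse_unc_separators('//spam///eggs')` returns `'\\\\spam\\eggs'`, the same string that abspath() reports above for that input.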
msg409908 - Author: Steve Dower (steve.dower) (Python committer) Date: 2022-01-06 20:00
My replies to Eryk's comment copied above:

Yes, always more tests :)

The behaviour of normpath has always been weird and/or incorrect around invalid UNC paths.

For example, on 3.10, normpath("//spam///eggs/..") --> "\\\\spam". Originally, the path was a file path (albeit with an invalid empty share name), and the final path is just a machine name.

Currently on 3.11, normpath("//spam///eggs/..") --> "\\\\spam\\\\". This doesn't match GetFullPathNameW, but at least it leaves the end of the path as a file (with an empty share name).

I don't think it's necessarily obvious which is correct, though matching GetFullPathNameW is certainly the easiest rule for us to use. Matching previous Python versions is also reasonable, though given the input is invalid for its domain we don't really have any obligation to preserve the result.
msg410056 - Author: neonene (neonene) Date: 2022-01-07 23:58
Regarding https://github.com/python/cpython/pull/30362#issuecomment-1005496892

_Py_abspath/_getfullpathname does not always call GetFullPathNameW on 3.11.

Python 3.10.1
>>> nt._getfullpathname('\\\\.\\C:////spam////eggs. . .')
'\\\\.\\C:\\spam\\eggs'

Python 3.11.0a3
>>> nt._getfullpathname('\\\\.\\C:////spam////eggs. . .')
'\\\\.\\C:////spam////eggs. . .'
msg410458 - Author: neonene (neonene) Date: 2022-01-13 03:34
> PathCchSkipRoot() doesn't recognize forward slash as a path separator,
 
I opened issue46362 and PR30571 about the mentioned abspath() behaviors.
History
Date                 User         Action  Args
2022-01-14 16:57:46  eryksun      set     messages: - msg410068
2022-01-13 03:34:16  neonene      set     messages: + msg410458
2022-01-08 01:39:03  eryksun      set     messages: + msg410068
2022-01-07 23:58:47  neonene      set     nosy: + neonene; messages: + msg410056
2022-01-06 20:00:52  steve.dower  set     messages: + msg409908
2022-01-06 19:36:40  steve.dower  create