Author eryksun
Recipients eryksun, 徐彻
Date 2020-02-01.17:47:49
Message-id <1580579270.4.0.898576270968.issue39515@roundup.psfhosted.org>
Content
A Windows path reserves the following characters:

* null, as the string terminator
* slash and backslash, as path separators
* colon as the second character in the first component of
  a non-UNC path, since it's a drive path
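
The drive-colon rule can be seen with the standard library's ntpath module, which applies Windows path rules on any platform. This is only an illustrative sketch; the variable names are made up for the example.

```python
import ntpath  # Windows path rules; importable on any platform

# A colon as the second character of the first component marks a drive,
# whether the rest of the path is absolute or drive-relative.
drive_abs = ntpath.splitdrive('C:\\Temp\\spam')  # ('C:', '\\Temp\\spam')
drive_rel = ntpath.splitdrive('C:spam')          # ('C:', 'spam')

# A UNC path has a "//server/share" drive instead, so the colon rule
# does not apply to it.
unc = ntpath.splitdrive('//server/share/dir')    # ('//server/share', '/dir')
```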

Additionally, a normalized path reserves trailing dots and spaces on names, since they get stripped from the final component (e.g. "C:\Temp\spam. . ." -> "C:\Temp\spam"). WindowsPath could automatically strip trailing dots and spaces from normalized paths, but it would need to exclude extended paths that begin with the "\\?\" prefix.
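
A minimal sketch of that stripping rule, assuming it applies only to the final component and that "." and ".." are left alone. The function name is hypothetical; this is neither what WindowsPath does today nor a full model of Windows normalization (for example, it does not handle a name made up entirely of dots).

```python
def strip_trailing_dots_spaces(path: str) -> str:
    """Approximate the stripping of trailing dots and spaces."""
    # Extended paths (literal "\\?\" prefix) are not normalized.
    if path.startswith('\\\\?\\'):
        return path
    # Split off the final component at the last slash or backslash.
    idx = max(path.rfind('\\'), path.rfind('/'))
    head, name = path[:idx + 1], path[idx + 1:]
    # Assumption: "." and ".." are directory references, not names.
    if name in ('.', '..'):
        return path
    return head + name.rstrip('. ')
```

For instance, strip_trailing_dots_spaces('C:\\Temp\\spam. . .') returns 'C:\\Temp\\spam', while the same name under the "\\?\" prefix is returned unchanged.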

Otherwise the set of reserved characters is a function of device and filesystem namespaces, regardless of the recommendations in "Naming Files, Paths, and Namespaces" [1], which are meant to constrain applications to what is generally allowed. I would prefer for WindowsPath to remain generic enough to support all device and filesystem namespaces. 

For example, the VirtualBox shared-folder filesystem (a mini-redirector to the host system) allows colon, pipe, and control characters in file and directory names:

    >>> import os
    >>> control = '\a\b\t\n\v\f\r'
    >>> special = ':|'
    >>> dirname = f'//vboxsvr/work/nametest/{control}{special}'
    >>> os.makedirs(dirname, exist_ok=True)
    >>> os.listdir('//vboxsvr/work/nametest')[0]
    '\x07\x08\t\n\x0b\x0c\r:|'

Like most filesystems, it reserves the five wildcard characters in base filenames: '*', '?', '<' (DOS_STAR), '>' (DOS_QM), and '"' (DOS_DOT). A filesystem that fails to reserve these wildcard characters cannot properly support WINAPI FindFirstFile[Ex]. The only filesystem I can think of that allows wildcard characters in base names is the named-pipe filesystem. NPFS actually allows any character in a pipe name -- even slash and backslash, since it only supports a single directory, the root directory "//./PIPE/".
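
As a rough sketch, a check for those five wildcard characters in a base filename could look like the following. The names WILDCARDS and has_wildcard are hypothetical, and the check deliberately ignores colon, pipe, and control characters, since whether those are reserved depends on the filesystem.

```python
# The five wildcard characters: '*', '?', '<', '>', and '"'.
WILDCARDS = frozenset('*?<>"')

def has_wildcard(name: str) -> bool:
    """Return True if a base filename contains a wildcard character."""
    return any(ch in WILDCARDS for ch in name)
```

With this, has_wildcard('spam*.txt') is true, while has_wildcard('spam:|') is false.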

That said, a path may specify a stream name instead of a base filename. As is documented in [1], an NTFS stream name reserves colon as a delimiter, i.e. "filename:streamname:streamtype". Stream names can include wildcards, pipe, and control characters. For example:

    >>> control = '\a\b\t\n\v\f\r'
    >>> special = '*?<>"|'
    >>> dirname = 'C:\\Temp\\nametest'
    >>> filename = f'{dirname}\\spam'
    >>> streamname = f'{filename}:{control}{special}'
    >>> os.makedirs(dirname, exist_ok=True)
    >>> streamname
    'C:\\Temp\\nametest\\spam:\x07\x08\t\n\x0b\x0c\r*?<>"|'
    >>> open(streamname, 'w').close()

We can use PowerShell (pwsh) to verify the existence of the stream:

    >>> import subprocess
    >>> cmd = f'pwsh -c (gi "{filename}" -stream *)[1].Stream'
    >>> subprocess.check_output(cmd, text=True).rstrip()
    '\x07\x08\t\n\x0b\x0c\n*?<>"|'
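
To make the "filename:streamname:streamtype" layout concrete, here is a small sketch that splits the final component of a path on the colon delimiter. The helper name split_stream is hypothetical, and defaulting an omitted type to "$DATA" is an assumption that fits ordinary file data streams. It must not be given a full path, since a drive colon would be miscounted.

```python
def split_stream(component: str):
    """Split "filename:streamname:streamtype" into its three parts."""
    parts = component.split(':')
    if len(parts) > 3:
        raise ValueError(f'too many colons in {component!r}')
    # Pad missing fields with empty strings.
    filename, streamname, streamtype = parts + [''] * (3 - len(parts))
    # Assumption: an omitted stream type means an ordinary data stream.
    return filename, streamname, streamtype or '$DATA'
```

For example, split_stream('spam:eggs') gives ('spam', 'eggs', '$DATA'), and a plain 'spam' gives ('spam', '', '$DATA'), i.e. the unnamed data stream.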

In terms of device namespaces, a device that is not mounted by a filesystem can implement practically whatever namespace it wants. But considering "//./" device paths are normalized Windows paths, device namespaces should reserve slash, since the system translates slash to backslash.

[1] https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file