os.path.dirname doesn't handle Windows' URNs correctly #71590

Closed
DustinOprea mannequin opened this issue Jun 27, 2016 · 4 comments
Labels
OS-windows type-bug An unexpected behavior, bug, or error

Comments

@DustinOprea

DustinOprea mannequin commented Jun 27, 2016

BPO 27403
Nosy @pfmoore, @tjguk, @zware, @eryksun, @zooba

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2016-06-27.22:08:52.682>
created_at = <Date 2016-06-27.21:03:41.924>
labels = ['type-bug', 'invalid', 'OS-windows']
title = "os.path.dirname doesn't handle Windows' URNs correctly"
updated_at = <Date 2016-06-28.00:36:05.814>
user = 'https://bugs.python.org/DustinOprea'

bugs.python.org fields:

activity = <Date 2016-06-28.00:36:05.814>
actor = 'eryksun'
assignee = 'none'
closed = True
closed_date = <Date 2016-06-27.22:08:52.682>
closer = 'eryksun'
components = ['Windows']
creation = <Date 2016-06-27.21:03:41.924>
creator = 'Dustin.Oprea'
dependencies = []
files = []
hgrepos = []
issue_num = 27403
keywords = []
message_count = 4.0
messages = ['269404', '269408', '269410', '269412']
nosy_count = 6.0
nosy_names = ['paul.moore', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower', 'Dustin.Oprea']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue27403'
versions = ['Python 2.7', 'Python 3.5']

@DustinOprea

DustinOprea mannequin commented Jun 27, 2016

Notice that os.path.dirname() returns its argument unchanged when it is given a UNC path, regardless of slash type. Oddly, you have to double up the forward slashes (as if you were escaping them) to get the correct result (if you're using forward slashes). Backslashes appear to be broken no matter what.

C:\Python35-32>python
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os.path
>>> os.path.dirname("\\\\a\\b")
'\\\\a\\b'
>>> os.path.dirname("//a/b")
'//a/b'
>>> os.path.dirname("////a//b")
'////a'

Any ideas?

@DustinOprea DustinOprea mannequin added OS-windows type-bug An unexpected behavior, bug, or error labels Jun 27, 2016
@eryksun eryksun closed this as completed Jun 27, 2016
@eryksun

eryksun commented Jun 27, 2016

dirname() is implemented via split(), which begins by calling splitdrive(). The 'drive' for a UNC path is the r"\\server\share" component. For example:

    >>> path = r'\\server\share\folder\file'
    >>> os.path.splitdrive(path)
    ('\\\\server\\share', '\\folder\\file')
    >>> os.path.split(path)
    ('\\\\server\\share\\folder', 'file')
    >>> os.path.dirname(path)
    '\\\\server\\share\\folder'
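
The same behaviour can be checked on any platform with the `ntpath` module, which is what `os.path` resolves to on Windows. A minimal sketch, where the helper name `split_unc` is made up for illustration:

```python
import ntpath  # the Windows flavour of os.path, importable on any OS


def split_unc(path):
    """Return (drive, directory, name) for a Windows path."""
    drive, _ = ntpath.splitdrive(path)
    directory, name = ntpath.split(path)
    return drive, directory, name


# A file below the share: the share is the drive and the last
# component is split off as the name.
print(split_unc(r'\\server\share\folder\file'))

# The share root itself: the whole string is the drive, so there is
# nothing left for dirname() to strip.
print(split_unc(r'\\a\b'))
```

This is why `dirname` on `\\a\b` (or `//a/b`) returns its argument unchanged: the entire string is the "drive".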

If you double the initial slashes, it's no longer a valid UNC path:

    >>> path = r'\\\\server\\share\\folder\\file'
    >>> os.path.splitdrive(path)
    ('', '\\\\\\\\server\\\\share\\\\folder\\\\file')
    >>> os.path.split(path)
    ('\\\\\\\\server\\\\share\\\\folder', 'file')
    >>> os.path.dirname(path)
    '\\\\\\\\server\\\\share\\\\folder'

Windows itself will attempt to handle it as a UNC path, but the path is invalid. Specifically, before passing the path to the kernel, Windows collapses runs of extra slashes, except that when the path starts with more than two slashes, one extra slash is always left behind. For example:

    >>> open(r'\\\\server\\share\\folder\\file')
    Breakpoint 0 hit
    ntdll!NtCreateFile:
    00007ffb`a1f25b70 4c8bd1          mov     r10,rcx
    0:000> !obja @r8
    Obja +00000049781ef160 at 00000049781ef160:
            Name is \??\UNC\\server\share\folder\file
            OBJ_CASE_INSENSITIVE

Notice the extra backslash in "UNC\\server". Thus a valid UNC path must start with exactly two slashes.
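
If you want to catch the doubled-slash mistake before it reaches the OS, a small check on the leading slash count is enough. This is only a sketch of a necessary condition; both helper names are hypothetical, and the check does not tell UNC paths apart from "\\.\" or "\\?\" paths, which also start with two slashes:

```python
def leading_slash_count(path):
    """Count the slashes (either kind) at the very start of path."""
    return len(path) - len(path.lstrip('\\/'))


def has_two_leading_slashes(path):
    """True only if path starts with exactly two slashes."""
    return leading_slash_count(path) == 2


print(has_two_leading_slashes(r'\\server\share\folder\file'))
print(has_two_leading_slashes(r'\\\\server\\share\\folder\\file'))
```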

Using forward slash is generally fine. The Windows API substitutes backslash for slash before passing a path to the kernel. For example:

    >>> open(r'//server/share/folder/file')
    Breakpoint 0 hit
    ntdll!NtCreateFile:
    00007ffb`a1f25b70 4c8bd1          mov     r10,rcx
    0:000> !obja @r8
    Obja +00000049781ef160 at 00000049781ef160:
            Name is \??\UNC\server\share\folder\file
            OBJ_CASE_INSENSITIVE

Except you can't use forward slash with a "\\?\" path, which bypasses normal path processing. For example:

    >>> open(r'\\?\UNC/server/share/folder/file')
    Breakpoint 0 hit
    ntdll!NtCreateFile:
    00007ffb`a1f25b70 4c8bd1          mov     r10,rcx
    0:000> !obja @r8
    Obja +00000049781ef160 at 00000049781ef160:
            Name is \??\UNC/server/share/folder/file
            OBJ_CASE_INSENSITIVE

In the kernel '/' isn't a path separator. It's just another name character, so this looks for a DOS device named "UNC/server/share/folder/file". Microsoft file systems forbid using slash in names (for POSIX compatibility and to avoid needless confusion), but you can use slash in the name of kernel objects such as Event objects, or even in the name of DOS devices defined via DefineDosDevice.
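
Since nothing translates '/' for you once the "\\?\" prefix is in place, the path has to be normalized first. A rough sketch of one way to do that with `ntpath`; `to_extended_path` is a hypothetical helper, not a standard library function, and it deliberately rejects anything it does not understand:

```python
import ntpath


def to_extended_path(path):
    """Return an absolute drive-letter or UNC path in "\\\\?\\" form.

    Slashes are converted and "." / ".." are resolved *before* the
    prefix is added, because the prefix turns that processing off.
    """
    norm = ntpath.normpath(path)
    if norm.startswith(('\\\\?\\', '\\\\.\\')):
        raise ValueError('device paths are out of scope for this sketch')
    drive, rest = ntpath.splitdrive(norm)
    if len(drive) == 2 and drive[1] == ':' and rest.startswith('\\'):
        # Fully-qualified drive-letter path such as C:\Windows
        return '\\\\?\\' + norm
    if drive.startswith('\\\\'):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return '\\\\?\\UNC\\' + norm[2:]
    raise ValueError('not a fully-qualified path: %r' % (path,))


print(to_extended_path('//server/share/folder/file'))
print(to_extended_path('C:/Windows/System32/..'))
```

Relative paths are refused rather than resolved against the current directory, so the result does not depend on process state.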

@DustinOprea

DustinOprea mannequin commented Jun 27, 2016

Thank you for your elaborate response. I appreciate knowing that "\\server\share" could be considered as the "drive" portion of the path.

I'm having trouble determining if "\\?\" is literally some type of valid UNC prefix or you're just using it to represent some format/idea. Just curious.

@eryksun

eryksun commented Jun 28, 2016

Paths starting with "\\.\" (or "//./") and "\\?\" are not UNC paths. I've provided some explanations and examples below, and I also encourage you to read "Naming Files, Paths, and Namespaces":

https://msdn.microsoft.com/en-us/library/aa365247

"\\.\" is the general way to access DOS devices, but with some path processing still enabled. For example:

    >>> files = os.listdir(r'//./C:/Windows/System32/..')
    >>> [x for x in files if x[:2] == 'py']
    ['py.exe', 'pyw.exe']

Notice that using slash and ".." is allowed. This form doesn't allow relative paths that depend on per-drive current directories. It's actually not recommended to use "\\.\" to access files on drive letters. Normally it's used with drive letters only when directly opening a volume. For example:

    >>> fd = os.open(r'\\.\C:', os.O_RDONLY | os.O_BINARY)
    >>> os.read(fd, 512)[:7]
    b'\xebR\x90NTFS'

The "\\?\" prefix allows the most access to the NT kernel namespace from within the Windows API (e.g. file paths can be up to 32K characters instead of the DOS limit of 260 characters). It does so by disabling all path processing, which means the onus is on the programmer to provide a fully-qualified, Unicode path that only uses backslash as the path separator.
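
To make the distinction between the three prefix families concrete, here is a small classifier. It is a sketch based only on the shape of the prefix; the function name and the category labels are invented for this example:

```python
def classify_prefix(path):
    """Return 'extended', 'device', 'unc' or 'other' for a Windows path."""
    if path.startswith('\\\\?\\'):
        # Only the backslash spelling is accepted here, since this
        # prefix disables slash translation.
        return 'extended'
    norm = path.replace('/', '\\')
    if norm.startswith('\\\\.\\'):
        # "\\.\" and "//./" both reach the DOS device namespace.
        return 'device'
    if norm.startswith('\\\\') and norm[2:3] != '\\' and norm[2:4] != '?\\':
        # Exactly two leading slashes and not one of the prefixes above.
        return 'unc'
    return 'other'


for p in (r'\\?\UNC\server\share', '//./C:/Windows',
          r'\\server\share\folder', r'C:\Windows'):
    print(p, '->', classify_prefix(p))
```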

So why does "\\.\" exist? Some DOS devices are made implicitly available in the Windows API, such as DOS drive letters and "CON". However, the Windows API greatly extends the number of 'DOS' devices (e.g. the "PhysicalDrive0" device for low-level access to the first disk). Accessing these devices unambiguously requires the "\\.\" prefix. A common example is using "\\.\pipe\[pipe name]" to open a NamedPipe. You can even list the NamedPipe filesystem in Python. For example:

    >>> import multiprocessing, os
    >>> p1, p2 = multiprocessing.Pipe()
    >>> [x for x in os.listdir(r'\\.\pipe') if x[:2] == 'py']
    ['pyc-719-1-hoirbkzb']

Global DOS device names are defined in the kernel's "\Global??" directory. Some DOS devices, such as mapped network drives, are usually defined local to a logon session in the kernel's "\Sessions\0\DosDevices\[Logon Session ID]" directory. In the examples I gave, you may have noticed that each native kernel path starts with "\??\". This is a virtual directory in the kernel (and only the kernel). It instructs the object manager to first search the local session DOS devices and then the global DOS devices.

A DOS device is almost always implemented as an object symbolic link to the real NT device name in the kernel's "\Device" directory. For example, "\Global??\PIPE" links to "\Device\NamedPipe" and the "C:" drive may be a link to "\Device\HarddiskVolume2". This device is what the kernel actually opened in the previous example that read from "\\.\C:". Note that this accesses the volume itself, not the root directory of the filesystem that resides on the volume. The latter is "\\.\C:\". The trailing backslash makes all the difference. (Opening a directory such as the latter requires backup semantics, as described in the CreateFile docs.)
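
As a mental model only, the "\??\" lookup can be pictured as a chained dictionary: session-local names first, then global names, with the match replaced by its link target. The toy below is not how the object manager is implemented; the table contents, the placeholder target for "Z:" and the function name are all assumptions made for illustration:

```python
from collections import ChainMap

# Illustrative tables; real systems will have different contents.
global_devices = {
    'PIPE': r'\Device\NamedPipe',
    'C:': r'\Device\HarddiskVolume2',
}
session_devices = {
    'Z:': '<target of a mapped network drive>',  # placeholder, not a real target
}

# ChainMap searches session_devices first, then global_devices.
dos_devices = ChainMap(session_devices, global_devices)


def resolve_dos_device(native_path):
    r"""Replace the \??\NAME part of a native path with its link target."""
    prefix = '\\??\\'
    if not native_path.startswith(prefix):
        raise ValueError('expected a path starting with \\??\\')
    name, sep, rest = native_path[len(prefix):].partition('\\')
    # Upper-casing stands in for a case-insensitive lookup.
    target = dos_devices[name.upper()]
    return target + sep + rest


print(resolve_dos_device(r'\??\pipe\some-pipe-name'))
print(resolve_dos_device(r'\??\C:'))    # the volume device itself
print(resolve_dos_device('\\??\\C:\\'))  # the root directory on that volume
```

The last two lines show the trailing-backslash difference: the first resolves to the bare device, the second to a path on it.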

If a DOS drive letter is assigned to a volume, the assignment is stored in the registry by the volume's ID. (Dynamic volumes that span multiple disks also contain a drive letter hint.) For volume devices, the kernel also creates a GUID name that's always available and allows mounting a volume in a directory using an NTFS reparse point (e.g. see the output of mountvol.exe). You can also use GUID volume names in the Windows API. For example:

    >>> path = r'\\?\Volume{1693b540-0000-0000-0000-612e00000000}\Windows'
    >>> files = os.listdir(path)
    >>> [x for x in files if x[:2] == 'py']
    ['py.exe', 'pyw.exe']

But normally you'd just mount the volume, which can even be recursively mounted within itself. For example:

    >>> import os, subprocess
    >>> os.mkdir('C:\\SystemVolume')
    >>> subprocess.call(r'mountvol C:\SystemVolume \\?\Volume{1693b540-0000-0000-0000-612e00000000}')
    0
    >>> files = os.listdir(r'C:\SystemVolume\Windows')
    >>> [x for x in files if x[:2] == 'py']
    ['py.exe', 'pyw.exe']

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022