Author eryksun
Recipients eryksun, paul.moore, serhiy.storchaka, steve.dower, tim.golden, zach.ware
Date 2017-03-26.12:54:32
Message-id <1490532872.79.0.603014732784.issue27827@psf.upfronthosting.co.za>
Content
For COM[n] and LPT[n], only ASCII 1-9 and superscript 1-3 (U+00b9, U+00b2, and U+00b3) are handled as decimal digits. For example:

    >>> from nt import _getfullpathname
    >>> print(*(ascii(chr(c)) for c in range(1, 65536)
    ...     if _getfullpathname('COM%s' % chr(c))[0] == '\\'), sep=', ')
    '1', '2', '3', '4', '5', '6', '7', '8', '9', '\xb2', '\xb3', '\xb9'

The implementation uses iswdigit in ntdll.dll. (ntdll.dll is the system DLL that contains the user-mode runtime library and the syscall stubs, except for the Win32k syscall stubs, which are in win32u.dll.) ntdll's private CRT uses the C locale (Latin-1, not just ASCII), and it classifies these superscript digits as decimal digits:

    >>> import ctypes
    >>> ntdll = ctypes.WinDLL('ntdll')
    >>> print(*(chr(c) for c in range(1, 65536) if ntdll.iswdigit(c)))
    0 1 2 3 4 5 6 7 8 9 ² ³ ¹
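
For comparison, here is a minimal sketch of how Python itself classifies the same characters. It runs on any platform. The helper name classify is only for illustration and is not part of any patch:

```python
import unicodedata

def classify(ch):
    """Return (general category, str.isdecimal(), str.isdigit()) for ch."""
    return unicodedata.category(ch), ch.isdecimal(), ch.isdigit()

# ASCII '1', the three Latin-1 superscripts, and U+2074 (superscript four).
for ch in '1\xb9\xb2\xb3\u2074':
    print(ascii(ch), *classify(ch))
```

The superscripts are in category No rather than Nd, so str.isdecimal() is False for them even though str.isdigit() is True.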

Unicode, and thus Python, does not classify these superscript digits as decimal digits (str.isdecimal() returns False for them), so I just hard-coded the list.
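
As a rough sketch of what a hard-coded list could look like (this is not the code from the patch; the names _DEVICE_DIGITS and is_com_or_lpt_name are hypothetical, and the check assumes a bare name with no directory, extension, or trailing spaces):

```python
# Hypothetical helper for illustration only.
# The characters that were accepted after COM/LPT in the test above:
# ASCII 1-9 plus superscript one, two and three.
_DEVICE_DIGITS = frozenset('123456789\xb9\xb2\xb3')

def is_com_or_lpt_name(name):
    """Return True if a bare name has the form COM[n] or LPT[n]."""
    return (len(name) == 4
            and name[:3].upper() in ('COM', 'LPT')
            and name[3] in _DEVICE_DIGITS)

print(is_com_or_lpt_name('COM1'))       # ASCII digit
print(is_com_or_lpt_name('lpt\xb2'))    # superscript two, lower-case prefix
print(is_com_or_lpt_name('COM\u2074'))  # superscript four is not in the list
```

A membership test against a fixed set avoids depending on either str.isdecimal() or str.isdigit(), neither of which matches the set that iswdigit accepted here.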

Here's an example with an attached debugger to show the runtime library calling iswdigit:

    >>> name = 'COM\u2074'
    >>> _getfullpathname(name)

    Breakpoint 0 hit
    ntdll!iswdigit:
    00007ffe`9ad89d90 ba04000000      mov     edx,4
    0:000> kc 6
    Call Site
    ntdll!iswdigit
    ntdll!RtlpIsDosDeviceName_Ustr
    ntdll!RtlGetFullPathName_Ustr
    ntdll!RtlGetFullPathName_UEx
    KERNELBASE!GetFullPathNameW
    python36_d!os__getfullpathname_impl

The argument is in register rcx:

    0:000> r rcx
    rcx=0000000000002074

Skip to the ret instruction, and check the result in register rax:

    0:000> pt
    ntdll!iswctype+0x20:
    00007ffe`9ad89e40 c3              ret
    0:000> r rax
    rax=0000000000000000
    0:000> g

Since U+2074 isn't considered a decimal digit, 'COM⁴' is not a reserved DOS device name. The system handles it as a regular filename, so the call returns an ordinary path:

    'C:\\Temp\\COM⁴'