Author eryksun
Recipients Drekin, abarry, eryksun, ezio.melotti, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Date 2016-07-08.21:27:23
Message-id <1468013243.98.0.619398990825.issue27469@psf.upfronthosting.co.za>
Content
Nothing can be done about this from Python. It's a bug in how Explorer handles the dropped filename. 

Note that it's not simply replacing Unicode characters with question marks. It's using a best-fit ANSI encoding. For example, codepage 1252 maps "Ā" to "A". If there's no defined best-fit mapping, most codepages default to using "?" as the replacement character when encoding via WideCharToMultiByte. When decoding via MultiByteToWideChar, some codepages (e.g. 932) use the katakana middle dot (U+30FB) as the default instead of a question mark.
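
To make the difference between best-fit mapping and plain "?" replacement concrete, here is a rough, portable sketch. It does not call WideCharToMultiByte and does not use the real Windows best-fit tables. The function name approx_best_fit is made up for illustration, and stripping combining marks after NFD normalization is only an assumption that happens to reproduce the "Ā" to "A" case.

```python
import unicodedata


def approx_best_fit(text, codepage="cp1252"):
    """Roughly imitate a best-fit ANSI round trip.

    ASSUMPTION: this is an approximation for illustration only. Windows
    uses its own per-codepage best-fit tables, which this does not consult.
    """
    out = []
    for ch in text:
        try:
            # Characters the codepage can represent survive unchanged.
            ch.encode(codepage)
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        # Decompose and drop combining marks, e.g. "Ā" -> "A" + U+0304 -> "A".
        base = "".join(
            c for c in unicodedata.normalize("NFD", ch)
            if not unicodedata.combining(c)
        )
        try:
            if not base:
                raise UnicodeEncodeError(codepage, ch, 0, 1, "no base character")
            base.encode(codepage)
            out.append(base)
        except UnicodeEncodeError:
            # No usable fallback: use the common default replacement character.
            out.append("?")
    return "".join(out)


if __name__ == "__main__":
    print(approx_best_fit("C:\\Temp\\Ā.txt"))  # C:\Temp\A.txt
    print(approx_best_fit("C:\\Temp\\日.txt"))  # C:\Temp\?.txt
```

Either way the result names a different file than the one that was dropped, so the script cannot recover the original name from its arguments.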

For example, here's the commandline of py.exe when I drop a file named "Ā.txt" on a script. Note that "Ā" becomes "A":

    0:000> ?? @$peb->ProcessParameters->CommandLine
    struct _UNICODE_STRING
     ""C:\Windows\py.exe" "C:\Temp\test.py" C:\Temp\A.txt "
       +0x000 Length           : 0x68
       +0x002 MaximumLength    : 0x6a
       +0x004 Buffer           : 0x00771d50  ""C:\Windows\py.exe" "C:\Temp\test.py" C:\Temp\A.txt "
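
If you don't have a debugger attached, a script can show roughly the same thing from its side. This is a minimal sketch of what a test.py could contain (the contents of my test.py aren't shown here, and describe_args is a hypothetical helper name). It prints each argument with ascii() so that non-ASCII characters appear as escapes and can't be confused by the console font or encoding.

```python
import sys


def describe_args(argv):
    """Return an ASCII-only representation of each command-line argument."""
    # ascii() escapes non-ASCII characters, so "Ā" is shown as \u0100,
    # while a best-fit "A" is shown as a plain "A".
    return [ascii(arg) for arg in argv]


if __name__ == "__main__":
    for index, text in enumerate(describe_args(sys.argv)):
        print(index, text)
```

If the dropped file is named "Ā.txt" and the argument is printed with a plain "A" instead of \u0100, the name was already altered before Python started.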

It's bizarre that Explorer encodes the filename as ANSI just to decode it again when it calls CreateProcess. But Explorer probably still has a lot of old code from back when it had to run on both Windows NT and DOS-based Windows 9x. This is probably a vestige of some workaround.

It isn't a problem if you ShellExecuteEx the Python script. For example, I ran "C:\Temp\test.py C:\Temp\Ā.txt" in the command prompt, and here's the resulting command line:

    0:000> ?? @$peb->ProcessParameters->CommandLine
    struct _UNICODE_STRING
     ""C:\Windows\py.exe" "C:\Temp\test.py"  C:\Temp\Ā.txt"
       +0x000 Length           : 0x68
       +0x002 MaximumLength    : 0x6a
       +0x004 Buffer           : 0x00981cf8  ""C:\Windows\py.exe" "C:\Temp\test.py"  C:\Temp\Ā.txt"

Explorer actually handles drag and drop correctly when dropping the file on another window. So as a (clunky) workaround, you can drag the script icon into a command prompt or Win+R run dialog, and then drag the target file. The shell should add quotes where required.