
Author davispuh
Recipients davispuh, eryksun, ezio.melotti, martin.panter, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Date 2016-06-04.01:53:27
Message-id <1465005209.34.0.503048719983.issue27179@psf.upfronthosting.co.za>
In-reply-to
Content
> qprocess.exe (also to console)
> quser.exe (also to console)

These are broken (see http://i.imgur.com/0zIhHrv.png):

    >chcp 1257
    >quser
     USERNAME              SESSIONNAME
     dƒvis                 console

    > chcp 775
    > quser
     USERNAME              SESSIONNAME
     dāvis                 console


We have to decide which code page to use as the default, and it should cover most cases rather than a minority of programs. So I would say that using the console's code page when it's available makes the most sense, and when it isn't, we fall back to the ANSI code page.
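
A minimal sketch of that default, not the code from the patch. The names pick_encoding and guess_child_encoding are made up for illustration, and it assumes that GetConsoleOutputCP returning 0 means no console is attached:

```python
import ctypes
import locale
import sys


def pick_encoding(console_cp, ansi_encoding):
    """Prefer the console's code page; fall back to the ANSI encoding."""
    # Assumption: a console code page of 0 means "no console available".
    if console_cp:
        return "cp%d" % console_cp
    return ansi_encoding


def guess_child_encoding():
    """Hypothetical helper: guess the encoding of a child's output."""
    console_cp = 0
    if sys.platform == "win32":
        # Only call the Win32 API on Windows; elsewhere keep the fallback.
        console_cp = ctypes.windll.kernel32.GetConsoleOutputCP()
    return pick_encoding(console_cp, locale.getpreferredencoding(False))


print(guess_child_encoding())
```

With a console set to code page 775 the first branch would give "cp775"; without a console the locale's preferred encoding is used instead.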

For the special cases where our guess is wrong, only the user can know which encoding is right, so they must specify it.


I also checked the cmd /u flag, and it is totally useless here because it applies only to cmd itself and not to any other programs. To use it we would need to check whether the returned output is actually UTF-16 or some other encoding, which might even pass as valid UTF-16.

For example,
    cmd /u /c "echo ā"
returns ā in UTF-16, but with
    cmd /u /c "sc query"
the result is encoded in the OEM code page (775 for me), with no sign of UTF-16.
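
To illustrate why checking "is it UTF-16?" is unreliable, here is a small sketch. looks_like_utf16 is a hypothetical naive check, and the sample bytes are made up, standing in for output from a program that ignores /u:

```python
def looks_like_utf16(data):
    """Naive check: does the data decode as UTF-16-LE without errors?"""
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


real_utf16 = "ā".encode("utf-16-le")   # what cmd /u /c "echo ā" would produce
oem_text = b"SERVICE_NAME"             # plain single-byte text, even length

# Both pass the naive check, although only the first one is really UTF-16.
print(looks_like_utf16(real_utf16))
print(looks_like_utf16(oem_text))

# Decoding the single-byte text as UTF-16 "succeeds" but yields garbage.
print(ascii(oem_text.decode("utf-16-le")))
```

Any even-length run of ASCII bytes decodes cleanly as UTF-16-LE, so a successful decode tells us nothing about what the child actually wrote.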


I looked for a function that returns the encoding used by a child process, but there isn't one. I would have expected something like GetConsoleOutputCP(hThread).
So the only way to get it is to call GetConsoleOutputCP inside the child process with CreateRemoteThread. It's not pretty and quite hacky, but it does work; I tested it.

Anyway, even with that we would need to change something about TextIOWrapper, because we create it before the process is even started and the encoding can't be changed later.
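
One way around that ordering problem, shown only as a sketch and not as what the patch does, is to read raw bytes from the pipe and decode them once the encoding is known. run_and_decode is a hypothetical helper, and the child here is a stand-in Python process that writes cp775 bytes:

```python
import subprocess
import sys

# Stand-in child: writes "dāvis" encoded in cp775 to its stdout.
child_code = "import sys; sys.stdout.buffer.write('d\\u0101vis'.encode('cp775'))"


def run_and_decode(args, encoding):
    """Capture the child's output as bytes and decode it afterwards."""
    # No TextIOWrapper is created here, so the encoding can be chosen late.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    raw, _ = proc.communicate()
    return raw.decode(encoding)


text = run_and_decode([sys.executable, "-c", child_code], "cp775")
print(ascii(text))
```

The cost is that the output can no longer be read incrementally as text while the process is still running.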




I updated the patch: it fixes the issues with creationflags and also adds an option to change the encoding, based on subprocess3.patch (from #6135).

So now, with my patch, it really works for most cases.

    >python -c "import subprocess; subprocess.getstatusoutput('ā')"

works for me with the correct encoding when the console's code page is set to any of 775 (OEM), 1257 (ANSI) and 65001 (UTF-8).

It also works correctly with any of DETACHED_PROCESS, CREATE_NEW_CONSOLE and CREATE_NO_WINDOW:

    >python -c "import subprocess; subprocess.getstatusoutput('ā', creationflags=0x00000008)"


This also works correctly with console code pages 775, 1257 and 65001:

    >python -c "from distutils import _msvccompiler; _msvccompiler._get_vc_env('')"



And finally:

    > chcp 1257
    > python -c "import subprocess; print(subprocess.check_output('quser', encoding='cp775'))"
    USERNAME              SESSIONNAME
    dāvis                 console

This also works correctly with any console code page, even when the output wasn't displayed with the correct encoding inside cmd itself.
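
For readers without quser at hand, the same idea can be tried with a stand-in child. This is a sketch that assumes Python 3.6 or later, where the standard subprocess functions accept encoding and errors arguments. The child process and its output are made up for illustration:

```python
import subprocess
import sys

# Stand-in for a tool that always writes in the OEM code page (cp775).
child_code = "import sys; sys.stdout.buffer.write('d\\u0101vis'.encode('cp775'))"
cmd = [sys.executable, "-c", child_code]

# The caller knows the right encoding and says so explicitly.
right = subprocess.check_output(cmd, encoding="cp775")

# A wrong guess (the ANSI code page here) garbles the non-ASCII letter.
wrong = subprocess.check_output(cmd, encoding="cp1257", errors="replace")

print(ascii(right))
print(ascii(wrong))
```

The explicit encoding gives the expected name regardless of the console's current code page, which is the special case where only the user can know the right answer.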