Message318807
In subprocess, shell=True is implemented on Windows by launching a subprocess with the command line {comspec} /c "{args}" (normally comspec=cmd.exe).
By default, the output of cmd is encoded with the "active" code page. Since Python 3.6, you can decode this using encoding='oem'.
However, this actually loses information. For example, try creating a file whose name contains characters that your active code page cannot represent, and then run subprocess.check_output('dir', shell=True). In the output, the filename is replaced with question marks (not by Python, by cmd!).
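The loss can be imitated in pure Python. This is only a rough analogy, not what cmd does internally: it assumes cp437 as the active code page and uses a Hebrew filename chosen for illustration.

```python
# Rough analogy only: cmd does its own conversion; this just shows why a
# single-byte code page cannot round-trip arbitrary filenames.
filename = "\u05e9\u05dc\u05d5\u05dd.txt"  # four Hebrew letters + ".txt"

# Assumption: the active code page is cp437 (common on US English systems).
as_codepage_bytes = filename.encode("cp437", errors="replace")

# Each unrepresentable character has already become "?", so decoding
# cannot bring the original name back.
round_tripped = as_codepage_bytes.decode("cp437")
print(round_tripped)  # ????.txt

# UTF-16-LE, by contrast, round-trips the name without loss.
lossless = filename.encode("utf-16-le").decode("utf-16-le")
```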
To get the correct output, cmd has a "/u" switch (this switch has probably existed forever - at least since Windows NT 4.0, by my internet search). The output can then be decoded using encoding='utf-16-le', like any native Windows string.
Currently, Popen constructs the command line in this hardcoded format: {comspec} /c "{args}". There is no way to get /u in there with the shell=True shortcut, so you have to write your own wrapping code.
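For reference, the wrapping code I mean looks roughly like this. It is a minimal sketch: the helper names build_unicode_cmdline and check_output_unicode are made up for this example, and the second function is only meaningful on Windows. Keep in mind that /u affects the output of cmd's internal commands (such as dir) when it goes to a pipe or file, not the output of external programs.

```python
import os
import subprocess


def build_unicode_cmdline(command, comspec=None):
    """Build the same command line as shell=True, but with /u before /c.

    Hypothetical helper, not part of subprocess.
    """
    if comspec is None:
        comspec = os.environ.get("COMSPEC", "cmd.exe")
    return '{} /u /c "{}"'.format(comspec, command)


def check_output_unicode(command):
    """Run a cmd command with /u and decode its output (Windows only)."""
    # shell=True is deliberately not used: the cmd invocation is spelled
    # out by hand so that /u can be included.
    return subprocess.check_output(
        build_unicode_cmdline(command), encoding="utf-16-le"
    )
```

On Windows, check_output_unicode('dir') should then return the directory listing with non-ASCII filenames intact.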
I suggest adding a feature to Popen where /u may be inserted before the /c within the shell=True shortcut. I've thought of several ways to implement this:
1. A new argument to Popen, which indicates that we want Unicode shell output; if True, add the /u. Note that we already have a couple of Windows-only arguments to Popen, so this would not set a precedent.
2. If the encoding argument is 'utf-16-le' or one of its aliases, then add the /u.
3. If the encoding argument is not None, then add the /u.
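The three options can be compared with a small sketch. Everything here is hypothetical: wants_unicode_shell, shell_cmdline and the unicode_shell argument are names invented for illustration and do not exist in subprocess. Option 2 relies on codecs.lookup to resolve aliases of 'utf-16-le'.

```python
import codecs


def wants_unicode_shell(option, encoding=None, unicode_shell=False):
    """Decide whether /u should be added, under proposal 1, 2 or 3."""
    if option == 1:
        # Option 1: an explicit new (hypothetical) Popen argument.
        return bool(unicode_shell)
    if option == 2:
        # Option 2: only when encoding is 'utf-16-le' or one of its aliases.
        if encoding is None:
            return False
        try:
            return codecs.lookup(encoding).name == "utf-16-le"
        except LookupError:
            return False
    if option == 3:
        # Option 3: whenever any encoding is requested.
        return encoding is not None
    raise ValueError("option must be 1, 2 or 3")


def shell_cmdline(args, add_u, comspec="cmd.exe"):
    """Format the shell=True command line, optionally with /u before /c."""
    switches = "/u /c" if add_u else "/c"
    return '{} {} "{}"'.format(comspec, switches, args)
```

Note that under option 3, a caller passing encoding='oem' today would start receiving UTF-16 output, so options 1 and 2 look safer for existing code.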
Date | User | Action | Args
2018-06-06 10:01:53 | Yoni Rozenshein | set | recipients: + Yoni Rozenshein
2018-06-06 10:01:53 | Yoni Rozenshein | set | messageid: <1528279313.05.0.592728768989.issue33780@psf.upfronthosting.co.za>
2018-06-06 10:01:52 | Yoni Rozenshein | link | issue33780 messages
2018-06-06 10:01:52 | Yoni Rozenshein | create |