
Author davispuh
Recipients davispuh, eryksun, ezio.melotti, martin.panter, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Date 2016-06-02.17:08:06
Message-id <1464887286.86.0.463928107843.issue27179@psf.upfronthosting.co.za>
Content
There is a right encoding: it's the encoding that's actually used. Here we're inside subprocess.Popen, which makes the actual _winapi.CreateProcess call, so we can check for any creationflags and adjust the encoding logic accordingly. I would say almost all Windows console programs use the console's encoding for input/output, because otherwise the user wouldn't be able to read it. For programs that use a different encoding, it would be the caller's responsibility to set the encoding, because only the caller could know which encoding that program uses.

So I think basically Popen should accept an encoding parameter, which would then be passed to TextIOWrapper. Preferably there would be a way to set a different encoding for each of stdin/stdout/stderr.

Then, if no encoding is specified, we use our own logic to determine the default encoding, which would be by using _Py_device_encoding(fd), and this would be right for almost all, if not all, cases. And if some program changes the console's encoding after we have read it, we could query the encoding again after the program has run and then use the newly set console encoding.
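As a rough sketch of the idea above (not the attached patches): an explicit encoding supplied by the caller wins, otherwise the console's output code page is tried, and only then the locale default. The helper names `console_output_encoding`, `choose_encoding` and `read_child_stdout` and the exact fallback order are assumptions made for illustration; `GetConsoleOutputCP` is reached through ctypes and only on Windows.

```python
import ctypes
import io
import locale
import subprocess
import sys


def console_output_encoding():
    """Return the console's output code page as a codec name, or None.

    Hypothetical helper: asks Windows for the console code page directly,
    independent of which fd the data will arrive on.
    """
    if sys.platform != "win32":
        return None
    cp = ctypes.windll.kernel32.GetConsoleOutputCP()
    # GetConsoleOutputCP returns 0 when it cannot report a code page.
    return "cp%d" % cp if cp else None


def choose_encoding(explicit=None):
    """Explicit encoding wins; otherwise console code page, then locale."""
    return (explicit
            or console_output_encoding()
            or locale.getpreferredencoding(False))


def read_child_stdout(args, encoding=None):
    """Run args and decode the child's stdout with the chosen encoding."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    # Wrap the binary pipe explicitly instead of relying on the default.
    with io.TextIOWrapper(proc.stdout,
                          encoding=choose_encoding(encoding)) as out:
        text = out.read()
    proc.wait()
    return text


# The child emits raw cp1257 bytes, so only the caller knows how to decode them.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write('\\u0101\\u010d'.encode('cp1257'))"]
decoded = read_child_stdout(child, encoding="cp1257")
```

Passing `encoding="cp1257"` here plays the role of the caller who knows what the program writes; leaving it out exercises the default path.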


Anyway, while looking more into this I found why we get the wrong encoding.

Looking at subprocess.check_output, we can see

return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           **kwargs).stdout

that stdout is set to PIPE,

and then in subprocess.Popen.__init__

if c2pread != -1:
    self.stdout = io.open(c2pread, 'rb', bufsize)
    if universal_newlines:
        self.stdout = io.TextIOWrapper(self.stdout)


Here c2pread will be the fd for the pipe (3).

Looking inside _io_TextIOWrapper___init___impl:

fileno = _PyObject_CallMethodId(buffer, &PyId_fileno, NULL);
[...]
int fd = _PyLong_AsInt(fileno);
[...]
self->encoding = _Py_device_encoding(fd);
[...]


So the encoding is set with _Py_device_encoding(3),
but in that function:

    if (fd == 0)
        cp = GetConsoleCP();
    else if (fd == 1 || fd == 2)
        cp = GetConsoleOutputCP();
    else
        cp = 0;


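So the encoding would be correct for stdin/stdout/stderr, but not for a pipe: with cp left at 0, no console code page is reported and TextIOWrapper falls back to its default encoding instead. That is the cause of this issue.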
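The pipe case can be reproduced from Python with a small sketch. It relies on os.device_encoding, which is the Python-level wrapper around _Py_device_encoding; the variable names are illustrative only, and the codec names are normalized with codecs.lookup so that spelling differences such as "UTF-8" versus "utf-8" do not matter.

```python
import codecs
import io
import locale
import os

read_fd, write_fd = os.pipe()

# A pipe is not a console, so no device encoding is reported for it.
pipe_device_encoding = os.device_encoding(read_fd)

# Wrap the pipe the same way Popen does when universal_newlines is set.
reader = io.TextIOWrapper(io.open(read_fd, "rb"))

# With no device encoding available, the wrapper uses the locale default.
wrapper_codec = codecs.lookup(reader.encoding).name
locale_codec = codecs.lookup(locale.getpreferredencoding(False)).name

reader.close()      # also closes read_fd
os.close(write_fd)
```

Comparing `wrapper_codec` with `locale_codec` shows that the wrapper's encoding comes from the locale rather than from the console.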

I see two ways to fix this, and I've attached patches for both options.