classification
Title: Better support for pipe I/O encoding in subprocess
Type:
Stage: resolved
Components: Documentation
Versions: Python 3.3

process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: subprocess seems to use local encoding and give no choice (issue 6135)
Assigned To:
Nosy List: docs@python, ncoghlan, vstinner
Priority: normal
Keywords:

Created on 2011-11-21 01:22 by ncoghlan, last changed 2011-11-21 01:32 by ncoghlan. This issue is now closed.

Messages (3)
msg148022 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2011-11-21 01:22
Currently, pipes in the subprocess module work strictly with bytes I/O, *unless* you set "universal_newlines=True". In that case, it assumes an output encoding of UTF-8 for stdout and stderr and applies universal newlines processing.
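
To make the two modes concrete, here is a minimal sketch of the difference between the default bytes pipe and a pipe opened with universal_newlines=True. The child command and the variable names are illustrative assumptions, and the output is kept to ASCII so the example does not depend on which encoding the text mode picks.

```python
import subprocess
import sys

# Illustrative child process: writes raw bytes with mixed line endings.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'a\\r\\nb\\n')"]

# Default: the pipe carries bytes and nothing is translated.
raw = subprocess.check_output(child)

# universal_newlines=True: the pipe is decoded to str and "\r\n" becomes "\n".
text = subprocess.check_output(child, universal_newlines=True)
```

Here raw is a bytes object that still contains the "\r\n" sequence, while text is a str with normalized newlines. Neither call offers a way to choose the encoding or the error handler.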

When stdin/out/err are remapped to ordinary I/O streams then 'encoding' and 'errors' can be specified as usual, but it is currently challenging to do this for pipes. Since they're created internally by the subprocess module, user code doesn't get the opportunity to wrap them when using the convenience APIs. When using Popen objects, you have to create the object, then wrap each stream individually (rebinding the attributes as you go).
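
The manual workaround for Popen objects might look roughly like the following sketch. It assumes the child emits UTF-8; the child command is an illustrative stand-in, and io.TextIOWrapper is just one way to do the wrapping.

```python
import io
import subprocess
import sys

# Illustrative child process: writes UTF-8 encoded text to stdout.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write('caf\\u00e9\\n'.encode('utf-8'))"]

proc = subprocess.Popen(child, stdout=subprocess.PIPE)

# Create the object first, then wrap the binary pipe and rebind the attribute.
proc.stdout = io.TextIOWrapper(proc.stdout, encoding="utf-8", errors="strict")

with proc.stdout:
    data = proc.stdout.read()
proc.wait()
```

The same rebinding would have to be repeated for stdin and stderr, and it is not available at all through convenience functions that never expose the Popen object.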

My suggestion is that we add a new option for the stdin/out/err arguments:

    class TextPipe:
        def __init__(self, encoding, errors='strict'):
            self.encoding = encoding
            self.errors = errors

So to read UTF-8 encoded data from a subprocess, you could just do:

    data = check_output(cmd, stdout=TextPipe('utf-8'), stderr=STDOUT)
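
As a rough illustration of the intended behaviour, the sketch below emulates the proposal outside the subprocess module. The helper name check_output_text is hypothetical and not part of subprocess; it simply decodes the captured bytes with the settings carried by the TextPipe marker, and it does not apply universal newlines handling.

```python
import subprocess
import sys


class TextPipe:
    """Marker from the proposal: request a pipe that is decoded as text."""

    def __init__(self, encoding, errors='strict'):
        self.encoding = encoding
        self.errors = errors


def check_output_text(cmd, stdout, stderr=None):
    """Hypothetical helper (an assumption, not a subprocess API)."""
    # Capture bytes as usual, then decode according to the TextPipe marker.
    raw = subprocess.check_output(cmd, stderr=stderr)
    return raw.decode(stdout.encoding, stdout.errors)


# Illustrative usage with a child process that prints ASCII text.
cmd = [sys.executable, "-c", "print('hello')"]
data = check_output_text(cmd, stdout=TextPipe('utf-8'), stderr=subprocess.STDOUT)
```

A real implementation inside subprocess would presumably wrap the pipe itself rather than decode after the fact, so that streaming reads from Popen objects would also see text.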

There are at least a couple of other alternatives here:

- separate out the pipe creation logic from the Popen logic so it is possible to create and wrap the pipe objects explicitly and then pass the wrapped pipe object to the subprocess invocation APIs. 'TextPipe' would then actually be such a wrapped pipe, rather than merely instructions to tell Popen what kind of pipe to create.
- instead of adding 'TextPipe', just re-use the PIPE name (with the class itself still being used as a marker constant to request implicit creation of a binary PIPE)
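
The first alternative can already be approximated by hand with os.pipe(), as in this sketch. It assumes the child writes a small amount of UTF-8 output (so a single blocking read is safe), and the child command is an illustrative stand-in.

```python
import os
import subprocess
import sys

# Create the pipe explicitly instead of letting Popen do it internally.
read_fd, write_fd = os.pipe()

# Wrap the read end as a text stream with the desired encoding and errors.
reader = open(read_fd, "r", encoding="utf-8", errors="strict")

child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'hello\\n')"]
proc = subprocess.Popen(child, stdout=write_fd)

# Close the parent's copy of the write end so the reader can reach EOF.
os.close(write_fd)

with reader:
    data = reader.read()
proc.wait()
```

The bookkeeping around closing the write end is the kind of detail that a dedicated pipe-creation API in subprocess could hide from user code.
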
msg148023 - Author: STINNER Victor (vstinner) (Python committer) Date: 2011-11-21 01:25
This issue looks like a duplicate of #6135.
msg148024 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2011-11-21 01:32
Indeed, I'll add my suggestions over there.
History
Date                 User      Action  Args
2011-11-21 01:32:25  ncoghlan  set     status: open -> closed
                                       superseder: subprocess seems to use local encoding and give no choice
                                       messages: + msg148024
                                       assignee: docs@python ->
                                       resolution: duplicate
                                       stage: needs patch -> resolved
2011-11-21 01:25:26  vstinner  set     nosy: + vstinner
                                       messages: + msg148023
2011-11-21 01:22:48  ncoghlan  create