Author izbyshev
Recipients izbyshev, paul.moore, steve.dower, tim.golden, u36959, vstinner, zach.ware
Date 2020-12-22.02:02:16
Message-id <1608602537.95.0.715463893662.issue42707@roundup.psfhosted.org>
In-reply-to
Content
> I've been struggling to understand today why a simple file redirection couldn't work properly today (encoding issues)

The core issue is that "working properly" is not defined in general when we're talking about piping/redirection, as opposed to the console. Different programs that consume Python's output (or produce its input) can have different expectations about data encoding, and there is no way for Python to know them in advance. In your examples, you use programs like "more" and "type" to print Python's output back to the console, so in this case using the OEM code page would produce the result that you expect. But if, for example, Python's output were consumed by a C program that uses plain `fopen()`/`wscanf()`/`wprintf()` to work with text files, the ANSI code page would be appropriate, because that's what the Microsoft C runtime library defaults to for wide-character operations.
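To make the ambiguity concrete, here is a minimal sketch showing that the same character becomes different bytes under the two code pages. I'm assuming `cp850` as the OEM code page and `cp1252` as the ANSI code page (typical for a Western European Windows setup; yours may differ), and the sample text is made up for illustration.

```python
# Hypothetical sample text; any non-ASCII character would do.
text = "\u00e9"  # 'é'

# Assumed code pages: cp850 as OEM, cp1252 as ANSI.
oem_bytes = text.encode("cp850")
ansi_bytes = text.encode("cp1252")

# The two encodings disagree on the byte that represents the character.
print(oem_bytes, ansi_bytes)

# A consumer that decodes ANSI-encoded bytes as OEM gets different text back.
misread = ansi_bytes.decode("cp850")
print(misread == text)
```

Neither byte sequence is "wrong" on its own; which one is correct depends entirely on what the consuming program expects.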

Python has traditionally used the ANSI code page as the default I/O encoding for non-console cases (note that Python makes no distinction between non-console `sys.std*` and the builtin `open()` with respect to encoding), and this default behavior can't be changed. You can use `PYTHONIOENCODING` or enable the UTF-8 mode[1] to override the default encoding.
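If you want to check this on your own machine, something along these lines should do. It is only a sketch: the helper name `default_encodings` is mine, and I use `os.devnull` merely as a convenient file to open. It starts a child interpreter with piped (non-console) stdout and reports the encoding of `sys.stdout` next to the one a plain `open()` picks.

```python
import codecs
import os
import subprocess
import sys

# Child script: report the encoding of piped stdout and of a default open().
CHILD = (
    "import os, sys\n"
    "with open(os.devnull, 'w') as f:\n"
    "    print(sys.stdout.encoding, f.encoding)\n"
)


def default_encodings():
    """Return (stdout_encoding, open_encoding) as normalized codec names."""
    # Drop PYTHONIOENCODING so the child shows its defaults, not an override.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONIOENCODING"}
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,  # piped, so stdout is not a console
        check=True,
    ).stdout
    stdout_enc, open_enc = out.decode("ascii").split()
    # Normalize spellings such as 'UTF-8' vs 'utf-8' before comparing.
    return codecs.lookup(stdout_enc).name, codecs.lookup(open_enc).name


stdout_encoding, open_encoding = default_encodings()
print(stdout_encoding, open_encoding)
```

I'd expect both names to match; on Windows without UTF-8 mode they should both be the ANSI code page.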

Note that in your example you could simply use `PYTHONIOENCODING=cp850`, which would remove the need to use `chcp`.
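As a rough illustration of the effect, the sketch below runs a child interpreter with `PYTHONIOENCODING` set and captures the raw bytes it writes to a pipe. The helper name `run_with_io_encoding` and the sample character are assumptions made for the example.

```python
import os
import subprocess
import sys

# Child script: write a single non-ASCII character ('é') to stdout.
CHILD = "import sys; sys.stdout.write('\\u00e9')"


def run_with_io_encoding(encoding):
    """Run the child with PYTHONIOENCODING set and return its raw stdout bytes."""
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,  # piped, so the console encoding is not involved
        check=True,
    )
    return result.stdout


cp850_output = run_with_io_encoding("cp850")
print(cp850_output)
```

With `cp850` the child emits the OEM byte for the character, which is what "more" and "type" expect on a console using that code page.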

[1] https://docs.python.org/3/using/cmdline.html#envvar-PYTHONUTF8