Author steve.dower
Recipients Drekin, barry, berker.peksag, bkabrda, brett.cannon, martin.panter, ncoghlan, petr.viktorin, rkuska, steve.dower
Date 2015-11-16.18:27:00
Right now all of the tests fail on Windows by default (cp437 for me).

If I change the default IO encoding to utf-8 (hacked into pylifecycle.c, since PYTHONIOENCODING is ignored by subprocesses run with -E), the four "Misconfigured" tests crash at the os.fsencode() call, because "mbcs:strict" cannot encode the characters. This may be a real issue; I haven't dug into it yet.
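The failure mode described above can be reproduced in miniature on any platform. This is only a sketch: it uses "cp437" as a stand-in for "mbcs" (which exists only on Windows), and the helper name can_encode and the sample filename are made up for illustration.

```python
def can_encode(text, encoding, errors="strict"):
    """Return True if text encodes with the given codec and error handler."""
    try:
        text.encode(encoding, errors)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical filename mixing a character cp437 has with ones it lacks.
name = "caf\u00e9_\u4e2d\u6587.txt"

print(can_encode(name, "cp437"))                    # False: strict refuses
print(can_encode(name, "utf-8"))                    # True
print(can_encode(name, "cp437", errors="replace"))  # True, but lossy
```

A strict error handler paired with a narrow code page fails as soon as one character falls outside that code page, which matches the crash seen at os.fsencode().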

Adding more hacks to get past this point brings me back into the ASCII encoding performed by the test, and I'm not sure whether that's just an incorrect assumption for Windows or not.

Separate issue: if I run "chcp 437" before the tests, the output is garbage. If I run "chcp 65001", the characters display correctly in the console font. The std streams' encoding is taken from this value, but cp65001 doesn't map back to UTF-8, which is probably another issue. If I add a separate check to _Py_device_encoding in fileutils.c, I get UTF-8-enabled streams when the console is set to cp65001.
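The kind of check meant here can be sketched in Python, although the real change would live in C inside _Py_device_encoding. The function name codec_for_code_page is hypothetical and this is not the actual patch, only an illustration of special-casing code page 65001.

```python
import codecs

CP_UTF8 = 65001  # Windows code page identifier for UTF-8


def codec_for_code_page(code_page):
    """Hypothetical helper: map a console code page number to a codec name."""
    if code_page == CP_UTF8:
        # Special case: report UTF-8 directly instead of "cp65001".
        return "utf-8"
    return "cp%d" % code_page


for cp in (437, 65001):
    name = codec_for_code_page(cp)
    # codecs.lookup() confirms that the name resolves to a real codec.
    print(cp, "->", codecs.lookup(name).name)
```

Every other code page keeps its "cpNNN" name, so only the UTF-8 console case changes behaviour.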

However, there are still a number of places that use GetACP() to determine the locale and encoding to use, which is incorrect for Unicode-aware programs. In particular, this should not happen:

>>> f=open('test.txt', 'w')
>>> f.encoding

There's no good reason for the default encoding to not be UTF-8 these days, but this is a much bigger change. It's probably worth doing for 3.6, but may need more discussion...
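Until the default changes, the usual workaround is to pass the encoding explicitly. The sketch below compares the implicit and explicit cases. The helper encoding_used is made up for illustration, and the implicit result depends on the machine's locale settings, so no particular value is assumed for it.

```python
import locale
import os
import tempfile


def encoding_used(path, encoding=None):
    """Open path for writing and report the encoding the text layer chose."""
    with open(path, "w", encoding=encoding) as f:
        return f.encoding


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    # Without an explicit argument, open() falls back to a locale-derived value.
    print("locale default:", locale.getpreferredencoding(False))
    print("implicit:", encoding_used(path))
    # Passing encoding= makes the result independent of the machine's locale.
    print("explicit:", encoding_used(path, encoding="utf-8"))
```

Only the explicit form gives the same answer on every machine, which is the portability problem a UTF-8 default would remove.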
Date User Action Args
2015-11-16 18:27:01 steve.dower set recipients: + steve.dower, barry, brett.cannon, ncoghlan, petr.viktorin, berker.peksag, martin.panter, bkabrda, Drekin, rkuska
2015-11-16 18:27:01 steve.dower set messageid: <>
2015-11-16 18:27:01 steve.dower link issue22555 messages
2015-11-16 18:27:00 steve.dower create