Created on 2012-12-01 00:23 by makegho, last changed 2014-06-23 09:59 by makegho. This issue is now closed.
|msg176732 - (view)||Author: Markus Kettunen (makegho)||Date: 2012-12-01 00:23|
In a C application on Windows, at least on MSVC 2010 and Windows 7, do this:

    wprintf(L"Test\n");
    Py_Initialize();
    wprintf(L"Test\n");

Output is:

    Test
    T e s t

I was able to track the issue to fileio.c to the following code block by searching where wprintf breaks:

    if (dircheck(self, nameobj) < 0)
        goto error;
    #if defined(MS_WINDOWS) || defined(__CYGWIN__)
    /* don't translate newlines (\r\n <=> \n) */
    _setmode(self->fd, O_BINARY);   <----- breaks it
    #endif
    if (PyObject_SetAttrString((PyObject *)self, "name", nameobj) < 0)
        goto error;

This can be easily confirmed by adding wprintfs on both sides of _setmode.

This issue was also raised at http://mail.python.org/pipermail/python-list/2012-February/620528.html but no solution was provided back then.
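A likely reading of the "T e s t" output: once the stream is in binary mode, the Microsoft CRT writes wide characters through unconverted, so each ASCII letter reaches the console as a UTF-16LE pair with a zero high byte. The sketch below only simulates that byte layout in portable C. It does not call Py_Initialize() or _setmode(), and the helper names ascii_to_utf16le and render_bytes are hypothetical, as is the assumption that the console shows a NUL byte as a blank.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode an ASCII string as UTF-16LE bytes, which is
 * what a wide character looks like when written without conversion.
 * Returns the number of bytes stored in out (at most cap). */
size_t ascii_to_utf16le(const char *text, unsigned char *out, size_t cap)
{
    size_t len, i, n = 0;

    assert(text != NULL && out != NULL);
    len = strlen(text);
    for (i = 0; i < len && n + 2 <= cap; ++i) {
        out[n++] = (unsigned char)text[i]; /* low byte: the ASCII code */
        out[n++] = 0x00;                   /* high byte: zero for ASCII */
    }
    return n;
}

/* Hypothetical helper: render raw bytes the way a byte-oriented console
 * might, assuming NUL is displayed as a blank. out needs n + 1 bytes. */
void render_bytes(const unsigned char *bytes, size_t n, char *out)
{
    size_t i;

    assert(bytes != NULL && out != NULL);
    for (i = 0; i < n; ++i)
        out[i] = (bytes[i] == 0x00) ? ' ' : (char)bytes[i];
    out[n] = '\0';
}
```

Feeding "Test" through both helpers gives eight bytes and a rendering with a blank after every letter, which matches the spacing reported above.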
|msg176734 - (view)||Author: Markus Kettunen (makegho)||Date: 2012-12-01 00:47|
If the standard streams are not used through Python, this hack can be used to work around the bug on C side:

    #ifdef WIN32
    #include <fcntl.h>   /* O_TEXT */
    #include <io.h>      /* _setmode */
    #endif
    ...
    Py_Initialize();
    #ifdef WIN32
    _setmode(stdin->_file, O_TEXT);
    _setmode(stdout->_file, O_TEXT);
    _setmode(stderr->_file, O_TEXT);
    #endif
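A variant of the same workaround, as a minimal sketch: it uses _fileno() rather than reading the _file member of the FILE structure, which is an internal detail of the MSVC runtime. The helper names restore_text_mode and restore_std_text_modes are hypothetical, and on non-Windows builds both functions are deliberate no-ops so the host code compiles unchanged.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h>  /* _O_TEXT */
#include <io.h>     /* _setmode, _fileno */
#endif

/* Hypothetical helper: put one stream back into text mode.
 * Returns 0 on success (or when there is nothing to do), -1 on failure. */
int restore_text_mode(FILE *stream)
{
    assert(stream != NULL);
#ifdef _WIN32
    return (_setmode(_fileno(stream), _O_TEXT) == -1) ? -1 : 0;
#else
    (void)stream;   /* no text/binary distinction to restore here */
    return 0;
#endif
}

/* Hypothetical helper: reset all three standard streams.
 * Returns the number of streams that could not be reset. */
int restore_std_text_modes(void)
{
    int failures = 0;

    /* Flush pending output before the translation mode changes. */
    fflush(stdout);
    fflush(stderr);

    failures += (restore_text_mode(stdin) != 0);
    failures += (restore_text_mode(stdout) != 0);
    failures += (restore_text_mode(stderr) != 0);
    return failures;
}
```

The intended use is a single call to restore_std_text_modes() right after Py_Initialize(). As with the original hack, this assumes Python code will not use the standard streams afterwards.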
|msg176828 - (view)||Author: STINNER Victor (haypo) *||Date: 2012-12-03 07:23|
_setmode(self->fd, O_BINARY) change was done in Python 3.2: see the issue #10841. This change introduced regressions:

- #11272: "input() has trailing carriage return on windows", fixed in Python 3.2.1
- #11395: "print(s) fails on Windows with long strings", fixed in Python 3.2.1
- #13119: "Newline for print() is \n on Windows, and not \r\n as expected", fixed in Python 3.3 (and will be fixed in Python 3.2.4)

In Python 3.1, _setmode(self->fd, O_BINARY) was already used when Python is called with the -u command line option.

_setmode() supports different options:

- _O_BINARY: no conversion
- _O_TEXT: translate "\n" with "\r\n"
- _O_U8TEXT: UTF-8 without BOM
- _O_U16TEXT: UTF-16 without BOM
- _O_WTEXT: UTF-16 with BOM

I didn't try wprintf(). This function is not used in the Python source code (except in the Windows launcher, which is not part of the main interpreter). I don't know how to fix wprintf().
|msg176837 - (view)||Author: STINNER Victor (haypo) *||Date: 2012-12-03 11:49|
> _setmode(self->fd, O_BINARY) change was done in Python 3.2: see the issue #10841

The main reason was to be able to read binary files from sys.stdin using the CGI module: see the issue #4953. In _O_TEXT mode, a 0x0A byte is written as 0x0D 0x0A (and 0x0D 0x0A is read back as 0x0A), which corrupts binary files.

Articles about _setmode() and wprintf():

- "A confluence of circumstances leaves a stone unturned..." http://blogs.msdn.com/b/michkap/archive/2010/09/23/10066660.aspx
- "Conventional wisdom is retarded, aka What the @#%&* is _O_U16TEXT?" http://blogs.msdn.com/b/michkap/archive/2008/03/18/8306597.aspx

See also issue #1602 (Windows console doesn't print or input Unicode).
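To make the corruption concrete, here is a small portable sketch that mimics what text mode does on output: every LF byte (0x0A) is expanded to CR LF (0x0D 0x0A). It is a simulation only, not the CRT's implementation, and the function name expand_newlines is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy n bytes from in to out, expanding each LF
 * (0x0A) to CR LF (0x0D 0x0A) the way text-mode output does.
 * Stops early if out (capacity cap) is full. Returns bytes written. */
size_t expand_newlines(const unsigned char *in, size_t n,
                       unsigned char *out, size_t cap)
{
    size_t i, w = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; ++i) {
        if (in[i] == 0x0A) {
            if (w + 2 > cap)
                break;
            out[w++] = 0x0D;    /* inserted carriage return */
            out[w++] = 0x0A;
        } else {
            if (w + 1 > cap)
                break;
            out[w++] = in[i];   /* every other byte passes through */
        }
    }
    return w;
}
```

A four-byte payload containing two 0x0A bytes comes out six bytes long, so any binary format with fixed offsets or length fields no longer parses after such a round trip.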
|msg186852 - (view)||Author: John Ehresman (jpe) *||Date: 2013-04-13 21:16|
One way to fix this is to use the ReadFile & WriteFile API functions directly, as proposed in issue 17723. I would regard this as a change in behavior and not a simple bug fix, because there is probably code written for 3.3 that assumes the C-level stdout is in binary mode after Python is initialized, so I would target 3.4 for the change.
|msg220642 - (view)||Author: Mark Lawrence (BreamoreBoy) *||Date: 2014-06-15 14:40|
I'll let our Windows gurus fight over who gets this one :)
|msg220762 - (view)||Author: STINNER Victor (haypo) *||Date: 2014-06-16 20:57|
If I understood correctly, supporting the "wide mode" for wprintf() requires modifying all calls to functions like printf() in Python, and so it requires changing a lot of code. Since this issue was the first time that I heard about wprintf(), I don't think that we should change Python. I'm not going to fix this issue unless many more users ask for it.
|msg221346 - (view)||Author: Markus Kettunen (makegho)||Date: 2014-06-23 09:38|
It's quite common to use wide character strings to support Unicode in C and C++. In C++ this often means using std::wstring and std::wcout. Maybe these are more common than wprintf? In any case the console output breaks as Py_Initialize hijacks the host application's standard output streams which sounds quite illegitimate to me. I understand that Python isn't designed for embedding and it would be a lot of work to fix it, but I would still encourage everyone to take a look at this bug. For me, this was one of the reasons I ultimately had to decide against using Python as my application's scripting language, which is a shame.
|msg221347 - (view)||Author: STINNER Victor (haypo) *||Date: 2014-06-23 09:48|
"In C++ this often means using std::wstring and std::wcout. Maybe these are more common than wprintf? In any case the console output breaks as Py_Initialize hijacks the host application's standard output streams which sounds quite illegitimate to me." On Linux, std::wcout doesn't use wprintf(). Do you mean that std::wcout also depends on the "mode" of stdout (_setmode)?
|msg221348 - (view)||Author: Markus Kettunen (makegho)||Date: 2014-06-23 09:59|
> On Linux, std::wcout doesn't use wprintf(). Do you mean that std::wcout also depends on the "mode" of stdout (_setmode)? Yes, exactly. I originally noticed this bug by using std::wcout on Windows.
|2014-06-23 09:59:21||makegho||set||messages: + msg221348|
|2014-06-23 09:48:05||haypo||set||messages: + msg221347|
|2014-06-23 09:38:19||makegho||set||messages: + msg221346|
|2014-06-16 20:57:20||haypo||set||status: open -> closed; resolution: wont fix; messages: + msg220762|
+ tim.golden, BreamoreBoy, zach.ware, steve.dower|
messages: + msg220642
messages: + msg186852
|2012-12-03 11:49:58||haypo||set||messages: + msg176837|
|2012-12-03 07:23:52||haypo||set||messages: + msg176828|
|2012-12-01 00:47:50||makegho||set||messages: + msg176734|