
Author vstinner
Recipients David.Sankel, amaury.forgeotdarc, christian.heimes, christoph, davidsarah, ezio.melotti, hippietrail, lemburg, mark, pitrou, santoso.wijaya, smerlin, ssbarnea, terry.reedy, tim.golden, tzot, v+python, vstinner
Date 2011-10-19.11:52:56
Message-id <1319025178.07.0.357866658184.issue1602@psf.upfronthosting.co.za>
In-reply-to
Content
I did more tests on the Windows console, focusing on output.

To sum up, if we implement sys.stdout using WriteConsoleW() and sys.stdout.buffer.raw using WriteConsoleA():

 - print() will not fail anymore on unencodable characters, because the string is no longer encoded to the console code page
 - if you set the console font to a TrueType font, most characters will be displayed correctly
 - you don't need to change the (console) code page to CP_UTF8 (65001) anymore if you just use print()
 - you still need cp65001 if the output (stdout and/or stderr) is redirected, or if you use sys.stdout.buffer or sys.stderr.buffer directly
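
The difference between the two write paths can be sketched with ctypes. This is only an illustration, not the proposed implementation: the helper names write_console_w() and is_encodable() are hypothetical, and cp437 is just an example code page. The first helper shows why encoding to a code page can fail; the second calls WriteConsoleW() with the text itself, and gives up when stdout is not a real console (for example when redirected).

```python
import ctypes
import sys

STD_OUTPUT_HANDLE = 0xFFFFFFF5  # (DWORD)-11, the value GetStdHandle() expects


def is_encodable(text, code_page):
    """Return True if `text` can be encoded to `code_page` without error."""
    try:
        text.encode(code_page)
    except UnicodeEncodeError:
        return False
    return True


def write_console_w(text):
    """Try to write `text` with WriteConsoleW(); return True on success.

    Returns False when not on Windows, or when stdout is not a console
    (for example, redirected to a pipe or a file).
    """
    if sys.platform != "win32":
        return False
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [wintypes.DWORD]
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetConsoleMode.argtypes = [wintypes.HANDLE, wintypes.LPDWORD]
    kernel32.GetConsoleMode.restype = wintypes.BOOL
    kernel32.WriteConsoleW.argtypes = [
        wintypes.HANDLE, wintypes.LPCWSTR, wintypes.DWORD,
        wintypes.LPDWORD, wintypes.LPVOID,
    ]
    kernel32.WriteConsoleW.restype = wintypes.BOOL

    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    mode = wintypes.DWORD()
    # GetConsoleMode() fails if the handle is not a console screen buffer.
    if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
        return False
    # WriteConsoleW() counts UTF-16 code units, not Python code points.
    n_units = len(text.encode("utf-16-le")) // 2
    written = wintypes.DWORD()
    ok = kernel32.WriteConsoleW(handle, text, n_units,
                                ctypes.byref(written), None)
    return bool(ok)
```

For example, is_encodable("\u20ac", "cp437") is False, which is the case where print() fails today, while write_console_w("\u20ac\n") never encodes to the code page at all. When write_console_w() returns False, the caller has to fall back to writing encoded bytes, and the code page matters again.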

Other facts:

 - locale.getpreferredencoding() returns the ANSI code page
 - sys.stdin.encoding is the console encoding (GetConsoleCP())
 - sys.stdout.encoding and sys.stderr.encoding are the console output code page (GetConsoleOutputCP())
 - sys.stdout is not a TTY if the output is redirected, e.g. "python script.py|more"
 - sys.stderr is not a TTY if the output is redirected, e.g. "python script.py 2>&1|more" (this example redirects both stdout and stderr; I don't know how to redirect only stderr)
 - WriteConsoleW() is not affected by the console output code page (GetConsoleOutputCP)
 - WriteConsoleA() is indirectly affected by the console output code page: if a string cannot be encoded to the console output code page (e.g. sys.stdout.encoding), there is no encoded result to pass to WriteConsoleA()
 - If the console font is a raster font and the font doesn't contain a character, the console tries to find a similar glyph, or it falls back to the character '?'
 - If the console font is a TrueType font, it is able to display most Unicode characters
History
Date                 User      Action  Args
2011-10-19 11:52:58  vstinner  set     recipients: + vstinner, lemburg, terry.reedy, tzot, amaury.forgeotdarc, pitrou, christian.heimes, tim.golden, mark, christoph, ezio.melotti, v+python, hippietrail, ssbarnea, davidsarah, santoso.wijaya, David.Sankel, smerlin
2011-10-19 11:52:58  vstinner  set     messageid: <1319025178.07.0.357866658184.issue1602@psf.upfronthosting.co.za>
2011-10-19 11:52:57  vstinner  link    issue1602 messages
2011-10-19 11:52:56  vstinner  create