
Author eryksun
Recipients eryksun, ezio.melotti, jaraco, lemburg, loewis, r.david.murray, serhiy.storchaka, vstinner
Date 2014-07-16.17:43:07
Message-id <1405532588.04.0.277423657798.issue21927@psf.upfronthosting.co.za>
Content
> PS C:\Users\jaraco> echo £ | py -3 -c "import sys; print(repr(sys.stdin.buffer.read()))"
> b'?\r\n'

> Curiously, it appears as if powershell is actually receiving 
> a question mark from the pipe.

PowerShell calls ReadConsoleW to read the console input buffer, so it reads "£" from the command line as a wide character. Its default encoding when writing to a pipe should be ASCII [*]. If so, that explains the question mark that Python reads from stdin: "?" is the default replacement character (WC_DEFAULTCHAR) used by WideCharToMultiByte.

[*] http://blogs.msdn.com/b/powershell/archive/2006/12/11/outputencoding-to-the-rescue.aspx
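The substitution can be reproduced on the Python side. This is only a sketch of the effect, assuming the pipe encoder behaves like an ASCII codec that replaces unencodable characters with "?"; it does not call PowerShell or WideCharToMultiByte.

```python
# Sketch: what an ASCII encoder with a "?" fallback does to "£".
# Assumption: this mirrors the pipe encoding step; it is not PowerShell itself.
text = "£\r\n"  # echo appends a line ending; \r\n matches the output quoted above

piped = text.encode("ascii", errors="replace")
print(repr(piped))  # b'?\r\n'
```

The result matches the bytes that `sys.stdin.buffer.read()` returned in the quoted session.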

You can change PowerShell's output encoding to match the console:

    $OutputEncoding = [Console]::OutputEncoding

If the console codepage is 65001, the above is equivalent to setting 

    $OutputEncoding = [System.Text.Encoding]::UTF8

http://msdn.microsoft.com/en-us/library/system.text.encoding.utf8

As Victor mentioned, this setting always writes a BOM, and under codepage 65001 it actually writes 2 BOMs (at least in PowerShell 2). Victor also mentioned that you can avoid the BOM by passing $False to the constructor:

    $OutputEncoding = New-Object System.Text.UTF8Encoding($False)

http://msdn.microsoft.com/en-us/library/system.text.utf8encoding

Even with that setting, a BOM is still written under codepage 65001, but maybe that's fixed in PowerShell 3.
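If the BOM cannot be avoided on the PowerShell side, the reading side can tolerate it. A minimal sketch, assuming the piped data is UTF-8; `decode_piped` is a hypothetical helper name, not part of any library. Note that Python's "utf-8-sig" codec strips only one leading BOM, so the doubled BOM needs an extra step.

```python
import codecs

payload = "£".encode("utf-8")              # b'\xc2\xa3'
one_bom = codecs.BOM_UTF8 + payload        # what a BOM-writing encoder sends
two_boms = codecs.BOM_UTF8 * 2 + payload   # the doubled-BOM case described above


def decode_piped(data: bytes) -> str:
    """Decode UTF-8 bytes and drop any leading BOMs (hypothetical helper)."""
    # "utf-8-sig" removes a single leading BOM; lstrip removes any that remain.
    return data.decode("utf-8-sig").lstrip("\ufeff")


for sample in (payload, one_bom, two_boms):
    print(repr(decode_piped(sample)))
```

In a real script the bytes would come from `sys.stdin.buffer.read()` instead of the hard-coded samples.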

I avoid setting the console to codepage 65001 anyway. ReadFile/WriteFile incorrectly return the number of characters read/written instead of the number of bytes because the call is actually handled by ReadConsoleA/WriteConsoleA. Maybe that's finally fixed in Windows 8.
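To see why a character count in place of a byte count causes trouble, here is a small simulation. It does not call the Windows API; `simulated_console_write` and `write_all` are hypothetical stand-ins, assuming a write call that outputs everything it is given but reports the number of characters, and a caller that trusts the return value as a byte count.

```python
def simulated_console_write(sink: list, data: bytes) -> int:
    """Pretend write call: outputs all of data, but reports characters, not bytes."""
    text = data.decode("utf-8", errors="replace")
    sink.append(text)
    return len(text)


def write_all(sink: list, data: bytes) -> None:
    """A typical retry loop that treats the return value as bytes written."""
    while data:
        written = simulated_console_write(sink, data)
        data = data[written:]  # wrong offset whenever a character spans several bytes


sink = []
write_all(sink, "£1\r\n".encode("utf-8"))  # 4 characters, 5 bytes
print(repr("".join(sink)))  # '£1\r\n\n'
```

The loop believes one byte is still pending after the first call and writes the trailing newline a second time, which is one way a miscounted write can show up as duplicated output.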