Author vstinner
Recipients ezio.melotti, jaraco, lemburg, loewis, vstinner
Date 2014-07-11.14:27:46
Message-id <1405088866.54.0.49745112173.issue21927@psf.upfronthosting.co.za>
Content
See also issues #1602 (Windows console) and #16587 (stdin, _setmode() and wprintf).

I tried msvcrt.setmode(0, 0x40000), which sets the stdin mode to _O_U8TEXT. In this mode, echo "abc"|python -c "import sys; print(ascii(sys.stdin.read()))" displays "\xff\xfea\x00b\x00c\x00\n\x00", which is "abc" encoded to UTF-16 (little endian, with the BOM). b'\xff\xfe' is the Unicode BOM U+FEFF (u'\uFEFF') encoded to UTF-16-LE; for comparison, U+FEFF encoded to UTF-8 gives b'\xef\xbb\xbf'.

So it looks like the stdin mode is not the cause: I tried all modes and I always get the Unicode BOM.
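
The byte sequences quoted above can be checked without a Windows console. The following is a minimal sketch, not part of the original report: it only uses the standard str.encode() and bytes.decode() methods, and the names BOM, observed and decoded are chosen here for illustration. It confirms how U+FEFF encodes in both encodings and that the observed output decodes back to "abc" plus a newline.

```python
# Illustrative check of the BOM bytes discussed in this message.
# The variable names are hypothetical; only built-in codecs are used.

BOM = "\ufeff"  # U+FEFF, the Unicode byte order mark

# U+FEFF encoded to UTF-16-LE and to UTF-8.
bom_utf16_le = BOM.encode("utf-16-le")
bom_utf8 = BOM.encode("utf-8")

# The bytes that appeared (as Latin-1 style escapes) in the reported output.
observed = b"\xff\xfea\x00b\x00c\x00\n\x00"

# The "utf-16" codec reads the BOM to detect endianness and drops it.
decoded = observed.decode("utf-16")

# The "utf-16-le" codec does not interpret the BOM, so U+FEFF is kept.
decoded_keep_bom = observed.decode("utf-16-le")

print(ascii(bom_utf16_le))
print(ascii(bom_utf8))
print(ascii(decoded))
print(ascii(decoded_keep_bom))
```

If the data really is UTF-16 with a BOM, decoding the raw stdin bytes with the "utf-16" codec would recover the text, whatever the stdin mode is set to.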