Message260153
I've come across an issue where Python 3.5.1 appends an extra newline when print()ing non-ASCII strings on Windows.
This only happens when the active "code page" is set to UTF-8 in cmd.exe:
>chcp
Active code page: 65001
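In case it helps anyone trying to reproduce this, the same setting can be checked from inside Python. This is only a rough sketch: `describe_console` is a name I made up, and the `GetConsoleOutputCP` call is only attempted on Windows, so on other systems it just reports the stdout encoding.

```python
import sys


def describe_console():
    """Return what Python reports for stdout, plus the console code page on Windows."""
    info = {"stdout_encoding": getattr(sys.stdout, "encoding", None)}
    if sys.platform == "win32":
        import ctypes
        # GetConsoleOutputCP is the Win32 call behind what chcp displays;
        # 65001 is the UTF-8 code page.
        info["console_output_cp"] = ctypes.windll.kernel32.GetConsoleOutputCP()
    return info


if __name__ == "__main__":
    print(describe_console())
```

With `chcp 65001` active I would expect `console_output_cp` to be 65001, but I'm including this only as a way to double-check the environment.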
Now, if I try to print an ASCII character (e.g. LATIN CAPITAL LETTER A), everything works fine:
>python -c "print(chr(0x41))"
A
>
But if I try to print something a little less common (GREEK CAPITAL LETTER ALPHA), something weird happens:
>python -c "print(chr(0x391))"
Α

>
For another example, let's try to print CYRILLIC CAPITAL LETTER A:
>python -c "print(chr(0x410))"
А

>
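One thing the two failing characters have in common, as opposed to the letter A, is that they take more than one byte in UTF-8. I don't know whether that is actually related to the problem, but it's easy to check. A minimal sketch (`utf8_length` is just a helper name I picked; the output uses only ASCII, so it shouldn't be affected by the issue itself):

```python
def utf8_length(ch):
    """Return the number of bytes needed to encode ch in UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # The three characters from the examples above.
    for code_point in (0x41, 0x391, 0x410):
        encoded = chr(code_point).encode("utf-8")
        print("U+{:04X}: {} byte(s), {!r}".format(
            code_point, utf8_length(chr(code_point)), encoded))
```

LATIN CAPITAL LETTER A is a single byte, while the Greek and Cyrillic letters are two bytes each.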
This only happens if the current code page is UTF-8 though.
If I change it to something that can represent those characters, everything seems to be working fine.
For example, the Greek letter:
>chcp 1253
Active code page: 1253
>python -c "print(chr(0x391))"
Α
>
And the Cyrillic letter:
>chcp 1251
Active code page: 1251
>python -c "print(chr(0x410))"
А
>
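To be clear about what I mean by "can represent those characters": code page 1253 has the Greek letter and code page 1251 has the Cyrillic one. Assuming Python's `cp1251`, `cp1252` and `cp1253` codecs match the corresponding Windows code pages, this can be checked with a small sketch like the following (`can_encode` is a hypothetical helper, not something from the standard library):

```python
def can_encode(ch, codec):
    """Return True if ch can be represented in the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    greek_alpha = chr(0x391)
    cyrillic_a = chr(0x410)
    for codec in ("cp1251", "cp1252", "cp1253"):
        # Print only ASCII so the result is readable on any code page.
        print(codec,
              "greek:", can_encode(greek_alpha, codec),
              "cyrillic:", can_encode(cyrillic_a, codec))
```

Neither letter is available in `cp1252`, which is why I picked a different code page for each example.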
This also happens if one tries to print a string with a funny character somewhere in it. Sometimes it's even worse:
>python -c "print('Привет!')"
Привет!
�т!
>
Look, guys, I know what a mess Unicode handling on Windows is, and I'm not even sure it's Python's fault, I just wanted to make sure I'm not delusional and not making stuff up.
Can somebody at least confirm this? Thank you.
I'm using the x86-64 version of Python 3.5.1 on Windows 8.1.
Date | User | Action | Args
2016-02-12 01:06:05 | Egor Tensin | set | recipients: + Egor Tensin, paul.moore, vstinner, tim.golden, ezio.melotti, zach.ware, steve.dower
2016-02-12 01:06:05 | Egor Tensin | set | messageid: <1455239165.25.0.227087852631.issue26345@psf.upfronthosting.co.za>
2016-02-12 01:06:05 | Egor Tensin | link | issue26345 messages
2016-02-12 01:06:04 | Egor Tensin | create |