classification
Title: Strings undisplayable with repr
Type: Stage:
Components: Interpreter Core, macOS, Windows Versions: Python 3.0
process
Status: closed Resolution: works for me
Dependencies: Superseder:
Assigned To: Nosy List: loewis, michael.foord
Priority: normal Keywords:

Created on 2008-12-08 21:08 by michael.foord, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)
msg77341 - Author: Michael Foord (michael.foord) (Python committer) Date: 2008-12-08 21:08
In Python 3, strings with non-ASCII characters are undisplayable (even
with repr) in the default interactive interpreter on Windows and Mac.
Shouldn't the repr use escapes, as in previous versions of Python?

Python 2.6
Python 2.6 (trunk:66714:66715M, Oct  1 2008, 18:36:04) 
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> d = u'\u20ac'
>>> d
u'\u20ac'

Python 3
Python 3.0 (r30:67503, Dec  6 2008, 21:14:27) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> d = '\u20ac'
>>> d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/io.py", line 1491, in write
    b = encoder.encode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/encodings/ascii.py", line 22, in encode
    return codecs.ascii_encode(input, self.errors)[0]
UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in position 1: ordinal not in range(128)
>>> repr(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/io.py", line 1491, in write
    b = encoder.encode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.0/lib/python3.0/encodings/ascii.py", line 22, in encode
    return codecs.ascii_encode(input, self.errors)[0]
UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in position 2: ordinal not in range(128)
>>>
msg77362 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2008-12-08 23:22
This specific character shouldn't be undisplayable on Windows, not even
on your version of Windows. Instead, Python should find out how to
convert it to your terminal's encoding, rather than using ASCII.

It's a bug that the OSX port is using ASCII; it probably should use UTF-8
by default for the terminal encoding.

In any case, it is *not* a bug that Python doesn't escape the character.
If you want it escaped, use the ascii() builtin function.
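
A minimal sketch of the difference between repr() and ascii() in Python 3, using only the builtins mentioned above. The variable names are illustrative and not taken from the report.

```python
# Illustrative only: compare repr() and ascii() on the euro sign.
euro = '\u20ac'

# In Python 3, repr() keeps printable non-ASCII characters unescaped.
plain = repr(euro)

# ascii() escapes every non-ASCII character, much as repr() did in Python 2.
escaped = ascii(euro)

# The escaped form contains only ASCII characters, so it can be written
# to a terminal regardless of the terminal's encoding.
print(escaped)  # prints '\u20ac'
```

Echoing ascii(d) instead of d at the interactive prompt should therefore avoid the UnicodeEncodeError shown in the original report, at the cost of not showing the character itself.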
msg77365 - Author: Michael Foord (michael.foord) (Python committer) Date: 2008-12-08 23:38
Hmmm... nope - my terminal encoding (according to Python) on Windows
Vista x64 (with 32-bit Python and a vanilla cmd) is cp850.

C:\compile>C:\Python30\python.exe
Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> '\u20ac'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python30\lib\io.py", line 1491, in write
    b = encoder.encode(s)
  File "C:\Python30\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u20ac' in position 1: character maps to <undefined>

Oh well. (The traceback I posted was from Mac OS X 10.5. I didn't dig
closely enough to see that a different encoding was being used on
Windows, my bad. Thanks for the tip about ascii().)
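
To see which codecs can represent the euro sign at all, something like the following sketch may help. The helper name can_encode and the list of codecs are illustrative choices, not part of the report; the 'backslashreplace' error handler is a standard one from the codecs machinery.

```python
# Illustrative only: check which codecs can encode the euro sign.
euro = '\u20ac'

def can_encode(text, encoding):
    """Return True if text can be encoded with the named codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

# ascii and cp850 have no euro sign; cp1252 and utf-8 do.
for name in ('ascii', 'cp850', 'cp1252', 'utf-8'):
    print(name, can_encode(euro, name))

# With the 'backslashreplace' error handler, unencodable characters are
# written as escape sequences instead of raising UnicodeEncodeError.
fallback = euro.encode('cp850', errors='backslashreplace')
print(fallback)  # b'\\u20ac'
```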
msg77366 - Author: Michael Foord (michael.foord) (Python committer) Date: 2008-12-08 23:40
OK - and as a further follow-up, the traceback I posted for Mac OS X was
from a terminal called iTerm. I tried the same thing on the standard
'Terminal' app and it prints the Euro symbol fine. Looks like iTerm is
an ASCII terminal.

Sorry for the noise, I think this can be closed.
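
Since the outcome depends on the encoding Python picks for the terminal, a small helper can choose between repr() and ascii() per stream. This is only a sketch: the names stream_can_encode and safe_repr are hypothetical, and it assumes the stream exposes an encoding attribute the way sys.stdout normally does.

```python
import io
import sys

def stream_can_encode(text, stream=None):
    """Return True if text is encodable in the stream's reported encoding."""
    stream = sys.stdout if stream is None else stream
    # Assumption: fall back to ASCII when the stream reports no encoding.
    encoding = getattr(stream, 'encoding', None) or 'ascii'
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True

def safe_repr(obj, stream=None):
    """Use repr() when the stream can show it, otherwise ascii()."""
    text = repr(obj)
    return text if stream_can_encode(text, stream) else ascii(obj)

# In-memory streams standing in for an ASCII terminal and a UTF-8 terminal.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding='ascii')
utf8_stream = io.TextIOWrapper(io.BytesIO(), encoding='utf-8')

print(safe_repr('\u20ac', ascii_stream))  # prints '\u20ac'
```

Printing sys.stdout.encoding in each terminal is a quick way to confirm which encoding Python detected there.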
msg77370 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2008-12-08 23:46
> Hmmm... nope - my terminal encoding (according to Python) on Windows
> Vista x64 (with 32-bit Python and a vanilla cmd) is cp850.

Ah, ok. That doesn't have the euro sign, either. Try it in IDLE, though.
History
Date                 User           Action  Args
2022-04-11 14:56:42  admin          set     github: 48849
2008-12-08 23:46:54  loewis         set     status: open -> closed; resolution: works for me
2008-12-08 23:46:32  loewis         set     messages: + msg77370
2008-12-08 23:40:52  michael.foord  set     messages: + msg77366
2008-12-08 23:38:02  michael.foord  set     messages: + msg77365
2008-12-08 23:22:12  loewis         set     nosy: + loewis; messages: + msg77362
2008-12-08 21:08:25  michael.foord  create