Author serhiy.storchaka
Recipients ezio.melotti, fdrake, pitrou, serhiy.storchaka
Date 2013-09-27.11:05:09
Message-id <1380279910.92.0.349060015388.issue19100@psf.upfronthosting.co.za>
In-reply-to
Content
Currently pprint.pprint() raises UnicodeEncodeError when the output contains characters that the output stream's encoding cannot represent.

$ LANG=en_US.utf8 ./python -c "import pprint; pprint.pprint('\u20ac')"
'€'
$ LANG= ./python -c "import pprint; pprint.pprint('\u20ac')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/serhiy/py/cpython/Lib/pprint.py", line 56, in pprint
    printer.pprint(object)
  File "/home/serhiy/py/cpython/Lib/pprint.py", line 137, in pprint
    self._format(object, self._stream, 0, 0, {}, 0)
  File "/home/serhiy/py/cpython/Lib/pprint.py", line 274, in _format
    write(rep)
UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in position 1: ordinal not in range(128)

This is a regression from Python 2, in which repr() always returns an ASCII string.

$ LANG= python2.7 -c "import pprint; pprint.pprint(u'\u20ac')"
u'\u20ac'

Perhaps pprint() should use the backslashreplace error handler, as sys.displayhook() does. With the proposed patch:

$ LANG= ./python -c "import pprint; pprint.pprint('\u20ac')"
'\u20ac'
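
The idea can be sketched in user code without touching pprint itself. This is only an illustration, not the proposed patch: the helper name pprint_escaped is hypothetical, and it assumes the target stream exposes an encoding attribute. It formats the object with pprint.pformat() and round-trips the text through the stream's encoding with the backslashreplace handler, much like the fallback path of sys.displayhook().

```python
import io
import pprint


def pprint_escaped(obj, stream, **kwargs):
    """Pretty-print obj to stream, escaping characters the stream cannot encode.

    Hypothetical helper for illustration; not part of the pprint module.
    """
    text = pprint.pformat(obj, **kwargs) + "\n"
    encoding = getattr(stream, "encoding", None)
    if encoding:
        # Round-trip through the stream's encoding so that unencodable
        # characters are replaced by backslash escapes such as \u20ac.
        text = text.encode(encoding, "backslashreplace").decode(encoding)
    stream.write(text)


# Simulate an ASCII-only output stream, as with LANG= in the report above.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii", newline="\n")
pprint_escaped("\u20ac", ascii_stream)
ascii_stream.flush()
print(raw.getvalue())  # b"'\\u20ac'\n"
```

Streams without an encoding attribute (for example io.StringIO) are written to unchanged, since they can hold any character.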
History
Date                 User              Action  Args
2013-09-27 11:05:10  serhiy.storchaka  set     recipients: + serhiy.storchaka, fdrake, pitrou, ezio.melotti
2013-09-27 11:05:10  serhiy.storchaka  set     messageid: <1380279910.92.0.349060015388.issue19100@psf.upfronthosting.co.za>
2013-09-27 11:05:10  serhiy.storchaka  link    issue19100 messages
2013-09-27 11:05:10  serhiy.storchaka  create