
Author vstinner
Recipients pitrou, vstinner
Date 2010-09-08.22:50:56
Message-id <1283986258.88.0.514374492661.issue9804@psf.upfronthosting.co.za>
In-reply-to
Content
For unicode, ascii(x) is implemented as repr(x).encode('ascii', 'backslashreplace').decode('ascii').
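
The equivalence above can be checked directly. This is a minimal sketch: `ascii_via_repr` and the sample strings are hypothetical names chosen for illustration, not part of CPython.

```python
def ascii_via_repr(x):
    # Take repr(x), then replace every remaining non-ASCII character
    # with a backslash escape.
    return repr(x).encode('ascii', 'backslashreplace').decode('ascii')

# Illustrative samples: pure ASCII, Latin-1, BMP, non-BMP, and an embedded quote.
samples = ['abc', 'caf\xe9', '\u20ac', '\U0001D121', "it's"]
for s in samples:
    print(ascii_via_repr(s), ascii(s))
```

On a build where each non-BMP character is a single code point, both columns should match for every sample.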

repr(x) is "'" + x + "'" for printable characters (e.g. U+1D121), and "'\\U%08x'" % ord(x) for non-printable characters (e.g. U+12FFF).
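
A small sketch of that rule, restricted to a single non-BMP character. `repr_sketch` is a hypothetical helper, and whether a character counts as printable depends on the Unicode database shipped with the interpreter (U+12FFF is assumed to be unassigned).

```python
def repr_sketch(ch):
    # Hypothetical helper: only meant for one non-BMP character.
    if ch.isprintable():
        # Printable: the character is kept as-is between quotes.
        return "'" + ch + "'"
    # Non-printable: an 8-digit, lowercase-hex \U escape.
    return "'\\U%08x'" % ord(ch)

for ch in ('\U0001D121', '\U00012FFF'):
    # ascii() is used here only so the output is safe on any terminal.
    print(ascii(repr_sketch(ch)), ascii(repr(ch)))
```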

About the unexpected output: the problem is that ascii+backslashreplace encodes non-BMP printable characters as b'\\uXXXX\\uXXXX' (one escape per surrogate) in narrow builds.
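
Narrow builds no longer exist in current CPython, so the following sketch only emulates one: `as_utf16_units` is a hypothetical helper that rebuilds a string with one code point per UTF-16 code unit, which is how a narrow build stored non-BMP characters.

```python
def as_utf16_units(s):
    # Hypothetical helper: split s into UTF-16 code units and turn each
    # unit into its own code point, so a non-BMP character becomes a
    # surrogate pair (two separate "characters").
    data = s.encode('utf-16-be', 'surrogatepass')
    return ''.join(chr(int.from_bytes(data[i:i + 2], 'big'))
                   for i in range(0, len(data), 2))

wide = '\U0001D121'            # one code point
narrow = as_utf16_units(wide)  # two surrogates: U+D834, U+DD21

print(wide.encode('ascii', 'backslashreplace'))    # one \U escape
print(narrow.encode('ascii', 'backslashreplace'))  # two \u escapes
```

Each surrogate is escaped separately, which gives the b'\\uXXXX\\uXXXX' form described above.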

I don't see a simple solution to encode non-BMP characters as b'\\UXXXXXXXX', because the principle of an error handler is that it escapes non-encodable characters one by one.
History
Date User Action Args
2010-09-08 22:50:59 vstinner set recipients: + vstinner, pitrou
2010-09-08 22:50:58 vstinner set messageid: <1283986258.88.0.514374492661.issue9804@psf.upfronthosting.co.za>
2010-09-08 22:50:56 vstinner link issue9804 messages
2010-09-08 22:50:56 vstinner create