Message 115914 - Python tracker



Author	vstinner
Recipients	pitrou, vstinner
Date	2010-09-08.22:50:56
Message-id	<1283986258.88.0.514374492661.issue9804@psf.upfronthosting.co.za>
In-reply-to

Content

For Unicode strings, ascii(x) is implemented as repr(x).encode('ascii', 'backslashreplace').decode('ascii').
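A minimal sketch to check this equivalence, assuming Python 3.3 or later (where there is no narrow/wide build distinction). The helper name ascii_via_repr is made up for illustration; it is not part of CPython.

```python
def ascii_via_repr(x):
    # repr() first, then escape whatever is still non-ASCII.
    return repr(x).encode('ascii', 'backslashreplace').decode('ascii')

# ASCII only, Latin-1, BMP, and non-BMP samples.
samples = ['abc', 'caf\xe9', '\u20ac', '\U0001d121']
for s in samples:
    print(ascii(s), ascii_via_repr(s) == ascii(s))
```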

repr(x) is "'" + x + "'" for printable characters (e.g. U+1D121), and "'\\U%08x'" % ord(x) for non-printable characters (e.g. U+12FFF).
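As a rough illustration, the rule can be modelled for a single non-BMP character as below. repr_model is a hypothetical helper, and it assumes str.isprintable() reflects the same printable test that repr() uses. Whether a given code point counts as printable depends on the Unicode database shipped with the interpreter, so the sketch asks isprintable() instead of hard-coding the answer.

```python
def repr_model(ch):
    # Simplified model, only meant for one non-BMP character
    # (quotes, backslashes and BMP escapes are not handled here).
    if ch.isprintable():
        return "'" + ch + "'"
    return "'\\U%08x'" % ord(ch)

for ch in ('\U0001d121', '\U00012fff'):
    print('U+%04X' % ord(ch), ch.isprintable(), repr_model(ch) == repr(ch))
```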

As for the unexpected output: the problem is that the ascii codec with the backslashreplace error handler encodes printable non-BMP characters as b'\\uXXXX\\uXXXX' (one escape per surrogate) in narrow builds.
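Narrow builds no longer exist in Python 3.3+, so the output can only be simulated there. The sketch below computes the UTF-16 surrogate pair by hand and formats it the way a narrow build would have escaped it; surrogate_pair and narrow_style_escape are illustrative names, not CPython functions.

```python
def surrogate_pair(code_point):
    # Standard UTF-16 split of a code point above U+FFFF.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return high, low

def narrow_style_escape(ch):
    # Two \uXXXX escapes, one per UTF-16 code unit.
    return '\\u%04x\\u%04x' % surrogate_pair(ord(ch))

ch = '\U0001d121'
print(narrow_style_escape(ch))                 # \ud834\udd21
print(ch.encode('ascii', 'backslashreplace'))  # b'\\U0001d121'
```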

I don't see a simple solution to encode non-BMP characters as b'\\UXXXXXXXX', because the principle of an error handler is that it escapes non-encodable characters one by one.
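To make the "one by one" behaviour concrete, here is a sketch of a custom error handler registered with codecs.register_error. The handler name demo_escape_one_by_one is invented for this example, and the two lone surrogates only stand in for what a narrow build would have stored for U+1D121; this is not the actual backslashreplace implementation.

```python
import codecs

def escape_one_by_one(exc):
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # The codec reports a range of unencodable characters; each one is
    # escaped on its own, with no knowledge of its neighbours.
    bad = exc.object[exc.start:exc.end]
    replacement = ''.join('\\u%04x' % ord(c) for c in bad)
    return replacement, exc.end

codecs.register_error('demo_escape_one_by_one', escape_one_by_one)

# Two separate code units, as a narrow build would see U+1D121.
narrow_like = '\ud834\udd21'
print(narrow_like.encode('ascii', 'demo_escape_one_by_one'))
```

Since each code unit is formatted independently, the two halves of the pair come out as two \uXXXX escapes rather than a single \UXXXXXXXX.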

History
Date	User	Action	Args
2010-09-08 22:50:59	vstinner	set	recipients: + vstinner, pitrou
2010-09-08 22:50:58	vstinner	set	messageid: <1283986258.88.0.514374492661.issue9804@psf.upfronthosting.co.za>
2010-09-08 22:50:56	vstinner	link	issue9804 messages
2010-09-08 22:50:56	vstinner	create