Message 118843 - Python tracker

Author	terry.reedy
Recipients	benjamin.peterson, terry.reedy, vstinner
Date	2010-10-15.22:58:41
SpamBayes Score	2.3500257e-11
Marked as misclassified	No
Message-id	<1287183522.61.0.653674968096.issue10114@psf.upfronthosting.co.za>
In-reply-to

Content
Pardon my ignorance, but given that code.co_filename is a string attribute that is set from a string, which is to say Unicode in 3.x, I do not see what the filesystem encoding, or any other encoding to bytes, should have to do with the attribute. I would have expected compile to take your example argument 'abc\uDC80' and attach it to the code object unchanged. The only issue, to me, is whether any string should be allowed or only strings that are legal Unicode. Anything else would seem like a 2.x holdover.
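
To make that expectation concrete, here is a minimal sketch. It assumes a current CPython 3 release, where (as far as I know) compile() accepts a str filename and stores it on the code object as-is; the behaviour of the patch under discussion may have differed. The variable names are only illustrative.

```python
# Sketch: pass a filename containing a lone surrogate straight to compile().
# Assumption: a current CPython 3 interpreter; names here are illustrative.
filename = 'abc\udc80'
code = compile('pass', filename, 'exec')

# ascii() is used because printing a lone surrogate directly can raise
# UnicodeEncodeError on many terminals.
print(ascii(code.co_filename))
```

No encoding to bytes is visible at the Python level here: the attribute compares equal to the string that was passed in.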

If PyBytes_AS_STRING (macro version of PyBytes_AsString) is the implementation of str(bytes_object) (as I would guess from the doc), then as I read your patch, it will produce rather strange 'filenames'.
>>> str('abc\uDC80'.encode("utf-8", "surrogateescape"))
"b'abc\\x80'"
always wrapped in b'...'.

If not that, what does it do (with no decoding specified)?
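
For comparison, a small sketch of the two Python-level operations in question: str() on a bytes object, which gives the repr-style text shown above, versus an explicit decode with the surrogateescape handler, which recovers the original string. This only illustrates Python-level behaviour; it makes no claim about what the C macro in the patch does.

```python
# Encode a string with a lone surrogate using the surrogateescape handler.
raw = 'abc\udc80'.encode('utf-8', 'surrogateescape')

# str() with no encoding argument gives the repr-style text, b'...' included.
as_repr = str(raw)

# Decoding with the same handler round-trips back to the original string.
round_trip = raw.decode('utf-8', 'surrogateescape')

print(raw)                 # the raw bytes
print(as_repr)             # the "wrapped" text form
print(ascii(round_trip))   # ascii() avoids terminal encoding errors
```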

History
Date	User	Action	Args
2010-10-15 22:58:42	terry.reedy	set	recipients: + terry.reedy, vstinner, benjamin.peterson
2010-10-15 22:58:42	terry.reedy	set	messageid: <1287183522.61.0.653674968096.issue10114@psf.upfronthosting.co.za>
2010-10-15 22:58:41	terry.reedy	link	issue10114 messages
2010-10-15 22:58:41	terry.reedy	create