Author terry.reedy
Recipients JBernardo, Ramchandra Apte, Rosuav, William.Schwartz, asvetlov, ezio.melotti, ned.deily, python-dev, roger.serwy, serhiy.storchaka, terry.reedy
Date 2013-08-06.21:53:58
Message-id <1375826038.87.0.733196014777.issue13153@psf.upfronthosting.co.za>
In-reply-to
Content
Byte 0, not byte 1, is the start byte, and it should be F0, as in the output below. However, I now see "invalid continuation byte".
In 2.7.5, the following script gives the expected four bytes:
# -*- coding: utf-8 -*-
s = b'𐒢'  # output is the same if either of the following lines is uncommented instead
#s = u'𐒢'.encode('utf-8')  # '𐒢' pasted in from 1st post
#s = u'\U000104a2'.encode('utf-8')  
print(len(s))
for c in s: print(ord(c), hex(ord(c)))
>>> 
4
(240, '0xf0')
(144, '0x90')
(146, '0x92')
(162, '0xa2')
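
For comparison, a minimal Python 3 sketch of the same check. It assumes only the escape form of the character (U+104A2), so it does not depend on the source file encoding or on pasting. Iterating over a bytes object in 3.x yields integers, so no ord() call is needed.

```python
# Python 3 sketch: UTF-8 encoding of U+104A2, built from the escape
# so that pasting and source-encoding issues cannot interfere.
s = '\U000104a2'.encode('utf-8')
print(len(s))          # 4
for b in s:            # bytes iterate as ints in Python 3
    print(b, hex(b))   # 240 0xf0, 144 0x90, 146 0x92, 162 0xa2
```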

I have no idea how the second pasted byte becomes ED in 3.x.
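
One way ED bytes can arise for this character, offered only as a possibility and not confirmed as the cause here: if the character is split into a UTF-16 surrogate pair and each surrogate is then encoded separately as if it were a code point, each half starts with ED. The sketch below illustrates that arithmetic in Python 3; the 'surrogatepass' error handler is used only to make the encoder accept lone surrogates for the demonstration.

```python
# Hypothetical illustration: encode U+104A2 as two separately
# UTF-8-encoded surrogates instead of one 4-byte sequence.
cp = 0x104A2
offset = cp - 0x10000
high = 0xD800 + (offset >> 10)     # high (lead) surrogate
low = 0xDC00 + (offset & 0x3FF)    # low (trail) surrogate
pair = chr(high) + chr(low)
# 'surrogatepass' lets the UTF-8 codec encode lone surrogates.
encoded = pair.encode('utf-8', 'surrogatepass')
print(hex(high), hex(low))         # 0xd801 0xdca2
print(encoded.hex())               # eda081edb2a2
```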

Attempting to open the file in 3.x results in a broken* 'Untitled' edit window and the following error message in the console.
_tkinter.TclError: character U+104a2 is above the range (U+0000-U+FFFF) allowed by Tcl

* Attempting to close the window either immediately or after entering text results in
AttributeError: 'PyShellEditorWindow' object has no attribute 'extensions'
I have to close the initial Python process to get rid of the window.
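
The Tcl limit quoted above can be checked before text is handed to a widget. The following is a rough sketch, assuming Python 3.3 or later (where ord() returns the full code point for such characters); the helper name non_bmp_chars is made up for illustration and is not part of IDLE or tkinter.

```python
def non_bmp_chars(text):
    """Return (index, hex code point) pairs for characters above U+FFFF.

    Hypothetical helper: these are the characters that the Tcl error
    message above says are out of range.
    """
    return [(i, hex(ord(ch))) for i, ch in enumerate(text)
            if ord(ch) > 0xFFFF]

print(non_bmp_chars('s = \U000104a2'))   # [(4, '0x104a2')]
```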