Message291317
In Windows IDLE 3.x, you should still be able to print a surrogate transcoding, which sneaks the native UTF-16LE encoding around tkinter:
def transurrogate(s):
    # Re-decode the UTF-16LE bytes two at a time so that each non-BMP
    # character becomes a pair of lone surrogate code points.
    b = s.encode('utf-16le')
    return ''.join(b[i:i+2].decode('utf-16le', 'surrogatepass')
                   for i in range(0, len(b), 2))

def print_surrogate(*args, **kwds):
    new_args = []
    for arg in args:
        if isinstance(arg, str):
            new_args.append(transurrogate(arg))
        else:
            new_args.append(arg)
    return print(*new_args, **kwds)
>>> s = '\U0001f52b \U0001f52a'
>>> print_surrogate(s)
🔫 🔪
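To make the effect of the transcoding concrete, here is a minimal sketch that inspects the result outside of IDLE. It repeats transurrogate from above so that it runs on its own; the variable names (gun, pair, restored) are only illustrative.

```python
def transurrogate(s):
    # Same helper as above: split each UTF-16 code unit into its own
    # code point, leaving surrogates unpaired.
    b = s.encode('utf-16le')
    return ''.join(b[i:i+2].decode('utf-16le', 'surrogatepass')
                   for i in range(0, len(b), 2))

gun = '\U0001f52b'            # one non-BMP code point
pair = transurrogate(gun)     # two surrogate code points

print(len(gun), len(pair))
print([hex(ord(c)) for c in pair])

# Encoding the pair back to UTF-16LE with 'surrogatepass' and decoding
# normally recombines the surrogates into the original character.
restored = pair.encode('utf-16le', 'surrogatepass').decode('utf-16le')
print(restored == gun)
```

The transcoded string has the same UTF-16 code units as the original, which is presumably why it can be handed to Tk without the non-BMP character getting in the way.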
Pasting non-BMP text into IDLE fails on Windows for a similar reason. Tk naively encodes the surrogate codes in the native Windows UTF-16 text as invalid UTF-8, which I've seen referred to as WTF-8 (Wobbly). I see the following error when I run IDLE using python.exe (i.e. with a console) and paste "🔫 🔪" into the window:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 1: invalid continuation byte
This is the second byte of the WTF-8 encoded text, i.e. the lead byte of the first surrogate, right after the opening quote:
>>> transurrogate('"\U0001f52b').encode('utf-8', 'surrogatepass')
b'"\xed\xa0\xbd\xed\xb4\xab'
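As a rough sketch of what is going wrong, the bytes above can be decoded by hand in plain Python. The strict 'utf-8' codec rejects them, while 'surrogatepass' accepts them and a round trip through UTF-16LE rejoins the surrogate pair. The helper name decode_wtf8 is hypothetical and is not something Tk or IDLE provides.

```python
def decode_wtf8(data):
    # Hypothetical helper: accept UTF-8 that contains encoded surrogates,
    # then recombine surrogate pairs via a UTF-16LE round trip.
    text = data.decode('utf-8', 'surrogatepass')
    return text.encode('utf-16le', 'surrogatepass').decode('utf-16le')

wtf8 = b'"\xed\xa0\xbd\xed\xb4\xab'

try:
    wtf8.decode('utf-8')
    strict_error = None
except UnicodeDecodeError as e:
    # Strict decoding fails on the encoded surrogate.
    strict_error = e
    print(e)

fixed = decode_wtf8(wtf8)
print(fixed == '"\U0001f52b')
```

This only handles well-formed surrogate pairs. A lone surrogate in the input would presumably make the final decode step raise, so a real fix would need to decide how to treat that case.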
Hackiness aside, I don't think it's worth supporting this just for Windows.
Date | User | Action | Args
2017-04-08 04:23:47 | eryksun | set | recipients: + eryksun, terry.reedy, David E. Franco G.
2017-04-08 04:23:47 | eryksun | set | messageid: <1491625427.57.0.233862121021.issue30019@psf.upfronthosting.co.za>
2017-04-08 04:23:47 | eryksun | link | issue30019 messages
2017-04-08 04:23:47 | eryksun | create |