Unicode problem in Tkinter under Windows 2000
Characters outside the ASCII range (code > 127) entered
via the keyboard corrupt the internal Unicode encoding of
the text widget, leading to Unicode errors.
The following example should reproduce the bug:
>>> import Tkinter
>>> t=Tkinter.Text()
>>> t.pack()
>>> t.insert("1.0",u'\xe2\xee\xfb')
Now set the focus to the text widget and type an
a-umlaut (u'\xe4') on the keyboard (alternatively, hold
ALT and type 0228 on the numeric keypad to simulate
this).
Then test the result:
>>> t.get("1.0","end")
u'\xe2\xee\xfb\xe4\n'
This is what you get under Linux (I was told) and what
it should be.
However, under Windows 2000 I get:
'\xc3\xa2\xc3\xae\xc3\xbb\xe4\n'
which is a mixture of UTF-8 and cp1252(?), leading to
a Unicode error if I try, e.g., to save it to a file.
(Every character with an 8-bit value > 127 (e.g. in
latin-1 or cp1252) entered via the keyboard into a text
widget causes this behaviour, not just the a-umlaut.)
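The mixture can be checked without Tkinter. The following is a minimal sketch, not taken from Tkinter itself; the names `expected`, `observed`, `mixed` and `decodes_as_utf8` are made up for illustration, and it assumes the stray byte really is cp1252:

```python
# Values copied from the report above.
expected = u'\xe2\xee\xfb\xe4\n'                 # what t.get() should return
observed = b'\xc3\xa2\xc3\xae\xc3\xbb\xe4\n'     # what Windows 2000 returns

# Assumption: the inserted text was stored as UTF-8, while the
# keyboard-entered character arrived as a raw cp1252 byte.
utf8_part = expected[:3].encode('utf-8')
cp1252_part = expected[3:].encode('cp1252')
mixed = utf8_part + cp1252_part

def decodes_as_utf8(data):
    """Return True if the byte string is valid UTF-8."""
    try:
        data.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False

print(mixed == observed)
print(decodes_as_utf8(utf8_part))
print(decodes_as_utf8(observed))
```

If the assumption holds, the first comparison is true and only the last check fails, which would be consistent with the Unicode error seen when saving the text.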
A simple workaround (not thoroughly tested) could look
like this:
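def badkey(event):
    # Bound directly via txt.bind(), so the callback
    # receives only the event (no `self`).
    try:
        if ord(event.char) > 127:
            # Decode the raw cp1252 byte before inserting.
            txt.insert("insert", unicode(event.char, "cp1252"))
            return "break"
    except (TypeError, UnicodeError):
        # event.char is empty, longer than one character
        # or not decodable; let Tk handle the key.
        pass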
`txt` is the text widget instance, whose key-press
event is bound to this callback:
import sys
if sys.platform == "win32":
    txt.bind("<KeyPress>", badkey)
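Text that has already been corrupted might also be repaired after the fact. This is a rough, untested-against-Tkinter sketch: it decodes the byte string as UTF-8 and falls back to cp1252 for any byte that is not valid UTF-8. The handler name `cp1252_fallback` and the function `repair_mixed` are hypothetical, not part of Tkinter or the standard library; only `codecs.register_error` is a real standard-library call.

```python
import codecs

def _cp1252_fallback(exc):
    # Hypothetical error handler: decode the offending bytes
    # as cp1252 and resume right after them.
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return bad.decode("cp1252", "replace"), exc.end

codecs.register_error("cp1252_fallback", _cp1252_fallback)

def repair_mixed(data):
    """Decode UTF-8 bytes, treating undecodable bytes as cp1252."""
    return data.decode("utf-8", "cp1252_fallback")

# The byte string reported under Windows 2000.
broken = b'\xc3\xa2\xc3\xae\xc3\xbb\xe4\n'
fixed = repair_mixed(broken)
print(repr(fixed))
```

This is only a heuristic: a sequence of keyboard-entered cp1252 bytes that happens to form valid UTF-8 would be decoded as UTF-8, so intercepting the key press as above is probably the safer route.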