Message 4489 - Python tracker


Author	nobody
Date	2001-04-23.09:41:05

Content
Unicode problem in Tkinter under Windows 2000
Characters outside the ASCII range (code points > 127)
that are typed on the keyboard corrupt the internal
Unicode encoding of the text widget, leading to
Unicode errors.

The following example should reproduce the bug:

>>> import Tkinter
>>> t=Tkinter.Text()
>>> t.pack()
>>> t.insert("1.0",u'\xe2\xee\xfb')

Now give the text widget the focus and type an
a-umlaut (ä) on the keyboard (alternatively, hold ALT
and type 0228 on the numeric keypad to simulate this).
Then check the result:
>>> t.get("1.0","end")
u'\xe2\xee\xfb\xe4\n'
This is what you get under Linux (I was told), and it
is the correct result.
However, under Windows 2000 I get:
'\xc3\xa2\xc3\xae\xc3\xbb\xe4\n'
which is a byte string mixing UTF-8 and cp1252(?): the
inserted text comes back as UTF-8, the typed character
as a single 8-bit byte. This leads to a Unicode error
if I try, e.g., to save it to a file.
(Every character with an 8-bit value > 127 (e.g. in
Latin-1 or cp1252) that is entered via the keyboard
into a text widget triggers this behaviour, not just
the a-umlaut.)
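
To see why saving fails, the mixture can be reproduced without Tkinter. The following is only a minimal sketch in Python 3 syntax (the report itself uses Python 2). It assumes that the widget returns the inserted text as UTF-8 and the typed character as a raw cp1252 byte, which is what the Windows output above suggests. The names `initial`, `typed`, `mixed` and `is_valid_utf8` are made up for illustration.

```python
# Python 3 sketch; in the Python 2 report, str is a byte string.
initial = "\xe2\xee\xfb"   # text inserted programmatically
typed = "\xe4"             # a-umlaut typed on the keyboard

# Assumption: inserted text comes back as UTF-8, while the typed
# character comes back as a single cp1252 byte.
mixed = initial.encode("utf-8") + typed.encode("cp1252") + b"\n"


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(repr(mixed))
print(is_valid_utf8(mixed))
```

The lone `\xe4` byte starts a multi-byte UTF-8 sequence that is never completed, so neither a UTF-8 decode nor, in Python 2, an implicit ASCII decode of the whole string can succeed.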

A simple workaround (not thoroughly tested) could look 
like this:

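def badkey(event):
    # Intercept non-ASCII key presses and insert them as
    # unicode, decoded from the Windows code page (cp1252).
    try:
        if ord(event.char) > 127:
            txt.insert("insert", unicode(event.char, "cp1252"))
            return "break"  # suppress Tk's default insertion
    except (TypeError, UnicodeError):
        # event.char is empty (e.g. modifier keys) or cannot
        # be decoded; let Tk handle the key as usual
        pass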

`txt` is the text widget instance; the callback is
bound to its key-press event:

import sys

if sys.platform == "win32":
    txt.bind("<KeyPress>", badkey)
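
If a mixed string has already been read from the widget, it could also be repaired afterwards instead of intercepting key presses. This is a different technique from the workaround above, sketched in Python 3 syntax and not tested against Tkinter. It assumes that every byte that is not part of valid UTF-8 is a cp1252 character. The names `repair_mixed` and `cp1252_fallback` are hypothetical.

```python
import codecs


def _cp1252_fallback(err):
    """Error handler: decode bytes that are not valid UTF-8 as cp1252."""
    if isinstance(err, UnicodeDecodeError):
        bad = err.object[err.start:err.end]
        # Bytes undefined in cp1252 become U+FFFD instead of raising.
        return bad.decode("cp1252", errors="replace"), err.end
    raise err


# Register the handler under a made-up name for use with bytes.decode().
codecs.register_error("cp1252_fallback", _cp1252_fallback)


def repair_mixed(raw):
    """Decode a byte string that mixes UTF-8 with stray cp1252 bytes."""
    return raw.decode("utf-8", errors="cp1252_fallback")


print(repr(repair_mixed(b"\xc3\xa2\xc3\xae\xc3\xbb\xe4\n")))
```

This is a heuristic: a run of typed cp1252 characters that happens to form a valid UTF-8 sequence (for example the two bytes `\xc3\xa2`) would be read as a single character, so intercepting the key press remains the more reliable approach.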

History
Date	User	Action	Args
2007-08-23 13:54:09	admin	link	issue418173 messages
2007-08-23 13:54:09	admin	create