
Author Tomoki.Imai
Recipients Tomoki.Imai, ezio.melotti, pradyunsg, r.david.murray, terry.reedy
Date 2013-04-21.23:19:42
Message-id <1366586382.36.0.964994952434.issue17348@psf.upfronthosting.co.za>
In-reply-to
Content
Thanks.

I noticed that Terry used Python 3 to confirm this problem...

I am Japanese, but I use an English environment.
I'm on Linux, and here are my locale settings:
konomi:tomoki% locale
LANG=en_US.utf8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE="en_US.utf8"
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=

All strings used internally should be of the unicode type.
In Japan, many character sets are in use (cp932, euc-jp, ...),
and they cause problems in Python 2 unless the data is converted to the unicode type.
Remember that the unicode type and "utf-8" are not the same thing.
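
To make the difference between the unicode type and an encoding such as "utf-8" concrete, here is a minimal sketch. The sample string is arbitrary and nothing here is taken from IDLE; it is written to run on Python 3, where the `u` prefix is accepted but optional.

```python
# -*- coding: utf-8 -*-
# Sketch only: the sample string is arbitrary and not taken from IDLE.
text = u"日本語"  # text (unicode in Python 2, str in Python 3)

# Encoding the same text with different charsets gives different bytes.
encoded = {}
for name in ("utf-8", "cp932", "euc-jp"):
    encoded[name] = text.encode(name)
    print("%-7s %d bytes  %r" % (name, len(encoded[name]), encoded[name]))

# Decoding with the matching charset always gives back the same text.
for name, data in encoded.items():
    assert data.decode(name) == text

# Decoding with the wrong charset fails here; for other inputs it can
# also succeed and silently produce garbage.
try:
    encoded["cp932"].decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
print("cp932 bytes decoded as utf-8 failed:", wrong_codec_failed)
```

The text is one thing, and each charset is only one possible byte representation of it, which is why I think the internal type should be unicode.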

When I type into a Tkinter Entry and read the Entry's value, it returns unicode.
The deleted code converts that unicode to the str type.
The two types are unified in Python 3 (unicode became str, and str became bytes),
so these lines are not in the Python 3 code.
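
As a rough illustration of keeping text as unicode and decoding only at the boundary, here is a small sketch. `to_text` is a hypothetical helper I made up for this message; it is not part of IDLE or Tkinter, and the two sample values only stand in for real input.

```python
# -*- coding: utf-8 -*-
def to_text(value, encoding="utf-8"):
    """Return value as text; decode it only if it is a byte string.

    Hypothetical helper for illustration, not part of IDLE or Tkinter.
    """
    if isinstance(value, bytes):  # bytes is an alias of str in Python 2
        return value.decode(encoding)
    return value


# Stands in for a value that is already text, like the one Entry gave me.
from_widget = u"日本語"
# Stands in for raw bytes read from a cp932-encoded source.
from_file = u"日本語".encode("cp932")

# Text passes through unchanged; bytes are decoded with a known charset.
assert to_text(from_widget) == to_text(from_file, "cp932")
print(type(to_text(from_file, "cp932")).__name__)
```

The point is that text is never converted back to a byte string on the way in, so no guess about the encoding is needed.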

I typed these strings using an input method (I am using uim).
https://code.google.com/p/uim/
However, I don't know how uim generates these characters.
History
Date                 User         Action  Args
2013-04-21 23:19:42  Tomoki.Imai  set     recipients: + Tomoki.Imai, terry.reedy, ezio.melotti, r.david.murray, pradyunsg
2013-04-21 23:19:42  Tomoki.Imai  set     messageid: <1366586382.36.0.964994952434.issue17348@psf.upfronthosting.co.za>
2013-04-21 23:19:42  Tomoki.Imai  link    issue17348 messages
2013-04-21 23:19:42  Tomoki.Imai  create