Message111574
> Victor, this looks like your cup of tea.
Unicode is my cup of tea, but not programs that treat bytes as characters.
<a byte string>.isalpha() doesn't mean anything to me :-)
This issue is more a question about the C library than about Python :-) So try the attached program "isalpha.c" if you would like to test your libc.
Results on my Linux box (Debian Sid, eglibc 2.11.2):
----------------
$ ./isalpha C
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz (52)
$ ./isalpha fr_FR.UTF-8
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz (52)
$ ./isalpha fr_FR.iso88591
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff (117)
$ ./isalpha fr_FR.iso885915@euro
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xa6\xa8\xaa\xb4\xb5\xb8\xba\xbc\xbd\xbe\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff (124)
----------------
If your libc considers that \xff is a valid UTF-8 character (the byte 0xFF never appears in well-formed UTF-8), you should change your OS for a better one :-)
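For those who would rather not compile anything, the same kind of check can be sketched from Python. This is not the attached isalpha.c, only a rough equivalent under some assumptions: a POSIX system where ctypes.CDLL(None) exposes the libc symbols, Python 3, and a locale name that is actually installed (otherwise locale.setlocale() raises locale.Error). The function name libc_alpha_bytes is made up for this example.

```python
import ctypes
import locale


def libc_alpha_bytes(locale_name):
    """Return the bytes 0..255 that the C library's isalpha() accepts."""
    # Assumption: on POSIX, CDLL(None) gives access to the libc already loaded.
    libc = ctypes.CDLL(None)
    libc.isalpha.argtypes = [ctypes.c_int]
    libc.isalpha.restype = ctypes.c_int
    # The one-argument form of setlocale() only queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
        return bytes(c for c in range(256) if libc.isalpha(c))
    finally:
        # Put LC_CTYPE back so the caller's locale is left untouched.
        locale.setlocale(locale.LC_CTYPE, saved)


found = libc_alpha_bytes("C")
print("%s (%d)" % (found.decode("latin-1"), len(found)))
```

Passing 'fr_FR.UTF-8' or 'fr_FR.iso88591' instead of 'C' should let you compare your libc with the results above, provided those locales exist on your system.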
--
> >>> len(letters)
> 117
> ...
> >>> locale.setlocale(locale.LC_CTYPE)
> 'en_US.UTF-8'
117 is exactly the count I get for fr_FR.iso88591, so it looks like Mac OS X uses ISO-8859-1 instead of UTF-8.
--
string.letters is built using strop.lowercase + strop.uppercase, which are built using the C functions islower() and isupper(). locale.setlocale() regenerates strop/string.lowercase, strop/string.uppercase and string.letters for the LC_CTYPE and LC_ALL categories.
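To make the mechanism concrete, here is a rough Python 3 imitation of that regeneration step. It is not the strop source, only a sketch under assumptions: a POSIX system where ctypes.CDLL(None) exposes libc, and the helper names classify and rebuild_letters are invented for illustration.

```python
import ctypes
import locale

# Assumption: on POSIX, CDLL(None) gives access to the libc already loaded.
libc = ctypes.CDLL(None)


def classify(predicate_name):
    """Collect the bytes 0..255 accepted by one libc <ctype.h> predicate."""
    predicate = getattr(libc, predicate_name)
    predicate.argtypes = [ctypes.c_int]
    predicate.restype = ctypes.c_int
    return bytes(c for c in range(256) if predicate(c))


def rebuild_letters(locale_name):
    """Hypothetical helper: set LC_CTYPE, then regenerate the three tables."""
    # Like the behaviour described above, this changes the process locale
    # and does not restore it afterwards.
    locale.setlocale(locale.LC_CTYPE, locale_name)
    lowercase = classify("islower")
    uppercase = classify("isupper")
    return lowercase, uppercase, lowercase + uppercase


lowercase, uppercase, letters = rebuild_letters("C")
print(len(lowercase), len(uppercase), len(letters))
```

The point is that the tables are only as good as the libc classification for the active locale: Python adds nothing on top of islower() and isupper().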
--
You don't need to run IDLE or import Tkinter to set the locale:
import locale; locale.setlocale(locale.LC_ALL, '')
is enough.
--
A library should not change the locale (only the application should).
$ python2.6
>>> import locale
>>> locale.getlocale()
(None, None)
>>> import Tkinter
>>> locale.getlocale()
('fr_FR', 'UTF8')
=> Tkinter is a horrible library! (The bug is in the C library, not in the Python wrapper)
Use a better one like Gtk or Qt ;-)
$ python
>>> import locale
>>> import pygtk
>>> locale.getlocale()
(None, None)
>>> import PyQt4
>>> locale.getlocale()
(None, None)
(IDLE is based on Tkinter)
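The check done by hand above can be wrapped in a small helper. This is only a sketch for Python 3 (where the module is called tkinter, not Tkinter); the function name locale_around_import is made up. It compares the raw LC_CTYPE setting before and after the import instead of using locale.getlocale().

```python
import importlib
import locale


def locale_around_import(module_name):
    """Return the (before, after) LC_CTYPE settings around importing a module."""
    # The one-argument form of setlocale() only queries the current setting.
    before = locale.setlocale(locale.LC_CTYPE)
    importlib.import_module(module_name)
    after = locale.setlocale(locale.LC_CTYPE)
    return before, after


before, after = locale_around_import("json")
print("locale changed by import:", before != after)
```

Run it in a fresh interpreter: if the module has already been imported, any side effect on the locale has already happened and the two values will be identical.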
--
I don't understand why Alexander gets different results on Python 2.6 and Python 2.7.
@belopolsky: Are both programs linked to (built with?) the same C library? (same library version)
Date | User | Action | Args
2010-07-25 23:27:21 | vstinner | set | recipients: + vstinner, loewis, ronaldoussoren, mark.dickinson, belopolsky, eric.smith, jkloth, eric.araujo, antlong
2010-07-25 23:27:21 | vstinner | set | messageid: <1280100441.33.0.101426017348.issue9335@psf.upfronthosting.co.za>
2010-07-25 23:27:20 | vstinner | link | issue9335 messages
2010-07-25 23:27:18 | vstinner | create |