Classification
Title: IDLE environment corrupts string.letters
Type:
Stage:
Components: IDLE
Versions: Python 2.4, Python 2.3, Python 2.5

Process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: georg.brandl, loewis, rupole
Priority: normal
Keywords:

Created on 2008-06-30 00:10 by rupole, last changed 2008-07-01 20:37 by loewis. This issue is now closed.

Messages (6)
msg68977 - Author: Roger Upole (rupole) Date: 2008-06-30 00:10
The problem seems to stem from this line in IOBinding.py:
locale.setlocale(locale.LC_CTYPE, "")

From the command prompt:
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import string, locale
>>> print repr(string.letters)
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> locale.setlocale(locale.LC_CTYPE, "")
'English_United States.1252'
>>> print repr(string.letters)
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>>
msg68984 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-06-30 01:15
Why do you think string.letters gets corrupted? AFAICT, it's still correct.
msg68999 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-06-30 08:26
Changing the locale changes string.letters -- that is expected behavior.
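For code that needs an alphabet that does not follow the locale, string.ascii_letters is the locale-independent constant. The following is only a minimal sketch of that point, written in Python 3 syntax (where string.letters no longer exists); the names before and after are illustrative and not part of the report.

```python
import locale
import string

# Snapshot the locale-independent constant before touching the locale.
before = string.ascii_letters

try:
    # Same call that the report attributes to IOBinding.py.
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    # The environment may name a locale that is not installed; the
    # comparison below holds either way.
    pass

after = string.ascii_letters

# ascii_letters is a fixed constant, so the locale change cannot alter it.
print(before == after)
```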
msg69063 - Author: Roger Upole (rupole) Date: 2008-07-01 20:06
It introduces high characters that cause comparisons to fail under IDLE 
that succeed from the normal python prompt:

Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> u'a' in string.letters
True


IDLE 1.2.2      
>>> import string
>>> u'a' in string.letters

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    u'a' in string.letters
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 52: ordinal not in range(128)

Or am I misunderstanding how the locale works with string comparisons?
msg69066 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2008-07-01 20:15
Well, that wouldn't be different if you had set the locale in your
prompt. In short, ``u'a' in string.letters`` can never work with any
string.letters except the default, English-only one, and therefore is wrong.
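The failure in the traceback comes from Python 2 implicitly decoding the byte string as ASCII before comparing it with a Unicode object. The sketch below reproduces that decode step explicitly in Python 3 syntax. The letters value is a hypothetical stand-in for the locale-dependent string.letters (the 52 ASCII letters plus the first few high bytes from the report), and decode_error_position is an illustrative helper, not an existing API. Decoding with cp1252 is an assumption based on the 'English_United States.1252' locale shown in the report.

```python
import string

# Hypothetical stand-in for string.letters under a cp1252 locale:
# the 52 ASCII letters followed by a few high bytes from the report.
letters = string.ascii_letters.encode("ascii") + b"\x83\x8a\x8c"


def decode_error_position(data):
    """Return the offset of the first byte ASCII cannot decode, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# The first high byte sits right after the 52 ASCII letters.
position = decode_error_position(letters)

# Decoding with the codec that matches the locale avoids the error, and
# the membership test then compares text with text.
decoded = letters.decode("cp1252")

print(position)
print("a" in decoded)
```

Decoding explicitly with the right codec makes the comparison well defined, but it is still a linear scan over the alphabet.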
msg69071 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-07-01 20:37
As Georg says: you shouldn't be mixing Unicode objects and string
objects. It's perfectly valid for string.letters to contain non-ASCII
bytes, and it's no surprise that this fails for you. string.letters
indeed *does* contain only letters.

In any case, testing for letter-ness by using "in string.letters" is not
a good idea, as it involves a linear search. I recommend using

u"a".isalpha()

instead.
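A minimal sketch of the isalpha() approach, in Python 3 syntax where every str is Unicode. The is_letter helper is a hypothetical name introduced only for illustration. It adds a length check so that multi-character and empty strings are rejected, which a bare isalpha() call would not do for multi-character input.

```python
def is_letter(ch):
    """Return True if ch is a single alphabetic character.

    str.isalpha() consults the Unicode character database, so no
    alphabet string has to be scanned and the locale is not involved.
    """
    return len(ch) == 1 and ch.isalpha()


print(is_letter("a"))        # ASCII letter
print(is_letter("\u00e9"))   # e with acute accent, a non-ASCII letter
print(is_letter("1"))        # digit
print(is_letter(""))         # empty string
```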
History
Date                 User          Action  Args
2008-07-01 20:37:44  loewis        set     status: pending -> closed; messages: + msg69071
2008-07-01 20:15:25  georg.brandl  set     messages: + msg69066
2008-07-01 20:06:56  rupole        set     messages: + msg69063
2008-06-30 08:26:00  georg.brandl  set     status: open -> pending; nosy: + georg.brandl; resolution: wont fix; messages: + msg68999
2008-06-30 01:15:13  loewis        set     nosy: + loewis; messages: + msg68984
2008-06-30 00:10:19  rupole        create