IDLE environment corrupts string.letters #47490

Closed
rupole mannequin opened this issue Jun 30, 2008 · 6 comments

@rupole
Mannequin

rupole mannequin commented Jun 30, 2008

BPO 3240
Nosy @loewis, @birkenfeld

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = <Date 2008-07-01.20:37:44.056>
created_at = <Date 2008-06-30.00:10:18.840>
labels = ['expert-IDLE']
title = 'IDLE environment corrupts string.letters'
updated_at = <Date 2008-07-01.20:37:44.055>
user = 'https://bugs.python.org/rupole'

bugs.python.org fields:

activity = <Date 2008-07-01.20:37:44.055>
actor = 'loewis'
assignee = 'none'
closed = True
closed_date = <Date 2008-07-01.20:37:44.056>
closer = 'loewis'
components = ['IDLE']
creation = <Date 2008-06-30.00:10:18.840>
creator = 'rupole'
dependencies = []
files = []
hgrepos = []
issue_num = 3240
keywords = []
message_count = 6.0
messages = ['68977', '68984', '68999', '69063', '69066', '69071']
nosy_count = 3.0
nosy_names = ['loewis', 'georg.brandl', 'rupole']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue3240'
versions = ['Python 2.5', 'Python 2.4', 'Python 2.3']

@rupole
Mannequin Author

rupole mannequin commented Jun 30, 2008

The problem seems to stem from this line in IOBinding.py:
locale.setlocale(locale.LC_CTYPE, "")

From the command prompt:
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import string, locale
>>> print repr(string.letters)
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> locale.setlocale(locale.LC_CTYPE, "")
'English_United States.1252'
>>> print repr(string.letters)
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
>>>

@rupole rupole mannequin added the topic-IDLE label Jun 30, 2008
@loewis
Mannequin

loewis mannequin commented Jun 30, 2008

Why do you think string.letters gets corrupted? AFAICT, it's still correct.

@birkenfeld
Member

Changing the locale changes string.letters -- that is expected behavior.

@rupole
Mannequin Author

rupole mannequin commented Jul 1, 2008

It introduces high characters that cause comparisons to fail under IDLE
that succeed from the normal python prompt:

Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import string
>>> u'a' in string.letters
True


IDLE 1.2.2      
>>> import string
>>> u'a' in string.letters

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    u'a' in string.letters
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 52: ordinal not in range(128)

Or am I misunderstanding how the locale works with string comparisons?

@birkenfeld
Member

Well, that wouldn't be different if you had set the locale in your
prompt. In short, u'a' in string.letters can never work with any
string.letters except the default, English-only one, and therefore is wrong.
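For readers following along, the failure can be reproduced outside IDLE. The sketch below is written for Python 3, where string.letters no longer exists, so the locale-extended byte string is built by hand and the implicit ASCII decoding that Python 2 performs when a unicode object meets a byte string is spelled out explicitly. The names locale_letters and contains_like_python2 are hypothetical, chosen only for illustration.

```python
import string

# Hand-built stand-in for the locale-extended string.letters from the report:
# 52 ASCII letters followed by a few of the high bytes IDLE's setlocale() call adds.
locale_letters = (
    (string.ascii_uppercase + string.ascii_lowercase).encode("ascii")
    + b"\x83\x8a\x8c"
)


def contains_like_python2(needle, haystack):
    """Mimic Python 2's `u'a' in byte_str`: decode the bytes as ASCII, then search."""
    return needle in haystack.decode("ascii")


failure = None
try:
    contains_like_python2("a", locale_letters)
except UnicodeDecodeError as exc:
    # Record which codec failed, where, and on which byte value.
    failure = (exc.encoding, exc.start, locale_letters[exc.start])

print(failure)
```

The decode step fails on the first non-ASCII byte before any comparison takes place, which is why the lookup breaks even though 'a' is present in the string.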

@loewis
Mannequin

loewis mannequin commented Jul 1, 2008

As Georg says: you shouldn't be mixing Unicode objects and string
objects. It's perfectly valid for string.letters to contain non-ASCII
bytes, and it's no surprise that this fails for you. string.letters
indeed *does* contain only letters.

In any case, testing for letter-ness by using "in string.letters" is not
a good idea, as it involves a linear search. I recommend using

u"a".isalpha()

instead.
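A minimal sketch of that recommendation, written for Python 3 (where every str is Unicode), might look like the following. The helper is_letter and the sample byte string are hypothetical. The cp1252 codec is an assumption based on the 'English_United States.1252' locale shown in the report. When a byte string has to be searched after all, decoding it explicitly and building a set avoids both the implicit ASCII decode and the linear scan.

```python
def is_letter(ch):
    """Return True if `ch` is a single alphabetic character per Unicode."""
    return len(ch) == 1 and ch.isalpha()


# Hypothetical sample: a few ASCII letters plus high bytes like those in the report.
raw_letters = b"ABCabc\x83\x8a\xe9"

# Decode explicitly (cp1252 assumed from the reported locale), then build a
# set so that membership tests are hash lookups rather than linear searches.
decoded_letters = frozenset(raw_letters.decode("cp1252"))

print(is_letter("a"))
print(is_letter("\u00e9"))
print(is_letter("1"))
print("a" in decoded_letters)
```

Using isalpha() directly is usually enough; the set is only worth building when the allowed characters really must come from a fixed, locale-specific list.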

@loewis loewis mannequin closed this as completed Jul 1, 2008
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022