Inconsistency between string.letters and default encoding. #47752

Closed
ramong mannequin opened this issue Aug 5, 2008 · 2 comments

Comments

@ramong
Mannequin

ramong mannequin commented Aug 5, 2008

BPO 3502
Nosy @loewis

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2008-08-05.05:12:47.282>
created_at = <Date 2008-08-05.00:33:52.439>
labels = ['OS-windows']
title = 'Inconsistency between string.letters and default encoding.'
updated_at = <Date 2008-08-05.05:12:47.251>
user = 'https://bugs.python.org/ramong'

bugs.python.org fields:

activity = <Date 2008-08-05.05:12:47.251>
actor = 'loewis'
assignee = 'none'
closed = True
closed_date = <Date 2008-08-05.05:12:47.282>
closer = 'loewis'
components = ['Windows']
creation = <Date 2008-08-05.00:33:52.439>
creator = 'ramong'
dependencies = []
files = []
hgrepos = []
issue_num = 3502
keywords = []
message_count = 2.0
messages = ['70725', '70730']
nosy_count = 2.0
nosy_names = ['loewis', 'ramong']
pr_nums = []
priority = 'normal'
resolution = 'wont fix'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue3502'
versions = ['Python 2.5']

@ramong
Mannequin Author

ramong mannequin commented Aug 5, 2008

In Python on Windows, under IDLE, string.letters includes extended
(non-ASCII) characters. But the default codec, used for implicit
conversion from str to unicode, is still ascii. This mismatch raises
exceptions in the Python win32 extensions.

>>> string.letters

'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\x83\x8a\x8c\x8e\x9a\x9c\x9e\x9f\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'

But still, unless the user customizes the installation,
sys.getdefaultencoding() returns ascii.

The consequence is that after instantiating a COM object, pywin32 211
raises this exception:

  File "C:\Python25\Lib\site-packages\win32com\client\build.py", line 297, in MakeFuncMethod
    return self.MakeDispatchFuncMethod(entry, name, bMakeClass)
  File "C:\Python25\Lib\site-packages\win32com\client\build.py", line 318, in MakeDispatchFuncMethod
    s = linePrefix + 'def ' + name + '(self' + BuildCallList(fdesc, names, defNamedOptArg, defNamedNotOptArg, defUnnamedArg, defOutArg) + '):'
  File "C:\Python25\Lib\site-packages\win32com\client\build.py", line 604, in BuildCallList
    argName = MakePublicAttributeName(argName)
  File "C:\Python25\Lib\site-packages\win32com\client\build.py", line 542, in MakePublicAttributeName
    return filter( lambda char: char in valid_identifier_chars, className)
  File "C:\Python25\Lib\site-packages\win32com\client\build.py", line 542, in <lambda>
    return filter( lambda char: char in valid_identifier_chars, className)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 52: ordinal not in range(128)

The line that causes this exception is from win32com.client.build.

This fragment is enough to reproduce the bug (from build.py in
win32com/client):

valid_identifier_chars = string.letters + string.digits + "_"
...
return filter( lambda char: char in valid_identifier_chars, className)

Set className to any unicode string and try to print the expression in
the return statement. It raises the UnicodeDecodeError shown above.

It is contradictory that the default codec cannot decode the character
0x83 while string.letters includes it. If one regards this character as
a letter, then it should be decoded successfully.
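
The mismatch described above can be sketched without a Windows locale. This is a minimal illustration, not code from pywin32: LOCALE_LETTERS is an assumed stand-in for the locale-dependent string.letters value (the 52 ASCII letters followed by a few of the extended bytes from the output above), and first_undecodable is a hypothetical helper. The explicit decode() call plays the role of the implicit str-to-unicode conversion that Python 2 performs with the default codec.

```python
import string

# Assumption: stand-in for the locale-dependent string.letters value,
# i.e. the 52 ASCII letters followed by some extended letter bytes.
LOCALE_LETTERS = string.ascii_letters.encode("ascii") + b"\x83\x8a\x8c"


def first_undecodable(data, codec="ascii"):
    """Return (position, byte) of the first byte the codec rejects, or None.

    Hypothetical helper for illustration only.
    """
    try:
        data.decode(codec)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the offending byte in exc.object.
        return exc.start, exc.object[exc.start:exc.start + 1]
    return None


# The ascii codec rejects the first extended byte.
print(first_undecodable(LOCALE_LETTERS))
# A codec that defines these bytes (cp1252 here) decodes them all.
print(first_undecodable(LOCALE_LETTERS, "cp1252"))
```

With the ascii codec the helper reports position 52, the first byte after the ASCII letters, which matches the position in the traceback.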

@ramong ramong mannequin added the OS-windows label Aug 5, 2008
@loewis
Mannequin

loewis mannequin commented Aug 5, 2008

That's a bug in the Win32 extensions. They shouldn't use string.letters,
but string.ascii_letters, in particular when they check for valid
identifier chars.
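
A minimal sketch of that suggestion, assuming only the fragment quoted in the report: the character set is built from string.ascii_letters, so it stays pure ASCII regardless of locale. The name keep_identifier_chars is hypothetical and this is not the change that was actually applied to pywin32.

```python
import string

# ASCII-only, locale-independent set of identifier characters.
VALID_IDENTIFIER_CHARS = string.ascii_letters + string.digits + "_"


def keep_identifier_chars(name):
    """Drop every character that is not an ASCII letter, digit, or underscore.

    Hypothetical helper for illustration; not the actual pywin32 function.
    """
    return "".join(char for char in name if char in VALID_IDENTIFIER_CHARS)


# A unicode name with a non-ASCII letter and a space is filtered cleanly.
print(keep_identifier_chars(u"Caf\xe9 Name_1"))
```

Because every character in the set is ASCII, comparing it against a unicode name never needs to decode a byte outside range(128).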

Closing this report as "won't fix".

@loewis loewis mannequin closed this as completed Aug 5, 2008
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022