classification
Title: 'ascii' should be range(256) or just set another default encoding
Type: behavior Stage:
Components: Interpreter Core Versions: Python 2.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: electronixtar, pitrou
Priority: normal Keywords:

Created on 2008-08-22 14:14 by electronixtar, last changed 2008-08-22 15:35 by pitrou. This issue is now closed.

Messages (2)
msg71747 - Author: (electronixtar) Date: 2008-08-22 14:14
One of the MOST common situations for Python developers working with
CJK/wide-character text is this error:

'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)


Why can't Python just define ascii as range(256), or alternatively, 
just use another default encoding that handles 0x00 to 0xff? Sometimes 
encoding problems in Python are driving me mad.

Currently I am using mbcs, but it's only available on Windows.
msg71753 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-22 15:35
> Sometimes 
> encoding problems in Python are driving me mad.

The thing is, they are not "encoding problems in Python", they are
encoding problems in the outside world. Python cannot know magically
which encoding is used in third-party data, so you have to tell it
yourself what the encoding is.
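
To illustrate the point about third-party data, here is a minimal sketch showing that the same bytes give different text depending on the codec. The byte string is a made-up example that starts with the 0xe5 byte from the error message; the report does not say what the real data was. It uses current Python syntax (the b'' prefix is not available on 2.5).

```python
# Three bytes as they might arrive from a file or a socket.
# Hypothetical sample data: the report only mentions the first byte, 0xe5.
raw = b'\xe5\xad\x97'

# Both decodes succeed, but they give different text.
as_utf8 = raw.decode('utf-8')      # one CJK character, U+5B57
as_latin1 = raw.decode('latin-1')  # three unrelated Latin-1 characters

print(len(as_utf8), len(as_latin1))
```

A codec that accepts every byte from 0x00 to 0xff, such as latin-1, never raises, but it silently produces the wrong characters whenever the data was really in some other encoding.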

The default is "ascii" because only ASCII chars (from 0 to 127) can be
interpreted properly in most situations without having any knowledge of
the encoding (barring obsolete stuff such as EBCDIC, that is).

The only solution is to know what encoding you are expecting and
decode/encode it yourself. Python can't decide it for you.
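
A rough sketch of what "decode/encode it yourself" can look like, assuming the incoming data is known to be UTF-8 (an assumption made for this example, not something stated in the report). The helper name decode_text is hypothetical, and the syntax is for current Python rather than 2.5.

```python
def decode_text(raw, encoding):
    """Decode bytes with an encoding the caller knows from context."""
    return raw.decode(encoding)


# Hypothetical sample bytes; the first byte matches the reported error.
raw = b'\xe5\xad\x97'

# Decoding as ASCII fails because 0xe5 is outside range(128).
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# Decode once at the input boundary, work with text inside the program,
# and encode again only when writing the data back out.
text = decode_text(raw, 'utf-8')
round_trip = text.encode('utf-8')
```

The encoding is passed in explicitly rather than guessed, so a wrong assumption shows up as an error at the boundary instead of as corrupted text later on.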
History
Date                 User           Action  Args
2008-08-22 15:35:18  pitrou         set     status: open -> closed
                                            resolution: not a bug
                                            messages: + msg71753
                                            nosy: + pitrou
2008-08-22 14:14:09  electronixtar  create