Title: UnicodeDecodeError thrown for 'encode' operation on string
Type: behavior Stage: resolved
Components: Unicode Versions: Python 2.7
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Wojtek.Szymilowski, ezio.melotti, vstinner
Priority: normal Keywords:

Created on 2012-11-11 22:51 by Wojtek.Szymilowski, last changed 2012-11-17 16:58 by ezio.melotti. This issue is now closed.

Messages (2)
msg175404 - (view) Author: Wojtek Szymilowski (Wojtek.Szymilowski) Date: 2012-11-11 22:51
A UnicodeDecodeError exception is raised by the encode operation on strings.
This issue does not occur for the same operation on a unicode string (a UnicodeEncodeError exception is correctly raised).

---- string:
>>> 'AB\xff'.encode('ascii')

Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 2: ordinal not in range(128)

---- unicode string:
>>> u'AB\xff'.encode('ascii')

Traceback (most recent call last):
  File "<pyshell#28>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xff' in position 2: ordinal not in range(128)
msg175411 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2012-11-11 23:57
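'AB\xff'.encode('ascii') raises a UnicodeDecodeError because the byte string 'AB\xff' is first implicitly decoded to unicode using the default encoding (sys.getdefaultencoding(), which is 'ascii' in most cases), and only then is the result encoded with the requested codec. The implicit decoding step is what fails on the byte 0xff.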

This is not a Python bug, but it is surprising. You should try Python 3, which does not implicitly convert between bytes and unicode.
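
The two-step behavior described above can be illustrated with a minimal sketch, written for Python 3 so that the implicit step is spelled out. The helper name py2_style_encode and its default_encoding parameter are hypothetical and only mimic what Python 2.7 is described as doing; they are not part of any Python API.

```python
# Python 3 sketch of the implicit decode that Python 2.7 performs
# for 'AB\xff'.encode('ascii'). The helper below is hypothetical.

raw = b'AB\xff'  # Python 3 spelling of the Python 2 byte string 'AB\xff'


def py2_style_encode(byte_string, encoding, default_encoding='ascii'):
    """Mimic Python 2's str.encode() when called on a byte string."""
    # Step 1 (implicit in Python 2): decode the bytes to unicode
    # using the default encoding.
    text = byte_string.decode(default_encoding)
    # Step 2: perform the encode that the caller actually requested.
    return text.encode(encoding)


error_name = None
failed_at = None
try:
    py2_style_encode(raw, 'ascii')
except UnicodeDecodeError as exc:
    # The failure comes from step 1, hence a *decode* error.
    error_name = type(exc).__name__
    failed_at = exc.start  # index of the offending byte

# Python 3 removes the surprise: bytes objects have no encode() method.
bytes_has_encode = hasattr(raw, 'encode')
```

Pure ASCII input such as b'AB' passes through both steps unchanged, which is why the implicit conversion often goes unnoticed until a non-ASCII byte appears.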
History
Date                 User                Action  Args
2012-11-17 16:58:40  ezio.melotti        set     stage: resolved
2012-11-11 23:57:40  vstinner            set     status: open -> closed; nosy: + vstinner; messages: + msg175411; resolution: not a bug
2012-11-11 22:51:36  Wojtek.Szymilowski  create