
classification
Title: strftime("%B") returns a String unusable with unicode
Type: behavior Stage:
Components: Versions: Python 2.5
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: amaury.forgeotdarc, ezio.melotti, t.steinruecken
Priority: normal Keywords:

Created on 2009-03-01 12:40 by t.steinruecken, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (3)
msg82958 - (view) Author: (t.steinruecken) Date: 2009-03-01 12:39
import locale
import datetime

locale.setlocale(locale.LC_ALL, ('de_DE', 'UTF8'))
print u""+datetime.datetime( 2009, 3, 1 ).strftime("%B")
--------------------------------------
Traceback (most recent call last):
    print u""+datetime.datetime( 2009, 3, 1 ).strftime("%B")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
ordinal not in range(128)
msg82960 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2009-03-01 13:00
I don't have the de_DE locale to reproduce that, but the cause is most
likely this:
1) datetime( 2009, 3, 1 ).strftime("%B") should return märz as a UTF-8
encoded string, i.e. 'm\xc3\xa4rz'
2) when you mix Unicode and encoded strings, the encoded strings are
automagically decoded to Unicode using the default codec, i.e. ASCII (on
Py2)
3) The ASCII codec is not able to decode '\xc3' (its value is 195, and
195 > 127) and a UnicodeDecodeError is raised.

The solution is to decode the string explicitly using UTF-8:
>>> month = 'm\xc3\xa4rz'
>>> u'' + month
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
ordinal not in range(128)
>>> u'' + month.decode('utf-8')
u'm\xe4rz'
>>>
msg82961 - (view) Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-03-01 13:58
Ezio is correct: in general a byte string (str) cannot be concatenated
with a unicode object. Except in the simplest case (only 7-bit ASCII
characters), you have to decode the string first:

u"" + datetime.datetime( 2009, 3, 1 ).strftime("%B").decode('utf-8')
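
The one-liner above hard-codes UTF-8, which matches the de_DE.UTF8 locale from the report. As a minimal sketch only, the same idea can be wrapped in a helper that decodes with the locale's preferred encoding instead. The names to_unicode and month_name are hypothetical, not part of any library, and the sketch assumes Python 2.6 or later (where bytes is an alias for str) or Python 3 (where strftime already returns text, so nothing is decoded). It also assumes that strftime output is encoded in the encoding reported by locale.getpreferredencoding(), which is typical but not guaranteed.

```python
import datetime
import locale


def to_unicode(value, encoding):
    """Decode a byte string to unicode; pass unicode text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def month_name(dt, encoding=None):
    """Return the locale's full month name for dt as unicode text."""
    if encoding is None:
        # Assumption: strftime output uses the locale's preferred encoding.
        encoding = locale.getpreferredencoding()
    return to_unicode(dt.strftime("%B"), encoding)
```

With this, u"" + month_name(datetime.datetime(2009, 3, 1)) avoids the implicit ASCII decode, because the value is already unicode by the time it is concatenated.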
History
Date                 User                Action  Args
2022-04-11 14:56:46  admin               set     github: 49648
2009-03-01 13:58:16  amaury.forgeotdarc  set     status: open -> closed; nosy: + amaury.forgeotdarc; resolution: not a bug; messages: + msg82961
2009-03-01 13:00:46  ezio.melotti        set     nosy: + ezio.melotti; messages: + msg82960
2009-03-01 12:40:00  t.steinruecken      create